Role Overview
As a Senior Site Reliability Engineer, you will be central to ensuring the stability, scalability, and performance of our cloud-native platform. You'll work closely with engineering teams to build robust systems that support millions of users, focusing on automation, observability, and operational excellence.
Key Responsibilities
- Investigate and resolve complex issues across applications and infrastructure, minimizing service disruption
- Participate in on-call rotations, sprint planning, and deployment coordination
- Lead root cause analyses and guide teams toward preventive improvements
- Enhance system observability using tools like Prometheus, Grafana, Splunk, and Datadog
- Advocate for secure, scalable, and maintainable architectural patterns
- Develop automation scripts, internal tools, and infrastructure-as-code to streamline operations
- Document processes, runbooks, and technical standards to support team knowledge sharing
- Collaborate across teams to refine CI/CD pipelines and software delivery practices
Required Qualifications
- Minimum of 10 years in site reliability or systems engineering roles
- Deep expertise in Linux systems, scripting, and troubleshooting
- Proven experience with AWS services including EC2, ECS, Fargate, VPC, Route 53, and Elastic Load Balancing
- Strong background in infrastructure-as-code using CloudFormation, Terraform, Helm, or Ansible
- Familiarity with containerization, Kubernetes, and microservices architectures
- Hands-on experience with CI/CD and the full software development lifecycle
- Proficiency with observability platforms such as New Relic, Splunk, or Datadog
- Excellent written and verbal communication skills
Preferred Skills
- Experience with AWS CDK
Technology Environment
The platform is built with Java, Kotlin, and C++, with Postgres for data storage. Infrastructure runs on AWS, using services including ECS, Fargate, and ALB/NLB, with Kubernetes for container orchestration. Automation is driven by CloudFormation, Terraform, Helm, and Ansible, while observability relies on Prometheus and Grafana alongside tools such as Splunk and Datadog.
Work Environment
This role can be performed fully remotely, with the option of hybrid work from our Berlin office. We support teams across the UK and Germany, fostering a flexible and inclusive working model.
Benefits & Culture
- Competitive compensation package
- Access to a corporate benefits platform with discounts on travel, fitness, fashion, and more
- Opportunities for professional growth through dedicated training resources
- Collaborative, international team focused on innovation and customer success
- Commitment to diversity, inclusion, and a supportive workplace
Equal Opportunity
We value diversity and ensure equal consideration for employment regardless of race, color, religion, sex, sexual orientation, gender identity, national origin, age, disability, genetic information, marital status, or veteran status.