Berlin, Germany (Hybrid)

Cephalgo is hiring a Senior Site Reliability Engineer

About the Role

Role Overview

As a Senior Site Reliability Engineer, you will be central to ensuring the stability, scalability, and performance of our cloud-native platform. You'll work closely with engineering teams to build robust systems that support millions of users, focusing on automation, observability, and operational excellence.

Key Responsibilities

  • Investigate and resolve complex issues across applications and infrastructure, minimizing service disruption
  • Participate in on-call rotations, sprint planning, and deployment coordination
  • Lead root cause analyses and guide teams toward preventive improvements
  • Enhance system observability using tools like Prometheus, Grafana, Splunk, and Datadog
  • Advocate for secure, scalable, and maintainable architectural patterns
  • Develop automation scripts, internal tools, and infrastructure-as-code to streamline operations
  • Document processes, runbooks, and technical standards to support team knowledge sharing
  • Collaborate across teams to refine CI/CD pipelines and software delivery practices

Required Qualifications

  • Minimum of 10 years in site reliability or systems engineering roles
  • Deep expertise in Linux systems, scripting, and troubleshooting
  • Proven experience with AWS services including EC2, ECS, Fargate, VPC, Route53, and load balancing
  • Strong background in infrastructure-as-code using CloudFormation, Terraform, Helm, or Ansible
  • Familiarity with containerization, Kubernetes, and microservices architectures
  • Hands-on experience with CI/CD and the full software development lifecycle
  • Proficiency with observability platforms such as New Relic, Splunk, or Datadog
  • Excellent written and verbal communication skills

Preferred Skills

  • Experience with AWS CDK

Technology Environment

The platform leverages Java, Kotlin, and C++ alongside Postgres for data storage. Infrastructure runs on AWS with services including ECS, Fargate, and ALB/NLB, orchestrated via Kubernetes. Automation is driven by CloudFormation, Terraform, Helm, and Ansible, while observability is powered by Prometheus, Grafana, and other leading monitoring tools.

Work Environment

This is a fully remote role with the option to work hybrid from our Berlin office. We support teams across the UK and Germany, fostering a flexible and inclusive working model.

Benefits & Culture

  • Competitive compensation package
  • Access to a corporate benefits platform with discounts on travel, fitness, fashion, and more
  • Opportunities for professional growth through dedicated training resources
  • Collaborative, international team focused on innovation and customer success
  • Commitment to diversity, inclusion, and a supportive workplace

Equal Opportunity

We value diversity and ensure equal consideration for employment regardless of race, color, religion, sex, sexual orientation, gender identity, national origin, age, disability, genetic information, marital status, or veteran status.

Required Skills

Java, Kotlin, C++, Postgres, AWS, VPC, EC2, ECS, Route53, Fargate, Linux administration, scripting, troubleshooting, Observability, Infrastructure-as-Code

About company
Cephalgo
Cephalgo is a digital health company that uses voice/speech analysis AI to remotely measure symptoms of mental health conditions, particularly depression. They offer peer-reviewed algorithms delivered via API for integration into telehealth platforms and clinical trials. GDPR/HIPAA compliant, EU AI Act ready. Backed by Horizon Europe / European Innovation Council. Based in France (Grand Est region). Their active research projects focus on emotion detection from voice biosignals (ADAPTE), depression severity tracking (OPADE), and biosignal-monitored therapeutic protocols (ReSonate).
Job Details
Category infrastructure
Posted 4 months ago