Remote (Global)

Scalingo is hiring a Senior Site Reliability Engineer (SRE) - Cloud - Remote - F/M

Scalingo is hiring a Senior Site Reliability Engineer (SRE) to be responsible for the reliability, performance, and resilience of our European cloud platform. You will provide technical leadership, manage incidents, and drive strategic automation projects, with a future path toward team management.

What You'll Do

  • Provide technical leadership and guidance to the SRE team, including mentoring, prioritization, and technical reviews.
  • Analyze system performance, identify bottlenecks, and propose improvements for resource optimization and scalability.
  • Define, implement, and improve observability tools for proactive incident detection.
  • Participate in on-call rotations and manage critical incidents to limit impact.
  • Lead and facilitate incident post-mortems, identifying root causes and defining corrective actions.
  • Ensure compliance with service commitments and contribute to ISO 27001 and HDS compliance.
  • Plan, execute, and analyze regular tests of business continuity and disaster recovery plans.
  • Collaborate closely with development teams to integrate reliability, performance, and security requirements from the design phase.
  • Contribute to writing, structuring, and maintaining clear and up-to-date operational documentation.

What We're Looking For

  • Solid expertise in cloud environments and distributed infrastructures, with a strong culture of high availability and production reliability.
  • Mastery of observability practices and structured diagnostic skills for complex incidents.
  • Good understanding of containerized environments and their operational challenges.
  • Proven skills in production databases: reliability, backups, restoration, replication, and scalability.
  • Hands-on experience with Infrastructure as Code and environment automation.
  • Awareness of operational security issues.
  • Comfort using AI tools to improve daily efficiency.
  • Ability to work in complex, changing, or uncertain contexts with rigor and reliability.
  • Clear and structured communication, with an appetite for cross-team collaboration and knowledge sharing.
  • Blameless mindset, technical curiosity, composure, and attention to user impact.
  • Ability to exercise technical leadership, share knowledge, and advance the team's practices.

Team & Environment

You will join an SRE team of two people and report directly to an Engineering Manager. The role involves strong technical and operational leadership, initially without direct line-management responsibility.

Benefits & Compensation

  • Fully remote, with one trip per quarter (to Strasbourg or another city)
  • Company events: one annual offsite and regular after-work gatherings
  • Remote work allowance (€57.60)
  • Restaurant vouchers (€11.52 each) and Swile card with benefits
  • Health insurance fully covered by Scalingo (BENEFIZ)
  • Flexible hours under a time-based agreement (RTT)
  • Laptop running Linux
  • Budget for additional equipment (company contribution)

Work Mode

This is a fully remote position.

We firmly believe in equal opportunities.

Required Skills
Kubernetes, Terraform, Linux, Prometheus, Grafana, AWS, GCP, CI/CD, Go, Python, Bash, Networking, Databases, Incident Management, Observability
Job Details
Category infrastructure
Posted 3 months ago