Requirements
- Over a decade of professional experience in infrastructure, site reliability, or platform engineering roles.
- At least five years leading teams or serving in technical leadership roles within SRE or platform engineering.
- Proven track record managing distributed teams across multiple regions and time zones, including the US, Canada, and the EU.
- Direct experience guiding or mentoring teams in SRE, infrastructure, or platform functions within production SaaS environments.
- Demonstrated leadership ability with a focus on coaching and developing senior-level engineering talent.
- Deep understanding of core SRE principles including service level objectives, service level indicators, error budgets, incident response, capacity planning, and operational best practices.
- Extensive hands-on experience with cloud platforms such as AWS, Google Cloud Platform, and Azure, particularly their managed services.
- Strong coding background with production-level software development in languages like Go, Python, or Bash for automation and tooling.
- Practical experience deploying and managing Kubernetes clusters, custom operators or controllers, containerized applications, and infrastructure-as-code tools like Terraform or Pulumi.
- Familiarity with observability ecosystems including monitoring with Prometheus, visualization with Grafana, and logging and distributed tracing pipelines.
- Exceptional communication skills with the ability to articulate reliability tradeoffs to non-technical stakeholders and produce clear incident reports and postmortems.
- History of identifying operational challenges and converting them into actionable engineering initiatives.
Nice to Have
- Experience integrating or working with AI-driven systems or intelligent tooling.
- Background in operating multi-tenant systems or systems that require strong customer isolation.
- Knowledge of distributed databases and optimizing performance at large scale.
- Experience designing or building internal developer platforms or standardized development pathways.
Benefits
- Engage with innovative technologies in a high-growth industry.
- Work in a supportive culture where contributions directly influence outcomes.
- Salary commensurate with experience and market standards.
- Equity compensation through stock options at an early-stage startup.
- Full benefits package including health insurance for US employees and additional coverage options.
- Fully remote role with flexible scheduling to support global work patterns.
- Twice-yearly in-person gatherings focused on team connection, collaboration, and shared experiences.
Compensation
Competitive salary based on experience
Work Arrangement
Remote (Worldwide)
Team
Global team of Site Reliability Engineers
Other
Twice-yearly travel for team offsites focused on team bonding, collaboration, and having fun!


