About the Role
This role focuses on maintaining and improving system reliability through automation, monitoring, and incident response. The engineer will collaborate with engineering teams to reduce operational overhead and improve service resilience.
Responsibilities
- Monitor system performance and respond to incidents
- Develop automation tools to improve operational efficiency
- Collaborate with engineering teams to resolve production issues
- Design and maintain scalable infrastructure
- Implement and manage monitoring and alerting systems
- Participate in on-call rotations for incident response
- Optimize system reliability and uptime
- Contribute to post-incident reviews and follow-up actions
- Improve deployment and release processes
- Support capacity planning and performance tuning
- Maintain documentation for systems and procedures
- Ensure compliance with security and operational standards
- Troubleshoot complex distributed systems
- Evaluate new technologies for operational improvements
- Promote best practices in reliability engineering
- Work across time zones with a global team
- Drive initiatives to reduce technical debt
- Assist in designing fault-tolerant architectures
- Support disaster recovery planning and testing
- Contribute to system architecture discussions
- Enhance observability across services
- Automate routine operational tasks
- Ensure systems meet service level objectives
- Collaborate on incident prevention strategies
- Support secure and reliable CI/CD pipelines
Nice to Have
- Bachelor’s degree in computer science or related field
- Certifications in cloud or systems engineering
- Experience with large-scale distributed systems
- Background in software development
- Public speaking or technical writing experience
- Contributions to open source projects
- Experience in remote-first organizations
Benefits
- Flexible work schedule
- Remote work support
- Paid time off
- Health insurance coverage
- Retirement savings plan
- Home office stipend
- Professional development budget
- Mental health resources
- Parental leave
- Equity compensation
Compensation
Competitive salary and benefits package
Work Arrangement
Remote
Team
Distributed team focused on system reliability and performance
Our Engineering Culture
We value transparency, autonomy, and continuous improvement. Engineers are empowered to make decisions, ship code frequently, and contribute to system design. Collaboration happens asynchronously with a focus on clear documentation and sustainable workflows.
Diversity, Equity, and Inclusion
We are committed to building a diverse and inclusive workplace. We welcome applicants from all backgrounds and experiences, and we do not discriminate on the basis of race, gender, religion, or identity.


