About the Role
In this role, you will build and manage cloud platforms, improve deployment pipelines, and keep systems resilient through automation and monitoring.
Responsibilities
- Design and maintain scalable cloud infrastructure on major providers, primarily AWS
- Automate provisioning and configuration using infrastructure-as-code tools
- Optimize system performance, availability, and cost efficiency
- Implement and manage CI/CD pipelines for rapid deployment
- Monitor systems and respond to incidents with minimal downtime
- Enforce security standards across infrastructure and deployment workflows
- Collaborate with development teams to support application deployment
- Troubleshoot complex production issues across distributed environments
- Manage containerized workloads using orchestration platforms
- Ensure compliance with internal and external standards
- Develop tools to streamline operational tasks
- Support disaster recovery and business continuity planning
- Evaluate and integrate new technologies into existing systems
- Document architecture decisions and operational procedures
- Lead initiatives to improve system observability
- Participate in on-call rotations for critical systems
- Mentor engineers on best practices in infrastructure management
- Drive adoption of automation across engineering teams
- Work closely with security teams to address vulnerabilities
- Maintain accurate inventory of cloud resources and services
Nice to Have
- Master’s degree in a technical discipline
- Certifications in cloud platforms or DevOps practices
- Experience in gaming or real-time applications
- Contributions to open-source infrastructure projects
- Familiarity with edge computing technologies
- Knowledge of multi-region deployment patterns
- Experience with service mesh architectures
- Background in site reliability engineering
- Leadership experience in technical projects
- Public speaking or conference participation
Compensation
Competitive salary with equity and benefits package
Work Arrangement
Hybrid with flexible remote options
Team
Collaborative engineering environment focused on scalable systems
Why This Role Matters
- This position shapes the reliability and scalability of core systems that serve millions of users.
- Engineers in this role directly influence platform stability and developer productivity.
Technology Stack
- Primary cloud provider: AWS
- Container orchestration: Kubernetes
- Infrastructure as code: Terraform
- CI/CD: GitHub Actions, ArgoCD
- Monitoring: Prometheus, Grafana, Datadog
Available for qualified candidates


