Responsibilities
- Formulate and implement the SRE roadmap in alignment with organizational objectives.
- Guide and develop the SRE team, promoting ongoing enhancement and creative problem-solving.
- Work closely with engineering and product departments to synchronize goals and outcomes.
- Supervise the creation, deployment, and upkeep of large-scale, reliable systems.
- Maintain efficient system operations through automation, monitoring, and performance tuning.
- Establish effective incident management protocols, including root cause analysis and post-incident reviews.
- Define and track KPIs and SLOs to measure service reliability.
- Anticipate and resolve reliability risks proactively to prevent customer impact.
- Promote the use of advanced monitoring and alerting tools to safeguard system performance.
- Ensure infrastructure meets security requirements and complies with applicable regulations.
- Partner with security specialists to uphold industry-leading security standards.
- Serve as a bridge between technical teams, product units, and other critical stakeholders.
- Deliver clear, tailored communication to both technical and executive audiences.
Work Arrangement
Remote (Country)