Responsibilities
- Maintain high availability and performance of key products and services by achieving or surpassing site reliability engineering targets.
- Deploy and manage production environments using infrastructure-as-code and configuration management systems.
- Implement and oversee comprehensive service monitoring through centralized logging and time-series data platforms.
- Automate service deployment, operations, and monitoring workflows in alignment with CI/CD methodologies.
- Collaborate with engineering and security teams to refine, document, and implement processes that improve service operability and security posture.
- Participate in a shared on-call schedule to support system reliability and incident response.
- Perform additional duties as assigned based on organizational priorities and evolving business requirements.
Work Arrangement
Hybrid — Ottawa
Other
Participation in the team on-call rotation is required.