Remote (Global)

Ensono is hiring a Senior Site Reliability Engineer

About the Role

This role involves designing and maintaining reliable systems, automating operational processes, and collaborating across teams to improve service resilience and incident response.

Responsibilities

  • Design and implement scalable infrastructure solutions
  • Monitor system performance and respond to incidents
  • Develop automation tools to reduce manual operations
  • Collaborate with development teams to enhance system reliability
  • Troubleshoot and resolve complex technical issues
  • Maintain system documentation and runbooks
  • Participate in on-call rotations for incident management
  • Optimize system availability and reduce downtime
  • Implement proactive alerting and monitoring systems
  • Support cloud infrastructure and migration initiatives
  • Enforce security and compliance standards
  • Drive post-incident reviews and follow-up actions
  • Improve deployment reliability and rollback procedures
  • Contribute to capacity planning and performance tuning
  • Promote best practices in system design and operations
  • Integrate reliability into the software development lifecycle
  • Use data to identify and resolve system bottlenecks
  • Manage configuration and change control processes
  • Support disaster recovery planning and testing
  • Ensure systems meet service level objectives
  • Work with cross-functional teams to resolve production issues
  • Evaluate new technologies for operational improvements
  • Mentor junior engineers and share technical knowledge
  • Maintain focus on customer impact during outages
  • Contribute to engineering standards and operational policies

Nice to Have

  • Master’s degree in a technical field
  • Certifications in cloud or systems engineering
  • Experience with large-scale enterprise systems
  • Background in financial or regulated industries
  • Knowledge of Kubernetes and service mesh technologies
  • Experience with infrastructure as code tools
  • Familiarity with observability platforms
  • Contributions to open-source projects
  • Public speaking or technical writing experience
  • Leadership in incident command roles

Compensation

Competitive salary and benefits package

Work Arrangement

Hybrid work model with flexible location options

Team

Part of a global engineering team focused on system reliability and performance

Why This Role Matters

  • This position plays a critical role in maintaining the stability and performance of systems that support enterprise clients.
  • Engineers in this role directly influence uptime, scalability, and the overall customer experience.

What to Expect

  • You will work across time zones with global teams.
  • Expect a mix of strategic planning and hands-on technical problem solving.
  • Opportunities for professional growth and technical leadership are built into the role.

Available for qualified candidates

Required Skills

Terraform, Azure DevOps, GitHub Actions, GitLab, Datadog, Cloud Infrastructure, CI/CD, Monitoring

About the Company

Ensono is an expert technology adviser and managed service provider with cross-platform certifications, empowering clients to embrace innovation and achieve key business outcomes.

Job Details

Category: Infrastructure
Posted 6 months ago