Responsibilities
- Oversee, monitor, and provide technical support for an instant payment platform across development, testing, and live environments.
- Serve as the escalation contact for production incidents, conducting troubleshooting and root cause investigations.
- Conduct root cause analysis using observability platforms, log reviews, and reverse engineering methods as required.
- Participate in scheduled on-call rotations with 24/7 availability to respond to critical production issues.
- Assist in preparing releases, including managing pull requests, branches, configurations, and validations in coordination with the development and infrastructure teams.
- Investigate recurring system issues and implement long-term fixes to improve application reliability and efficiency.
- Work closely with development, quality assurance, and infrastructure teams to streamline the application lifecycle.
- Maintain up-to-date documentation of operational procedures, system configurations, and support protocols.
- Enhance monitoring, alerting, and diagnostic workflows using tools like Splunk and Dynatrace, identifying and mitigating risks proactively.
Other
- Participate in on-call shifts on a 24/7 rota basis and provide support for critical production incidents.
- Be able to work on-call rotations and handle high-pressure production issues.