As the Manager of Database Operations, you will oversee the full lifecycle of cloud-hosted databases within a large-scale AWS RDS environment. Your primary responsibility is to ensure systems remain highly available, secure, performant, and efficiently scaled to meet evolving business demands.
Key Responsibilities
- Manage and optimize database platforms including Aurora, MySQL, and SQL Server across Multi-AZ and disaster recovery configurations
- Design and implement infrastructure as code using tools such as Terraform, CloudFormation, or CDK
- Integrate database changes into CI/CD pipelines to enable reliable, automated deployments
- Enforce operational standards, security policies, and compliance requirements across all database environments
- Lead database maintenance, including patching, version upgrades, and vulnerability remediation
- Collaborate with engineering teams on schema deployment automation, migration strategies, and zero-downtime release models
- Implement monitoring, alerting, and observability using CloudWatch, Datadog, or similar tools
- Conduct root cause analysis for critical incidents and lead post-incident reviews
- Drive performance tuning, capacity planning, and cost optimization initiatives
- Mentor and lead a team of database engineers, fostering a culture of automation and continuous improvement
- Establish KPIs for system reliability, performance, and deployment efficiency
- Work closely with Infrastructure, Security, DevOps, and SRE teams to align database practices with broader platform goals
Qualifications
Applicants should hold a bachelor’s degree in Computer Science, Information Systems, or a related field, or demonstrate equivalent experience. A minimum of 10 years in database operations is required, with at least 3 years in a leadership capacity.
Essential skills include hands-on experience with AWS RDS services, infrastructure as code, CI/CD integration, and scripting in Python, Bash, or PowerShell. Expertise in high availability, backup and recovery, and performance troubleshooting at scale is critical.
Preferred qualifications include AWS certifications; experience with Delphix, containerized environments (EKS, ECS), and GitOps; and familiarity with regulated or 24x7 operational settings.