Responsibilities
- Architect, deploy, and oversee AWS cloud environments supporting AI, data, and application systems
- Create and sustain Infrastructure as Code using Terraform, CloudFormation, or equivalent tools
- Develop and operate CI/CD pipelines for automated application and data service builds, testing, and deployments
- Operate containerized platforms using Docker and container orchestration technologies
- Establish monitoring, logging, and alerting systems to ensure system reliability and uptime
- Optimize cloud infrastructure for performance, scalability, and cost efficiency
- Enforce security standards including identity and access management, network protection, and secrets handling
- Contribute to AI-related projects and support infrastructure requirements for AI processing workflows
- Partner with data engineering teams to deploy and scale AI and data platforms
- Improve platform stability through automation, system oversight, and incident management
- Maintain up-to-date documentation for infrastructure designs, deployment methods, and operational protocols


