Responsibilities
- Set up, configure, and manage AWS services such as EC2, S3, IAM, and VPCs in line with AWS Well-Architected Framework best practices.
- Perform routine system maintenance, including patching and configuration updates, across cloud platforms.
- Operate and maintain the TWS/IWS (IBM Tivoli Workload Scheduler / IBM Workload Scheduler) system, monitoring job flows, managing dependencies, and verifying agent status.
- Ensure consistent and timely execution of daily and real-time production workloads.
- Track batch processing jobs and confirm all workflows complete successfully.
- Design and deploy scheduling solutions to meet application team needs.
- Adhere to enterprise-wide change control policies and protocols.
- Diagnose and fix job failures, scheduling clashes, and dependency problems.
- Support ongoing maintenance of workload automation systems.
- Apply least-privilege principles in AWS IAM to manage access securely.
- Monitor security alerts and address vulnerabilities using tools like AWS Security Hub.
- Assist in meeting audit and compliance standards for cloud and automation environments.
- Investigate, analyze, and resolve incidents tied to AWS infrastructure or delayed/failed scheduled jobs.
- Escalate technical challenges as needed and contribute to root-cause investigations.
- Identify inefficiencies and performance issues in cloud resource utilization and job processing speed.
- Optimize performance and reduce costs through auto-scaling, scheduling refinements, and script enhancements.
- Develop, test, and document backup and disaster recovery plans for AWS assets and TWS databases.
- Participate in disaster recovery drills and verify that recovery processes are effective.
Other
- Candidates must have resided in the United States for at least three of the past five years to qualify for CMS program roles.
- Applicants must be eligible to obtain and retain a public trust clearance.
