Responsibilities
- Create and manage robust ETL and ELT pipelines for processing both structured and unstructured clinical datasets
- Design and refine data models to support reporting, analytics, and machine learning initiatives
- Build and sustain cloud-based data systems using AWS infrastructure
- Develop data workflows that enable AI and machine learning model development and deployment
- Transition machine learning models developed by Data Science teams from experimentation to production environments
- Maintain high standards for data quality, consistency, governance, and adherence to regulations
- Enhance the performance, stability, and scalability of large data platforms
- Work closely with data scientists, AI specialists, software developers, and product teams
- Convert clinical and business needs into efficient, scalable data solutions
- Implement monitoring, observability tools, and automated checks across data pipelines
- Support the development of data engineering best practices, architectural standards, and platform improvements
