About the Role
Build and optimize systems that support training, deploying, and monitoring machine learning models at scale.
Responsibilities
- Develop tools and platforms for efficient model training and evaluation
- Improve reliability and speed of ML pipeline execution
- Collaborate with researchers to productionize experimental models
- Monitor system performance and troubleshoot infrastructure issues
- Implement scalable data processing workflows
- Ensure infrastructure supports reproducible experiments
- Maintain version control for models and datasets
- Optimize resource allocation across computing environments
- Support deployment of models into production settings
- Integrate security practices into ML systems
Nice to Have
- Prior work on ML infrastructure teams
- Contributions to open-source ML tools
- Experience with Kubernetes in production
- Knowledge of GPU-accelerated computing
- Familiarity with MLOps best practices
- Exposure to model monitoring and observability
- Understanding of data versioning systems
Benefits
- Health, dental, and vision insurance
- Flexible paid time off
- Remote work support stipend
- Professional development budget
- Equity compensation
- 401(k) matching
- Parental leave
- Mental health resources
Compensation
Competitive salary and equity package
Work Arrangement
Hybrid
Team
Small, cross-functional team focused on machine learning systems
About the Team
Work alongside engineers and scientists building next-generation AI tools for drug discovery.
Technology Stack
Python, Kubernetes, Docker, GCP, TensorFlow, PyTorch, Airflow, Prometheus