About the Role
Design and build data pipelines, maintain data architecture, and support analytics initiatives through robust engineering practices.
Responsibilities
- Develop and manage data pipelines that move data reliably between systems
- Design scalable database structures to support business needs
- Optimize data storage and retrieval processes
- Collaborate with analytics teams to deliver usable datasets
- Ensure data accuracy and consistency across systems
- Implement data validation and quality checks
- Support the integration of new data sources
- Monitor data workflows for performance and errors
- Improve data accessibility for non-technical users
- Document data models and pipeline logic
- Work with cross-functional teams to define data requirements
- Maintain metadata and data lineage records
- Troubleshoot data-related issues in production
- Contribute to data governance practices
- Automate repetitive data processing tasks
- Participate in code reviews and system design discussions
- Ensure compliance with data security policies
- Evaluate new data tools and technologies
- Support data warehouse operations
- Assist in migration of legacy data systems
- Enhance data processing efficiency
- Integrate machine learning outputs into data workflows
- Build APIs for data access
- Maintain version control for data scripts
- Support real-time data streaming solutions
Nice to Have
- Master's degree in a technical field
- Experience with real-time data processing
- Knowledge of machine learning pipelines
- Familiarity with data governance frameworks
- Experience in regulated industries
- Contributions to open-source data projects
- Certifications in cloud platforms
- Experience with streaming technologies like Kafka
- Background in data security
- Leadership in technical projects
Compensation
Competitive salary based on experience
Work Arrangement
Hybrid work model with remote and office options
Team
Collaborative engineering team focused on scalable data solutions
Technology Stack
- Cloud infrastructure for data hosting
- Python and SQL for data processing
- Airflow for workflow orchestration
- AWS and GCP for scalable storage
- Docker for consistent environments
- Spark for large-scale data computation
- Kafka for event streaming
Growth Opportunities
- Access to training programs in data technologies
- Opportunities to lead engineering initiatives
- Mentorship from senior data professionals
- Chance to contribute to architectural decisions
- Support for conference participation
- Internal mobility across technical roles
Available for qualified candidates
