About the Role
The engineer will build and optimize systems that unify data storage and processing, enabling efficient access, transformation, and analysis across structured and unstructured data sources.
Responsibilities
- Design and implement data lakehouse architectures
- Develop pipelines for ingesting and processing large-scale datasets
- Optimize query performance across distributed storage systems
- Ensure data consistency, reliability, and freshness
- Collaborate with data scientists and analysts to understand requirements
- Maintain data security and access controls
- Monitor system performance and troubleshoot issues
- Improve data lineage and metadata management
- Support real-time and batch processing workflows
- Integrate machine learning models into data pipelines
- Evaluate and adopt new data technologies
- Write clean, maintainable code with thorough documentation
- Participate in code reviews and system design discussions
- Ensure compliance with data governance policies
- Scale infrastructure to meet growing data demands
- Automate operational tasks to reduce manual overhead
- Work closely with platform engineering for infrastructure needs
- Implement observability for data workflows
- Contribute to disaster recovery and backup strategies
- Stay current with advancements in data processing frameworks
Nice to Have
- Experience with real-time streaming platforms like Kafka
- Familiarity with machine learning pipelines
- Contributions to open-source data projects
- Experience with data catalog tools such as Amundsen or DataHub
- Knowledge of Kubernetes for data workloads
- Background in observability tools for data systems
- Experience in startup or fast-paced environments
Benefits
- Health, dental, and vision insurance
- Retirement savings plan with company match
- Unlimited paid time off
- Home office stipend
- Professional development budget
- Flexible parental leave policy
- Mental health and wellness support
- Stock option grant
Compensation
Competitive salary and equity package
Work Arrangement
Remote with flexible hours
Team
Small, cross-functional engineering team focused on data infrastructure
Tech Stack
Delta Lake, Apache Spark, AWS S3, Airflow, Docker, Kubernetes, GCP, Parquet, Git, Python, SQL
Culture
We value transparency, ownership, and continuous learning. Engineers are expected to take initiative, ship high-quality work, and contribute to a collaborative environment.
Available for qualified candidates


