Responsibilities
- Develop secure and scalable data pipelines supporting both batch and real-time data workflows.
- Maintain and enhance data infrastructure that enables machine learning operations, model inference, and analytical reporting.
- Produce clean, production-grade code using Java, Scala, or Python.
- Use big data technologies, including Apache Spark, Kafka, Flink, Airflow, and Amazon EMR.
- Implement graph-based data models and work with graph databases such as Neo4j or Amazon Neptune.
- Refine the data architecture to improve performance, reduce costs, and simplify ongoing maintenance.
- Collaborate with Data Science, Product, and Security teams to translate business requirements into technical data solutions.
- Engage in system design discussions, code reviews, and knowledge sharing to promote engineering excellence.
