About the Role
In this role, you will use data to improve operational efficiency and system reliability. You will build models, analyze infrastructure performance, and work closely with engineering teams to implement data-driven solutions.
Responsibilities
- Analyze complex operational datasets to identify trends and inefficiencies
- Develop and deploy statistical models to forecast system behavior
- Collaborate with engineering teams to integrate data-driven insights
- Design experiments to evaluate operational changes
- Create dashboards and reporting tools for key performance metrics
- Translate business questions into analytical frameworks
- Optimize data pipelines for reliability and speed
- Apply machine learning techniques to detect anomalies in infrastructure
- Support root cause analysis for system incidents
- Work closely with product and platform teams to define success metrics
- Ensure data accuracy and consistency across reporting systems
- Conduct deep-dive analyses on service performance and uptime
- Identify opportunities for automation in monitoring workflows
- Present findings to technical and non-technical stakeholders
- Maintain documentation for models and analytical processes
- Evaluate the impact of infrastructure upgrades using data
- Build scalable solutions for real-time data processing
- Contribute to the development of internal data science tools
- Assess risks associated with operational decisions
- Improve data quality through validation and cleansing methods
- Use statistical inference to support capacity planning
- Develop classification systems for incident categorization
- Monitor model performance and implement updates as needed
- Support post-mortem reviews with quantitative analysis
- Drive best practices in measurement and evaluation
Nice to Have
- PhD in a relevant technical field
- Experience in high-growth technology environments
- Background in cybersecurity or network operations
- Publications or presentations in data science or operations research
- Contributions to open-source data science tools
- Experience with real-time streaming data platforms
- Knowledge of reliability engineering principles
- Familiarity with incident response workflows
- Experience leading data science initiatives
- Strong grasp of distributed tracing and telemetry systems
Compensation
Competitive salary and comprehensive benefits package
Work Arrangement
Hybrid work model with flexibility for remote and in-office collaboration
Team
You will be part of the data science and operations analytics team, which focuses on optimizing system performance and scalability.
About the Team
This team focuses on leveraging data science to enhance the performance, scalability, and resilience of large-scale systems. Members work at the intersection of data analysis, engineering, and operations to solve complex challenges.
What We Value
We prioritize technical rigor, clear communication, collaboration across disciplines, and a commitment to continuous learning and improvement in data-driven decision-making.
Available for qualified candidates
