About the Role

The role involves developing core infrastructure that powers machine learning applications across services, enabling teams to train, deploy, and manage models efficiently at scale.

Responsibilities

Design and build scalable infrastructure for machine learning workflows
Collaborate with data scientists and researchers to operationalize models
Optimize training and inference pipelines for performance and cost
Develop tools to automate deployment, monitoring, and scaling of ML systems
Ensure platform reliability, security, and compliance across environments
Integrate new hardware and distributed computing technologies into the platform
Support versioning, reproducibility, and experiment tracking for ML workflows
Work closely with product teams to understand requirements and deliver solutions
Improve data ingestion and processing frameworks for model training
Contribute to architectural decisions for cloud-native and on-premise systems
Maintain documentation and best practices for platform usage
Troubleshoot and resolve issues in production ML environments
Evaluate and adopt open-source and internal ML tools
Drive improvements in observability and debugging capabilities
Support model governance, including lineage and auditability
Help define standards for model performance and quality assurance
Participate in code reviews and system design discussions
Mentor junior engineers and promote technical excellence
Stay current with advancements in ML infrastructure and distributed systems
Contribute to long-term roadmap planning for platform evolution

Nice to Have

Master’s or PhD in computer science or a related field
Direct experience scaling ML platforms in high-traffic environments
Deep knowledge of Kubernetes and cloud-native architectures
Hands-on experience with GPU-accelerated computing
Contributions to open-source ML or infrastructure projects
Experience with MLOps tools and platforms
Background in systems performance tuning and resource optimization
Familiarity with security practices in ML systems
Prior work in gaming, media, or interactive entertainment

Compensation

Competitive salary and benefits package

Work Arrangement

Hybrid

Team

Part of a dedicated platform engineering team focused on machine learning systems within a global interactive technology environment

About the Team

This group builds foundational systems that support machine learning initiatives across the organization, focusing on scalability, automation, and developer experience.
Engineers work on challenges involving distributed training, real-time inference, and integration with diverse data sources and applications.

What We Value

Technical rigor and attention to detail
Collaborative problem solving
Ownership of system performance and reliability
Continuous learning and knowledge sharing

Available for qualified candidates

Sony Interactive Entertainment (PlayStation) is hiring a Senior Software Engineer, ML Platform
