Nvidia is hiring a Senior System Software Engineer, NCCL - Partner Enablement

About the Role

This role involves developing and optimizing system software to support partner integration and deployment of high-performance computing solutions, with a focus on improving performance, scalability, and collaboration across distributed systems.

Responsibilities

  • Design and optimize low-level software components for distributed computing environments
  • Collaborate with partner engineering teams to integrate communication libraries
  • Improve performance and scalability of system-level software in GPU-accelerated clusters
  • Diagnose and resolve complex software issues impacting partner deployments
  • Develop tools and frameworks to streamline integration workflows
  • Support debugging and tuning of communication primitives across hardware platforms
  • Contribute to the evolution of collective communication algorithms
  • Work closely with hardware and driver teams to ensure compatibility
  • Produce technical documentation for internal and external stakeholders
  • Assist partners in adopting optimized communication libraries
  • Analyze system bottlenecks and propose architectural improvements
  • Ensure software reliability under high-load conditions
  • Participate in code reviews and maintain code quality standards
  • Implement testing strategies for cross-platform validation
  • Stay current with advancements in parallel computing and networking
  • Optimize software for diverse data center configurations
  • Support performance benchmarking and profiling activities
  • Integrate feedback from partners into product enhancements
  • Contribute to open-source projects related to communication layers
  • Facilitate knowledge transfer between internal and external teams
  • Ensure compliance with software interface standards
  • Develop proof-of-concept implementations for new features
  • Collaborate on defining roadmap priorities for system software
  • Troubleshoot interoperability issues across software stacks
  • Promote best practices in system-level software development

Compensation

Competitive salary and benefits package commensurate with experience

Work Arrangement

Hybrid work model with flexibility based on role and location

Team

Part of a global engineering team focused on system software and partner collaboration

About the Team

This team focuses on developing core communication libraries that power large-scale AI and high-performance computing systems, enabling seamless integration across diverse hardware and software environments.

Why This Role Matters

The work directly impacts the efficiency and scalability of distributed computing solutions used by leading research and enterprise organizations worldwide.

Limited sponsorship may be available for qualified candidates

Required Skills
C/C++, Python, Linux, Docker, Kubernetes, Distributed Systems, Networking, Performance Optimization
About the Company
Nvidia
NVIDIA's invention of the GPU sparked the growth of the PC gaming market, redefined modern computer graphics, and revolutionized parallel computing for science and engineering. Today, the company is known as 'the AI computing company,' with the GPU acting as the brain of computers, robots, and self-driving cars that can perceive and understand the world.
Job Details
Category: Other
Posted: 10 months ago