NVIDIA is hiring a Senior Software Engineer, DevOps - Server Infrastructure

About the Role

NVIDIA is looking for a Senior Software Engineer, DevOps - Server Infrastructure to architect the build and deployment process for GPU-based servers used by our Metropolis platforms. This role focuses on automating the delivery pipeline and managing infrastructure for AI and machine learning applications in streaming video and data analytics.

What You'll Do

  • Build, deploy, and maintain GPU-based servers for use in Metropolis blueprints, platforms, and machine learning applications across test, development, and production environments.
  • Lead the design of, and take responsibility for, infrastructure components covering network topologies, streaming servers, and security.
  • Collaborate with different software, IT, Security, and hardware teams across geographies to solve critical problems and performance issues.
  • Establish the configuration environment for servers by creating processes and tools for software development, debugging, testing, benchmarking, and documentation.
  • Automate provisioning and management of bare-metal servers, internal cloud, Microsoft Azure, and Amazon Web Services (AWS).
  • Implement automated monitoring and operating procedures for a range of domains across on-premises and cloud environments.
  • Build and maintain infrastructure for delivering the software artifacts produced by Metropolis application development teams.
  • Create detailed documentation that allows customers, partners, and system integrators to replicate the prototyped deployment architecture.

What We're Looking For

  • BS or MS in Computer Science, Computer Engineering, Electrical Engineering, or related field (or equivalent experience).
  • 8+ years of proven experience in configuration management and Linux server administration in an engineering hardware lab environment.
  • Good programming skills in Python, Shell Scripting, Ansible, Terraform, Helm Template, Docker, and Docker Compose.
  • Good understanding of configuring and managing Elasticsearch, Logstash, Kibana, and the Kafka ecosystem.
  • Software build, package, and delivery skills with Jenkins, Pipeline Scripting, Dockerfile, Artifactory integration, Container Registry, and Helm Package repositories.
  • Good understanding of the Kubernetes ecosystem and Helm-based application deployment patterns.
  • Infrastructure provisioning automation with AWS, GCP, Azure.
  • Experience building configuration management, monitoring, and automation tools.
  • Familiarity with managing edge servers at large scale, deployed in indoor and outdoor environments.
  • Strong interpersonal skills.

Technical Stack

  • Languages & Scripting: Python, Shell Scripting
  • Infrastructure as Code: Ansible, Terraform
  • Containers & Orchestration: Docker, Docker Compose, Kubernetes, Helm Template
  • Observability: Elasticsearch, Logstash, Kibana, Kafka
  • CI/CD & Delivery: Jenkins, Artifactory
  • Cloud Platforms: AWS, GCP, Azure
  • Operating System: Linux

Team & Environment

You will be a key member of the Metropolis team, collaborating with software, IT, Security, and hardware teams across geographies.

Benefits & Compensation

  • Compensation: $184,000 USD - $287,500 USD
  • Highly competitive salaries
  • Comprehensive benefits package
  • Equity eligibility

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.

Required Skills
Python, Shell Scripting, Ansible, Terraform, Helm Template, Docker, Docker Compose, Elasticsearch, Logstash, Kibana, Linux System Administration, CI/CD, Cloud Infrastructure, Networking, Monitoring
About company
NVIDIA
NVIDIA builds accelerated computing platforms and AI technologies that power advancements in areas such as generative AI, data centers, robotics, and digital twins.
Job Details
Category: infrastructure
Posted: 6 months ago