NVIDIA is hiring a Solution Architect – AI Factory

About the Role

NVIDIA is looking for a Solution Architect – AI Factory to guide customers in adopting our compute, networking, and software stacks to deliver end-to-end GenAI and Agentic AI solutions. You will solve complex problems to bring NVIDIA's premier technologies to life in the cloud and datacenter.

What You'll Do

  • Guide customers in adopting NVIDIA's compute, networking, and software stacks for GenAI and Agentic AI solutions.
  • Use cloud-native methodologies, low-latency networks, and accelerated compute to build modern AI factories.
  • Share knowledge by delivering demos, assisting with proofs of concept, and writing papers and developer blogs.
  • Collaborate with executives and engineering to solve complex problems and deploy NVIDIA technologies.
  • Tackle unsolved problems in the industry and help deploy and operationalize AI solutions at scale.

What We're Looking For

  • MS or PhD in Engineering, Computer Science, or a related field (or equivalent experience).
  • Established track record working with AI and HPC clusters, both on-premises and cloud-based.
  • 4+ years of proven experience with cluster management and related tools, including Docker containers, Slurm, Kubernetes, and Ansible.
  • Hands-on experience with datacenter MEP (mechanical, electrical, and plumbing), network, storage, and cluster configuration and debugging.
  • Strong analytical and problem-solving skills, along with an ability to articulate what you know to others.
  • Ability to multitask efficiently in a dynamic environment.

Nice to Have

  • Strong coding and debugging skills, including experience with CUDA, Python, C/C++, Bash, AI frameworks, and Linux utilities.
  • Demonstrated expertise through projects or open-source contributions involving GPU workloads, Kubernetes, InfiniBand, Ethernet, or other areas related to high-performance clusters and hybrid cloud solutions.
  • Hands-on experience with NVIDIA Enterprise software products, Base Command Manager, Run:ai, and NVIDIA NIMs.
  • Willingness and ability to learn quickly and solve advanced problems.

Technical Stack

  • Docker Containers, Slurm, Kubernetes, Ansible
  • CUDA, Python, C/C++, Bash, AI frameworks, Linux utilities
  • InfiniBand, Ethernet
  • NVIDIA Enterprise software, Base Command Manager, Run:ai, NVIDIA NIMs

Team & Environment

You will be joining NVIDIA's Strategic Enterprise AI Factory team. We offer a creative and autonomous environment where you'll work alongside some of the most forward-thinking and hardworking people in the world.

NVIDIA is an equal opportunity employer.

Required Skills
Docker Containers, Slurm, Kubernetes, Ansible, CUDA, Python, C/C++, Bash, AI frameworks, Linux utilities, Solution Architecture, AI Infrastructure, High-Performance Computing, Distributed Systems, Cloud Platforms
About company
NVIDIA
NVIDIA builds accelerated computing platforms and AI technologies that power advancements in areas such as generative AI, data centers, robotics, and digital twins.
Job Details
Category: Management
Posted: 4 months ago