NVIDIA is hiring a Senior Deep Learning Engineer, Visual Generative AI

NVIDIA is looking for a Senior Deep Learning Engineer to focus on Visual Generative AI. You will optimize and deploy deep learning models for efficient inference across GPU platforms, working closely with research and engineering teams to transition AI models from prototype to production.

What You'll Do

  • Optimize deep learning models for low-latency, high-throughput inference, with a focus on diffusion models for Visual Generative AI applications.
  • Convert, deploy, and optimize models for efficient inference using frameworks such as TensorRT, TensorRT-LLM, and vLLM.
  • Understand, analyze, profile, and optimize performance of deep learning workloads on state-of-the-art NVIDIA GPU hardware and software platforms.
  • Collaborate with internal and partner research scientists and software engineers to ensure seamless integration of AI models from training to deployment.
  • Contribute to the development of automation and tooling for NVIDIA Inference Microservices (NIMs) and inference optimization, including creating automated benchmarks to track performance regressions.

What We're Looking For

  • 3+ years of experience in DL model implementation and software development.
  • A BSc, MS, or PhD degree in Computer Science, Computer Architecture, or a related technical field.
  • Extensive knowledge of at least one DL framework (PyTorch, JAX, TensorFlow), with practical experience in PyTorch required.
  • Deep understanding of transformer architectures, attention mechanisms, foundation model architectures for Visual Generative AI (e.g., U-Net, DiT), and inference bottlenecks.
  • Excellent Python programming skills.
  • Strong problem solving and analytical skills.
  • Solid grounding in algorithms and DL fundamentals.
  • Familiarity with Docker containerization fundamentals.

Nice to Have

  • Experience in performance measurements and profiling.
  • Hands-on experience with model optimization and serving frameworks such as TensorRT, TensorRT-LLM, vLLM, SGLang, and ONNX.
  • Deep understanding of distributed systems for large-scale model inference and serving.
  • Experience with extending and leveraging open-source tools for Visual Generative AI workflow creation.
  • Familiarity with the latest trends in Visual Generative AI for content creation.

Technical Stack

  • PyTorch, TensorRT, TensorRT-LLM, vLLM, SGLang, ONNX, Docker, Python

Team & Environment

You will work with world-class research scientists, software engineers, and hardware specialists.

We are an equal opportunity employer and value diversity at our company. We do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.

Required Skills
PyTorch, TensorRT, TensorRT-LLM, vLLM, SGLang, ONNX, Docker, Python, Deep Learning, Generative AI, Computer Vision, LLM, Diffusion Models, Model Optimization, Distributed Training
About company
NVIDIA
NVIDIA builds accelerated computing platforms and AI technologies that power advancements in areas such as generative AI, data centers, robotics, and digital twins.
Job Details
Category: Data
Posted 8 months ago