Role Overview

As a Senior Systems Development Engineer, you will drive the design, validation, and performance optimization of advanced computing platforms engineered for artificial intelligence workloads. Based in Austin, Texas, you will ensure system-level readiness for demanding AI applications, from infrastructure bring-up to deployment at scale.

Key Responsibilities

Lead the deployment, configuration, and functional validation of high-performance computing systems, including GPU servers, accelerator racks, and high-speed networking fabrics
Perform deep co-validation across hardware and software layers, ensuring compatibility and stability of CPUs, GPUs, DPUs, NICs, memory subsystems, and I/O interfaces under AI-intensive workloads
Diagnose and resolve complex issues spanning BIOS/UEFI, BMC firmware, kernel subsystems, device drivers, container environments, orchestration frameworks, and AI model runtimes
Validate PCIe topology, NUMA alignment, and data-path efficiency critical to model training and inference performance
Analyze system telemetry, kernel logs, hardware events, GPU health metrics, and fabric diagnostics to identify root causes of failures or bottlenecks
Conduct root-cause analysis on training instability, model divergence, and hardware degradation under sustained AI loads
Collaborate with silicon, firmware, operating system, and AI software teams to implement rapid resolutions and drive platform improvements
Deploy and manage AI clusters integrating GPU servers, accelerators, InfiniBand or RoCE networking, and scalable storage solutions
Verify cluster readiness for distributed training by evaluating bandwidth, latency, network topology, and gradient synchronization efficiency
Integrate with orchestration platforms and container runtimes such as Kubernetes, Slurm, Ray, Docker, and Singularity to optimize AI pipeline execution
Partner with data center engineering on rack integration, power and thermal planning, and capacity forecasting
Run and interpret industry-standard AI benchmarks including MLPerf Training, MLPerf Inference, and SPEC AI suites
Develop custom benchmarking tools for transformer models, large language models, computer vision, multimodal systems, and recommendation engines
Deliver actionable optimization recommendations across hardware, OS, drivers, and AI frameworks based on benchmark results
Document technical findings and lead cross-functional initiatives to enhance platform performance and reliability

Required Qualifications

Bachelor’s or Master’s degree in Computer Engineering, Computer Science, Electrical Engineering, or a related technical discipline
Minimum of five years of experience in system development, platform engineering, or hardware-software validation
Strong grasp of computer architecture, including CPU/GPU/accelerator design, memory hierarchies, and I/O subsystems

Technical Environment

BIOS/UEFI, BMC, firmware, kernel drivers, PCIe, NUMA, InfiniBand, RoCE, Kubernetes, Slurm, Ray, Docker, Singularity, MLPerf Training, MLPerf Inference, SPEC AI Benchmarks

Dell Technologies is hiring a Senior System Development Engineer – AI Technologies
