Location: United States of America (Remote, country-wide) | Salary: USD 125,000 - 180,000 per year

Nebius is hiring a Senior Hardware Support Engineer

About the Role

As a Senior Hardware Support Engineer, you will play a central role in maintaining the reliability of large-scale production hardware infrastructure. You'll lead deep-dive investigations into complex hardware and firmware issues, identifying root causes and implementing corrective actions to minimize downtime and improve system resilience.

Key Responsibilities

  • Lead end-to-end analysis of critical hardware failures, tracing issues from initial symptoms to resolution
  • Identify recurring failure patterns and drive systemic fixes to improve fleet-wide reliability
  • Serve as the primary technical escalation point during high-severity hardware incidents
  • Collaborate with hardware vendors to coordinate diagnostics, replacements, firmware updates, and long-term remediation
  • Work alongside internal engineering teams to validate hardware fixes and prevent future issues
  • Perform pre-deployment validation of server hardware and firmware across diverse platforms
  • Apply structured problem-solving frameworks to diagnose and document hardware-related outages
  • Support on-site operations during critical events with clear technical guidance and coordination
  • Enhance monitoring, failure tracking, and reporting systems to improve hardware observability
  • Contribute to strategic initiatives aimed at increasing long-term platform stability

Qualifications

You must have extensive experience with server hardware in production environments, including deep knowledge of core components such as CPUs, memory, storage, power systems, and BMCs. You should also have a proven ability to analyze telemetry and log data to diagnose failure modes, and experience applying formal incident management methodologies to resolve issues efficiently.

Strong communication skills are essential, as the role involves coordinating across engineering, operations, and vendor teams. You must be comfortable managing multiple concurrent investigations under pressure and delivering clear technical documentation.

Preferred Experience

  • Hands-on work with GPU-intensive systems, AI workloads, or high-performance computing infrastructure
  • Experience managing firmware lifecycles and validating large-scale rollouts
  • Familiarity with Linux-based environments and infrastructure automation tools
  • Track record of improving hardware reliability metrics across large fleets

Work Environment

This role is remote within the United States, with occasional travel required for on-site support during critical hardware events. The position operates in a fast-paced, innovation-driven culture focused on advancing AI and machine learning technologies.

Compensation & Benefits

Base salary ranges from $125,000 to $180,000 annually, with an annual performance-based bonus. Benefits include comprehensive medical, dental, and vision coverage; a 401(k) plan with company contribution; flexible paid time off; paid parental leave; and support for professional development.

Required Skills

  • Server hardware
  • Data center operations
  • Root cause analysis
  • Hardware failure diagnosis
  • Firmware troubleshooting
  • CPU architecture
  • Memory systems
  • Storage systems
  • Networking hardware
  • Power systems
  • BMC (Baseboard Management Controller)
  • Incident management
  • IT service management
  • Vendor management and coordination
  • Structured problem solving

About the Company

Nebius is leading a new era in cloud computing to serve the global AI economy. It creates tools and resources for customers to solve real-world challenges without massive infrastructure costs.

Job Details

  • Department: Engineering
  • Category: Embedded
  • Posted: 2 months ago