Austin, Texas, United States On-site

Celestica is hiring a Compute Server & Storage Test Engineer

About the Role

Celestica is hiring a Senior Lead Storage and Server Test Engineer to play a pivotal role in validating the core infrastructure for AI data centers. You will design, develop, and execute comprehensive test strategies for storage and server hardware, firmware, and software, ensuring they meet the high-performance and reliability demands of modern AI/ML workloads.

What You'll Do

  • Define, develop, and implement comprehensive test plans and strategies for all storage and server components within the AI data center environment.
  • Lead the test team in designing, executing, and analyzing complex test cases, including functional, performance, reliability, stress, and endurance testing.
  • Mentor and provide technical guidance to junior test engineers.
  • Design and implement automated test frameworks and scripts using languages like Python, Go, or similar.
  • Conduct in-depth performance analysis and bottleneck identification for storage systems and server platforms, including debugging issues related to BMC functionality.
  • Develop and maintain robust testbeds and infrastructure for continuous integration and validation.
  • Utilize open-source and commercial test tools relevant to storage, server, and OpenBMC validation.
  • Collaborate closely with hardware design, software development, infrastructure, and AI/ML engineering teams to integrate testing throughout the product lifecycle.
  • Communicate test progress, results, and critical issues effectively to stakeholders, including executive leadership.
  • Develop specialized test methodologies to validate performance and reliability under heavy AI/ML workloads.
  • Understand and test the interactions between GPU-accelerated computing, high-speed networking, and storage systems.

What We're Looking For

  • Bachelor's or Master's degree in Computer Science, Electrical Engineering, or a related technical field.
  • 3+ years of experience in hardware and/or software testing, with at least 5 years focused on enterprise-level storage and server systems.
  • Proven experience in a lead or senior technical role, mentoring and guiding other engineers.
  • Deep expertise in storage technologies including NVMe, SAS/SATA SSDs/HDDs, RAID, distributed file systems (e.g., Ceph, Lustre, GPFS), SAN, and NAS.
  • Strong understanding of server architectures (x86, ARM, GPU servers), CPU/memory subsystems, PCIe, power management, and Baseboard Management Controller (BMC) functionality.
  • Proficiency in scripting languages (Python, Bash) for test automation and data analysis.
  • Experience with Linux operating systems (e.g., Ubuntu, CentOS, RHEL) and command-line tools.
  • Familiarity with networking concepts (Ethernet, TCP/IP, InfiniBand) and network testing methodologies.
  • Experience with test methodologies such as performance testing, reliability testing, stress testing, and fault injection.
  • Excellent problem-solving, analytical, and debugging skills.
  • Strong communication and interpersonal skills, with the ability to collaborate effectively across diverse teams.

Nice to Have

  • Familiarity with OCP (Open Compute Project).
  • Experience with cloud environments (AWS, Azure, GCP) and virtualization technologies.
  • Knowledge of containerization technologies (Docker, Kubernetes).
  • Familiarity with AI/ML frameworks (e.g., TensorFlow, PyTorch) and their infrastructure requirements.
  • Experience with performance benchmarking and profiling tools (e.g., fio, Iometer, perf, VTune).
  • Contributions to open-source projects related to storage, servers, or testing.
  • Certifications in relevant technologies (e.g., NetApp, Dell EMC, HPE, NVIDIA).

Technical Stack

  • Languages & Scripting: Python, Go, Bash
  • Operating Systems: Linux (Ubuntu, CentOS, RHEL)
  • Storage Tech: NVMe, SSD, HDD, RAID, Ceph, Lustre, GPFS, SAN, NAS
  • Server & Hardware: x86, ARM, GPU, PCIe, OpenBMC
  • Networking: Ethernet, TCP/IP, InfiniBand
  • Tools & Frameworks: Docker, Kubernetes, TensorFlow, PyTorch, fio, Iometer, Perf, VTune

Team & Environment

This is an Individual Contributor role.

Work Mode

This position is onsite in Austin, Texas, USA.

Celestica's policy on equal employment opportunity prohibits discrimination based on race, color, creed, religion, national origin, gender, sexual orientation, gender identity, age, marital status, veteran or disability status, or other characteristics protected by law.

Required Skills
Python, Go, Bash, Linux, Ubuntu, CentOS, RHEL, NVMe, SSD, HDD, RAID, Ceph, Lustre, SAS, SATA, storage testing, server testing, hardware testing, software testing
About company
Celestica
Celestica enables the world’s best brands. Through a customer-centric approach, they partner with leading companies in Aerospace and Defense, Communications, Enterprise, HealthTech, Industrial, Capital Equipment and Energy to deliver solutions for their most complex challenges. They provide design, manufacturing, hardware platform and supply chain solutions.
Job Details
Department Engineering
Category qa_testing
Posted 2 months ago