San Francisco, United States of America On-site

Lambda is hiring a Staff Storage Engineer

About the Role

Lead the evolution of storage systems powering next-generation AI infrastructure. In this role, you'll define the strategic direction for storage architecture, ensuring scalability, performance, and reliability across multi-petabyte environments. Your expertise will directly influence how storage integrates with distributed GPU clusters running intensive machine learning workloads.

Key Responsibilities

  • Lead vendor selection and request-for-proposal processes, using performance data and technical analysis to guide decisions
  • Analyze AI and ML workload behaviors to shape storage design, tuning, and capacity planning
  • Drive operational improvements and coordinate deployment strategies across engineering teams
  • Collaborate with leadership to translate business requirements into technical specifications
  • Oversee engineering execution by delegating complex tasks and maintaining alignment with senior stakeholders

Requirements

  • Minimum of 8 years designing, deploying, and managing large-scale production storage systems
  • Proven experience with enterprise storage platforms including Vast, Weka, DDN, NetApp, or equivalent
  • Familiarity with file, block, and object storage architectures and use cases
  • Working knowledge of NFS, SMB, and POSIX-compliant access protocols
  • Hands-on experience with NVMe over Fabrics (TCP, InfiniBand, or RoCE)
  • Understanding of RDMA, GPUDirect Storage, and parallel file systems for high-throughput environments
  • Background in storage security, encryption, data reduction, and multi-tenancy models
  • Experience with backup, disaster recovery, and data protection frameworks
  • At least 5 years using Infrastructure as Code tools such as Terraform or Ansible

Preferred Qualifications

  • Experience with Kubernetes storage interfaces, including CSI and COSI drivers
  • Deep knowledge of storage performance analysis and optimization
  • Familiarity with public cloud storage services and networking (SDN, identity, distributed systems)
  • Track record deploying and managing Software Defined Storage solutions
  • Implementation experience with monitoring platforms for storage and related infrastructure

Required Skills

Vast, Weka, DDN, NetApp, Pure Storage, Dell, IBM, HPE, NFS, SMB, POSIX, File Storage, Block Storage, Object Storage, NVMe/TCP, NVMe/IB, NVMe/RoCE, Storage Engineering, Large-Scale Storage Systems

About the Company
Lambda
Lambda, The Superintelligence Cloud, is a leader in AI cloud infrastructure serving tens of thousands of customers. The company builds and scales AI cloud infrastructure, including high-performance storage, networking, and compute systems for AI training and inference. Lambda's mission is to make compute as ubiquitous as electricity and give everyone the power of superintelligence.
Job Details
Category: Infrastructure
Posted a month ago