Location: Alexandria, Virginia, United States
Salary: USD 92,300 - 166,850 Yearly

Leidos is hiring a Reliability Engineer

Leidos is hiring a Senior Reliability Engineer to join a mission-critical program supporting a Department of War enterprise data and analytics initiative. You will work directly with government partners and fellow engineers to translate operational needs into scalable, resilient, production-ready solutions. You will play a key role in product planning, execution, and continuous improvement within a mission-driven culture focused on outthinking, outbuilding, and outpacing the status quo.

What You'll Do

  • Develop and implement strategies leveraging FOSS, COTS, and GOTS technologies to enhance platform reliability, resiliency, and scalability.
  • Conduct lab-based SWIL and HWIL testing to validate system performance and ensure components meet scalability and operational requirements.
  • Identify performance bottlenecks, analyze usage patterns, and recommend improvements to enhance system efficiency and scalability.
  • Identify, diagnose, and address recurring incidents by performing root cause analysis and implementing preventative measures.
  • Produce and brief comprehensive resiliency and scalability assessments, providing insights into system behavior under load, failure modes, and recovery conditions.
  • Translate findings into inputs for SLAs and KPPs to support informed decision-making by leadership.
  • Prepare, maintain, and execute a System Engineering Plan (SEP) for managing all systems architecture and system engineering aspects of the program.
  • Conduct systems engineering activities required to specify, build, and maintain system engineering designs for the System.
  • Design, prepare, and document systems engineering and cybersecurity artifacts for the System.
  • Support the Government in recommending and conducting enterprise system architecture activities.
  • Define, document, maintain, and promulgate APIs and technical standards for using and interoperating within and outside the System.
  • Design, engineer, integrate, and continuously improve the underlying infrastructure of the System.
  • Identify, prepare, track, secure, and integrate government, commercial, and open-source tools and services into the System.
  • Design, architect, engineer, and continuously improve the UI and UX components of the Platform.
  • Perform site reliability engineering to build and maintain a reliable, scalable, and efficient System by applying software engineering principles to operational tasks.

What We're Looking For

  • Active Top Secret (TS) clearance with SCI eligibility.
  • Bachelor’s degree in Computer Science, Engineering, Information Systems, or related technical discipline and 8–12 years of relevant experience OR Master’s degree in a related field and 6–10 years of relevant experience.
  • Experience engineering and supporting enterprise cloud environments (AWS, Azure, or GCP).
  • Experience implementing monitoring, observability, and performance management solutions.
  • Experience conducting root cause analysis and implementing systemic reliability improvements.
  • Experience integrating reliability engineering practices into DevSecOps pipelines.
  • Experience operating within SAFe or large-scale Agile frameworks supporting enterprise systems.
  • Experience with FOSS, COTS, and GOTS technologies.
  • Proven experience conducting SWIL and HWIL testing.
  • Strong understanding of system performance analysis and optimization.
  • Experience performing root cause analysis and implementing preventative measures.
  • Ability to produce and brief comprehensive technical assessments.
  • Experience preparing and maintaining System Engineering Plans (SEP).
  • Strong documentation and communication skills.

Nice to Have

  • Active TS/SCI clearance.
  • Experience with DoD systems and environments.
  • Familiarity with NIST security controls and Zero Trust compliance.
  • Experience defining and tracking KPIs and SLOs.
  • Knowledge of AI/ML model serving and deployment.
  • Experience participating in Engineering Control Board (ECB) processes.
  • Familiarity with cloud environments and DevSecOps practices.
  • SAFe Agilist (SA) or related SAFe certification.
  • Experience supporting multi-enclave DoD cloud environments.
  • Experience implementing automated failover, redundancy, and capacity management solutions.
  • Experience supporting enterprise-scale data, analytics, or AI platforms.
  • Experience implementing Zero Trust-aligned resiliency patterns.
  • Relevant cloud certification (AWS, Azure, or GCP).

Technical Stack

  • FOSS, COTS, GOTS
  • AWS, Azure, GCP

Benefits & Compensation

  • Salary Range: $92,300.00 - $166,850.00

Leidos is an equal opportunity employer.

Required Skills
AWS, Azure, GCP, FOSS, COTS, GOTS, Monitoring, Observability, Performance Management, Root Cause Analysis, Reliability Engineering, Cloud Engineering, Enterprise Systems
About company
Leidos
Leidos Engineering provides consulting services to public-owned and investor-owned electric utilities for comprehensive engineering design of electrical transmission, substation, and distribution projects nationwide. The Power Delivery Services Team supports utilities and mobile operators with reliable power and telecommunication expertise.
Job Details
Department: Engineering
Category: Infrastructure
Posted: 2 months ago