GDIT

HPC Technical Lead

Any Location / Remote Full time

Type of Requisition:

Regular

Clearance Level Must Currently Possess:

None

Clearance Level Must Be Able to Obtain:

None

Public Trust/Other Required:

None

Job Family:

IT Infrastructure and Operations

Job Qualifications:

Skills:

High Performance Computing (HPC), Lustre, Portable Batch System (PBS), Python Software Development, Red Hat Enterprise Linux (RHEL)

Certifications:

None

Experience:

10+ years of related experience

US Citizenship Required:

Yes

Job Description:

GDIT is looking for an HPC Technical Lead who will provide deep technical expertise and leadership for the WCOSS production environment, ensuring the stability, performance, and scalability of NOAA’s operational HPC systems. This role focuses on technical execution, troubleshooting, and guiding the team in implementing best practices for HPC operations.

Key Responsibilities

  • Technical Leadership
    • Serve as the primary technical authority for all HPC operational matters.
    • Lead root‑cause analysis and problem resolution for critical incidents across compute, storage, and interconnect components.
  • System Performance & Reliability
    • Monitor, analyze, and optimize performance across compute nodes, interconnects, and storage subsystems.
    • Conduct proactive health checks and performance tuning to ensure system readiness for 24×7 mission‑critical NOAA workloads.
  • Change & Configuration Management
    • Lead technical planning and readiness reviews for all system upgrades, patches, and enhancements.
    • Maintain configuration baselines and ensure compliance with security requirements (e.g., RMF/STIG).
  • Collaboration & Customer Technical Interface
    • Act as the primary technical point of contact for NWS/NOAA for detailed HPC discussions, system behavior, and operational issues.
    • Coordinate with NOAA scientific teams to understand modeling workload needs and optimize HPC resources accordingly.

Required Skills

  • 10+ years of hands‑on HPC systems administration and troubleshooting experience, including Cray, SGI, or comparable large‑scale systems.
  • Extensive experience supporting Federal HPC environments, demonstrating readiness for NOAA/NWS operational environments.
  • Deep Linux expertise, including SLES, RHEL, and CentOS across multiple HPC platforms.
  • Strong technical experience with HPC storage (e.g., Lustre), interconnects (e.g., InfiniBand), and performance tuning of large‑scale computing systems.
  • Proven leadership of HPC technical teams, including mentoring and directing system administrators and engineers supporting very large core counts.
  • Demonstrated success performing root cause analysis, escalated troubleshooting, and incident recovery in production HPC environments.
  • Experience implementing STIG/RMF security controls across HPC systems and applying DoW‑grade configuration compliance.
  • Excellent communication skills, capable of translating complex technical issues to customers and stakeholders.

Preferred Qualifications

  • Prior experience supporting NOAA/NWS operational HPC systems, especially in real‑time weather or climate modeling environments.
  • Experience designing or improving HPC system architectures, monitoring frameworks, or performance analysis pipelines.
  • Experience presenting technical findings to scientific, engineering, or federal customer groups.

The likely salary range for this position is $182,750 - $247,250. This is not, however, a guarantee of compensation or salary. Rather, salary will be set based on experience, geographic location and possibly contractual requirements and could fall outside of this range.

Scheduled Weekly Hours:

40

Travel Required:

Less than 10%

Telecommuting Options:

Remote

Work Location:

Any Location / Remote

Total Rewards at GDIT:

Our benefits package for all US-based employees includes a variety of medical plan options, some with Health Savings Accounts, dental plan options, a vision plan, and a 401(k) plan offering the ability to contribute both pre- and post-tax dollars up to the IRS annual limits and receive a company match. To encourage work/life balance, GDIT offers employees full flex work weeks where possible and a variety of paid time off plans, including vacation, sick and personal time, holidays, paid parental, military, bereavement and jury duty leave. GDIT typically provides new employees with 15 days of paid leave per calendar year to be used for vacations, personal business, and illness and an additional 10 paid holidays per year. Paid leave and paid holidays are prorated based on the employee’s date of hire. The GDIT Paid Family Leave program provides a total of up to 160 hours of paid leave in a rolling 12-month period for eligible employees. To ensure our employees are able to protect their income, other offerings such as short- and long-term disability benefits, life, accidental death and dismemberment, personal accident, critical illness and business travel and accident insurance are provided or available. We regularly review our Total Rewards package to ensure our offerings are competitive and reflect what our employees have told us they value most.

We are GDIT. A global technology and professional services company that delivers consulting, technology and mission services to every major agency across the U.S. government, defense and intelligence community. Our 30,000 experts extract the power of technology to create immediate value and deliver solutions at the edge of innovation. We operate across 50 countries worldwide, offering leading capabilities in digital modernization, AI/ML, Cloud, Cyber and application development. Together with our clients, we strive to create a safer, smarter world by harnessing the power of deep expertise and advanced technology.

Join our Talent Community to stay up to date on our career opportunities and events at gdit.com/tc.

Equal Opportunity Employer / Individuals with Disabilities / Protected Veterans