KLA

Sr. HPC Systems Architect (Storage)

Ann Arbor, MI Full time

Company Overview

KLA is a global leader in diversified electronics for the semiconductor manufacturing ecosystem. Virtually every electronic device in the world is produced using our technologies. No laptop, smartphone, wearable device, voice-controlled gadget, flexible screen, VR device or smart car would have made it into your hands without us. KLA invents systems and solutions for the manufacturing of wafers and reticles, integrated circuits, packaging, printed circuit boards and flat panel displays. The innovative ideas and devices that are advancing humanity all begin with inspiration, research and development. KLA places an above-average focus on innovation, investing 15% of sales back into R&D. Our expert teams of physicists, engineers, data scientists and problem-solvers work together with the world’s leading technology providers to accelerate the delivery of tomorrow’s electronic devices. Life here is exciting and our teams thrive on tackling really hard problems. There is never a dull moment with us.

Job Description/Preferred Qualifications

About the role:

In this senior role, you will own the architecture, deployment, and long‑term scalability of enterprise HPC storage and compute platforms. You’ll lead systems from early design through production, partnering across engineering, manufacturing, and vendors to deliver high‑performance, highly available HPC infrastructure at scale.

This role is ideal for someone who enjoys deep technical ownership and solving complex infrastructure challenges in real production environments. You’ll influence architectural decisions, build storage systems that truly scale, and work on HPC platforms used in real‑world, mission‑critical environments, not proofs of concept!

Job duties include, but are not limited to:

  • Own the design, implementation, and ongoing support of high‑performance computing (HPC) clusters, taking accountability for system performance, reliability, and scalability
  • Serve as a technical authority for HPC storage, with deep hands‑on expertise in parallel file systems such as Lustre, GPFS, and BeeGFS
  • Apply advanced systems knowledge across CPU and GPU architectures, high‑bandwidth interconnects, and robust storage subsystems to deliver balanced, high‑performance solutions
  • Lead the creation of hardware BOMs for HPC clusters, working directly with vendors and coordinating hardware release activities
  • Design, configure, and optimize Linux operating systems for HPC environments, applying strong, distro‑agnostic Linux expertise
  • Translate project specifications and performance requirements into subsystem‑ and system‑level designs, driving execution while meeting technical and schedule commitments
  • Support the design, release, and transition of new systems to manufacturing and customers, providing high‑quality golden images, procedures, scripts, and documentation
  • Manage EOL part re‑qualification activities to ensure long‑term system viability and supportability
  • Act as a senior escalation point for complex in‑house and in‑field issues, providing hands‑on troubleshooting and resolution

Qualifications include, but are not limited to:

  • BS or MS in Computer Science, Computer Engineering, or a related field
  • 5+ years of progressive experience in HPC systems, storage, or large‑scale Linux infrastructure
  • Deep, hands‑on expertise in HPC storage and Linux‑based infrastructure
  • Strong, distro‑agnostic Linux experience (Rocky, RHEL, SUSE, Ubuntu)
  • Proven experience designing and operating large‑scale parallel storage systems
  • Strong understanding of HPC hardware platforms (servers, GPUs, networking, storage, BIOS/BMC)
  • Advanced Linux systems knowledge (PXE/netboot, systemd, HA concepts)
  • Solid networking fundamentals (TCP/IP, DNS, DHCP, LDAP, HTTP)
  • Strong scripting skills in Shell and Python
  • Experience with configuration management and automation (Salt, Puppet, Chef, etc.)
  • Ability to lead complex work independently while influencing cross‑functional teams

Preferred Qualifications:

  • Strong DevOps and automation mindset (CI/CD pipelines, Git, infrastructure as code)
  • Experience with containers for HPC (Singularity, Docker)
  • Monitoring and observability experience (Prometheus, Grafana)
  • Familiarity with Apache/Nginx and supporting infrastructure services

        Minimum Qualifications

        Requires a minimum of 8 years of related experience with a Bachelor's degree; or 6 years with a Master's degree; or 3 years with a PhD; or equivalent experience.

        Base Pay Range: $129,600.00 - $220,300.00 Annually

        Primary Location: USA-MI-Ann Arbor-KLA

        KLA’s total rewards package for employees may also include participation in performance incentive programs and eligibility for additional benefits including but not limited to: medical, dental, vision, life, and other voluntary benefits, 401(k) including company matching, employee stock purchase program (ESPP), student debt assistance, tuition reimbursement program, development and career growth opportunities and programs, financial planning benefits, wellness benefits including an employee assistance program (EAP), paid time off and paid company holidays, and family care and bonding leave.

        Interns are eligible for some of the benefits listed. Our pay ranges are determined by role, level, and location. The range displayed reflects the pay for this position in the primary location identified in this posting. Actual pay depends on several factors, including state minimum wage rates, location, job-related skills, experience, and relevant education level or training. We are committed to complying with all applicable federal and state minimum wage requirements. If applicable, your recruiter can share more about the specific pay range for your preferred location during the hiring process.

        KLA is proud to be an Equal Opportunity Employer. We will ensure that qualified individuals with disabilities are provided reasonable accommodation to participate in the job application or interview process, to perform essential job functions, and to receive other benefits and privileges of employment. Please contact us at talent.acquisition@kla.com or at +1-408-352-2808 to request accommodation.

        Be aware of potentially fraudulent job postings or suspicious recruiting activity by persons posing as KLA employees. KLA never asks for any financial compensation to be considered for an interview, to become an employee, or for equipment. Further, KLA does not work with any recruiters or third parties who charge such fees either directly or on behalf of KLA. Please ensure that you have searched KLA’s Careers website for legitimate job postings. KLA follows a recruiting process that involves multiple interviews in person or by video conference with our hiring managers. If you are concerned that a communication, an interview, an offer of employment, or an employee is not legitimate, please send an email to talent.acquisition@kla.com to confirm that the person you are communicating with is an employee. We take your privacy very seriously and handle your information confidentially.