NVIDIA

Senior Manager, Datacenter Software - Firmware Release

US, CA, Santa Clara
Full time

NVIDIA is leading the way in groundbreaking developments in Artificial Intelligence, High-Performance Computing, and Visualization. Our invention serves as the visual cortex of modern computers and is at the heart of our products and services. Our work reveals new frontiers to explore, inspires remarkable creativity and discovery, and fuels what were once science-fiction inventions, from artificial intelligence to autonomous cars. NVIDIA is seeking exceptional individuals like you to help us drive the next wave of artificial intelligence. NVIDIA DGX, HGX, and MGX servers deliver the world's leading solutions for enterprise AI infrastructure at scale. Enterprises need computing infrastructure that can be easily managed in a data center.

We are the Datacenter Software Tools team at NVIDIA. We deliver infrastructure and tools for data center deployment, firmware and software package deployment, and server manageability. We are looking for a hard-working senior manager with experience in datacenter software and firmware release management and infrastructure. In this role, you will drive the release of software and firmware for the world’s best resilient GPU-based datacenter servers. This is a highly visible role at NVIDIA, responsible for guaranteeing high-quality infrastructure and tooling for software and firmware release features for NVIDIA's scale-up and scale-out solutions, spanning frontend, backend, infrastructure, and CI/CD-based automation. This role requires you to work closely with multi-functional teams, including system architects, firmware developers, compliance and security teams, and product management, to deliver exceptional software and firmware release solutions. Join us at the forefront of technological advancement.

What you’ll be doing:

  • Provide technical leadership on how releases are delivered to end customers of rack-scale computing systems built on tightly coupled compute and switch trays, and build end-to-end infrastructure and workflows that ensure the highest-quality releases for data center firmware and software.

  • Define release scope for rack-scale products, working cross-functionally with product management, technical architects, and program management. Deliver these releases through the validation matrix for customer end-use cases, ensuring that delivered firmware and software are of the highest quality. Solutions must scale, be resilient, and support secure upgrades or rollbacks across diverse customer scenarios.

  • Influence architecture, design, and implementation decisions for compute and switch tray software and firmware, ensuring quality across nightly, dev, and production drops for all customer use cases, with the right release-validation strategy at each phase of the development life cycle.

  • Partner with all matrixed organizations (developers, SWQA, and product engineering) to shift release quality left across development and QA in a very fast-moving environment, using end-to-end CI/CD so that no bugs reach customer sites. Enforce this with well-placed quality metrics for every product milestone, and track KPIs published at a regular cadence. Monitor and report release progress to all stakeholders.

  • Own ingestion and packaging of software and firmware binaries, readying them for deployment at scale across multiple platforms and different CSP environments.

  • Document procedures and engage in collaborative discussions to refine software and firmware release workflows, including identifying and resolving issues in release milestone packaging and deployment procedures and removing bottlenecks. Shape the team's roadmap and drive innovation, including self-service interfaces, automation, AI-assisted validation and triage, and sophisticated release-compliance reporting.

  • Continuously review and identify improvement opportunities in established release processes, infrastructure, and practices. Ensure the teams are performing in the most efficient and transparent way with a strong focus on automation and measurable targets.

What we need to see:

  • 12+ years overall in the software industry, with specialization in system software and/or firmware development.

  • 5+ years of proven hands-on technical leadership of multi-team organizations across data center firmware such as BMC, FPGA, CPLD, and network switch firmware, including building infrastructure for continuous improvement of release quality.

  • BS/MS/PhD in CS, CE, EE, or a related technical field, or equivalent experience.

  • Prior experience in systems software or firmware development, with a proven history of guiding complex software features or products through the entire product life cycle, ideally on rack-scale datacenter products.

  • Strong understanding of computer system architecture, operating systems principles, HW-SW interactions, and performance analysis/optimizations.

  • Working fluency in Python and Linux sufficient to review designs, prototype tooling, and debug production issues alongside the team. Hands-on experience with web application frameworks and CI/CD platforms (Jenkins, GitLab, Artifactory).

  • Track record of balancing multiple projects with competing priorities and delivering against measurable benchmarks (MTTR, specification compliance, release cadence, automation coverage).

  • Excellent communication and collaboration skills across teams and time zones.

Ways to stand out from the crowd:

  • Familiarity with the architecture of datacenter server software and experience with the in-band and out-of-band management of firmware and hardware components.

  • Understanding of the REST architectural style, especially JSON over HTTPS with OAuth, and of DMTF firmware management protocols such as PLDM and SPDM.

  • Proven experience in developing a self-service release infrastructure, resulting in clear reductions in onboarding SLA times.

  • Experience integrating AI/LLM tooling into engineering workflows – for triage, test generation, code review, or release validation.

  • Experience leading geographically distributed engineering teams across the US and APAC.

NVIDIA is widely considered to be one of the technology world’s most desirable employers. We have some of the most forward-thinking and hardworking people in the world working for us. If you're creative and autonomous, we want to hear from you!

Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 272,000 USD - 431,250 USD.

You will also be eligible for equity and benefits.

Applications for this job will be accepted at least until May 4, 2026.

This posting is for an existing vacancy. 

NVIDIA uses AI tools in its recruiting processes.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.