H1

Staff Data Engineer

New York | Full Time
At H1, we believe access to the best healthcare information is a basic human right. Our mission is to provide a platform that can optimally inform every doctor interaction globally. This promotes health equity and builds needed trust in healthcare systems. To accomplish this, our teams harness the power of data and AI technology to unlock groundbreaking medical insights and convert those insights into action that results in optimal patient outcomes and accelerates an equitable and inclusive drug development lifecycle. Visit h1.co to learn more about us.

Data Engineering is responsible for the development and delivery of our most important asset, our data. Across thousands of data sources globally, the team ensures that only accurate, normalized data flows to our customers, at the speed required to match real-world changes. As we expand the markets we serve and increase the breadth and depth of data we capture, we need senior technical leaders who can drive execution, scalability, and architectural excellence.

WHAT YOU'LL DO AT H1
As a Staff Data Engineer on the Real World Evidence (RWE) team, you’ll be one of the most senior individual contributors and a key technical leader for our largest datasets and pipelines. You’ll drive some of H1’s most visible data initiatives and help reduce bottlenecks across teams, providing critical technical leadership support during US hours.

You will:
- Act as a self-starter who drives execution independently, taking ownership and initiative with minimal need for day-to-day direction.
- Lead high-visibility RWE projects, starting with claims data, and keep multiple initiatives moving by proactively unblocking teams.
- Own the end-to-end architecture for critical data assets, ensuring solutions are scalable, reliable, and aligned with H1’s long-term vision.
- Design, build, and optimize large-scale data pipelines (hundreds of TBs) for performance, reliability, and cost efficiency.
- Partner with Product, Data Science, and downstream engineering teams to align priorities, manage dependencies, and deliver high-value outcomes.
- Represent engineering in cross-functional forums, shaping roadmaps and reducing reliance on senior leadership for day-to-day decisions.
- Develop deep domain expertise and mentor other engineers, helping raise the technical bar and influence the evolution of our data products.

ABOUT YOU
You’re a hands-on Staff IC and technical leader who thrives in complex data environments. You bring clarity to ambiguity, turn messy problems into reliable systems, and operate with a strong sense of ownership and impact. You collaborate effectively across functions, help others move faster, and are comfortable working across the full data and infrastructure stack.

- You have a proven track record of leading large, complex technical projects from concept to production.
- You bring deep experience building and evolving large-scale data architectures, pipelines, or distributed systems.
- You operate well in high-ambiguity environments, make pragmatic trade-offs, and keep execution moving.
- You communicate clearly with both technical and non-technical partners and influence direction without needing formal authority.
- You raise the bar for engineering excellence through thoughtful design, high-quality code, and strong documentation.
- You invest in others through mentorship, pairing, and constructive feedback.

REQUIREMENTS
- 8+ years as a software, data, or backend engineer building and operating scalable, production-grade systems.
- Experience with large-scale data processing (e.g., Spark/PySpark on EMR or similar) or scalable distributed backend systems, with the ability to quickly deepen expertise in our data stack (PySpark, EMR, Hudi/Delta).
- Strong proficiency in SQL, including writing and optimizing complex queries over large datasets.
- Strong programming experience in Python (or a modern language with the ability to quickly ramp up in Python).
- Experience designing systems or large-scale datasets/pipelines with attention to performance, reliability, and maintainability.
- Hands-on experience with modern engineering workflows and tooling such as Git, JIRA, and CI/CD systems (e.g., CircleCI).
- Comfort deploying and troubleshooting distributed workloads in cloud environments such as AWS EMR or Kubernetes.
- Experience with workflow orchestration or job scheduling tools (e.g., Airflow, Argo).
- Demonstrated ability to independently drive complex, cross-team technical initiatives and influence stakeholders without formal authority.
- Experience with streaming/messaging technologies (e.g., Kafka, Kinesis) is a nice-to-have.
- Background in RWE, healthcare data, or other complex/regulated data domains is preferred.
- Experience using AI-assisted coding tools (e.g., GitHub Copilot, Claude Code) to accelerate development while maintaining quality is encouraged.

COMPENSATION
This role pays $170,000 to $190,000 per year, based on experience, in addition to stock options.

Anticipated role close date: 01/05/2026