At Lilly, we unite caring with discovery to make life better for people around the world. We are a global healthcare leader headquartered in Indianapolis, Indiana. Our employees around the world work to discover and bring life-changing medicines to those who need them, improve the understanding and management of disease, and give back to our communities through philanthropy and volunteerism. We give our best effort to our work, and we put people first. We’re looking for people who are determined to make life better for people around the world.

Position Summary

The Human Genomics and Translational Data Sciences team within Cardiometabolic Research Data Science is hiring a Bioinformatics Pipeline Engineer to help build, solidify, and scale the analytical pipelines our scientists rely on every day. Our work spans multiple omics workflows, including target discovery and target due diligence, single cell sequencing, genomics, proteomics and, increasingly, AI-assisted workflows that pull these analyses together into faster, more reproducible products for therapeutic area partners across Lilly Research Labs.

This role sits at the intersection of two worlds. On one side are classical bioinformatics and statistical genetics pipelines: the kind of robust, reproducible, well-tested workflows that turn messy public and proprietary genomics data into trustworthy answers. On the other is the rapidly evolving stack of AI tooling: large language models like Claude, agentic workflows, AI-friendly connectors built on MCP (Model Context Protocol), and the code that lets scientists query complex datasets in natural language. We want someone who is genuinely curious about both and keen to use both to increase the value we derive from our datasets, enabling target support and novel target discovery.

You will not be expected to be a senior expert in either domain on day one. You will be expected to bring strong software engineering instincts, and a keen curiosity and creativity to enhance the value of the tools and datasets at our disposal. You will work closely with statistical geneticists, computational biologists, and other engineers — both within our team and across Lilly — to ship tools that make the science faster and more reliable.

Key Responsibilities

Pipeline Development and Engineering

Support computational biology workflows, including single cell, spatial, and other multi-omics analysis workflows for clinical and preclinical applications
Use modern workflow managers (e.g. Nextflow, Snakemake, or similar) and containerization (Docker, Singularity) to make pipelines portable, testable, and reusable across projects and teams
Help build and maintain reproducible analytical pipelines for statistical genetics and bioinformatics workflows
Wrap and harden ad-hoc analytical scripts written by scientists into production-quality tools that can be re-run reliably by others
Write tests, documentation, and clear examples so the pipelines you build are usable by colleagues with a range of technical backgrounds

AI-Enabled Tooling and Workflows

Prototype agentic workflows that automate established and routine analytical tasks — for example, pulling target evidence across data sources, generating standardized due-diligence reports, or letting scientists interrogate complex datasets in natural language
Build and maintain MCP connectors that expose internal data, public resources, and analytical pipelines to LLM-based agents and tools like Claude
Identify and develop use cases where LLMs and agentic AI workflows can improve the speed, quality, consistency, or accessibility of work across therapeutic areas, focusing on end-to-end capabilities rather than isolated task completion
Contribute to a shared library of reusable AI tooling, prompt patterns, and integration code that the team can build on
Define technical standards for evaluation, documentation, guardrails, and workflow quality so that AI-based solutions are trusted, reproducible, and suitable for repeated use across teams and projects
Stay current with the AI tooling landscape and bring back ideas the team can put to work
Help improve AI fluency among collaborators by demonstrating practical workflows

Collaboration Across Lilly Research Labs

Partner closely with statistical geneticists, computational biologists, and software engineers within the Cardiometabolic Data Science group and across other Lilly Research Labs teams
Work with therapeutic area partners to understand their analytical needs and translate them into pipeline requirements
Coordinate with platform and engineering groups to ensure your pipelines integrate cleanly with broader Lilly infrastructure
Contribute to internal knowledge sharing — code reviews, demos, documentation, and helping colleagues get unblocked

Basic Requirements

B.S. in computer science, computational biology, bioinformatics, biological sciences, statistics, or a related field, with 10+ years of relevant work experience
OR M.S. in computer science, computational biology, bioinformatics, biological sciences, statistics, or a related field, with 7+ years of relevant work experience
OR Ph.D. in computer science, computational biology, bioinformatics, biological sciences, statistics, or a related field, with 1+ years of relevant work experience

Additional Skills/Preferences

Strong programming skills in Python and/or R, including comfort with version control (Git), code review, testing, and writing maintainable code
Demonstrated experience building data analysis pipelines, ideally using a workflow manager such as Nextflow, Snakemake, or WDL
Working familiarity with bioinformatics file formats (VCF, BED, GTF, BAM, etc.) and standard tools (PLINK, samtools, bcftools, or similar)
Familiarity with typical data types in high-throughput biology, including NGS data
Hands-on experience or strong demonstrated interest in modern AI tooling — using LLMs through APIs, building MCP servers/connectors, prompt engineering, or wiring up agentic workflows
Demonstrated ability to build stable, practical, reusable workflows rather than code for one-off analyses, with strong implementation skills in Python and modern AI/ML tooling
A collaborative, low-ego mentality; you enjoy building tools that other people use and you take feedback well
Comfort with cloud computing environments (AWS, GCP, or Azure) and Linux/command-line work
Ability to work successfully in a matrixed environment
Prior experience with statistical workflows/biomedical statistics
Prior exposure to statistical genetics methods (GWAS, fine-mapping, MR, colocalization, burden testing) or large-scale genomic datasets (UK Biobank, gnomAD, GTEx, Open Targets)
Prior experience with complex high-throughput biological data or experiments such as spatial transcriptomics, large-scale screens, or multi-omics studies
Familiarity with R in addition to Python, particularly for statistical genetics packages
Experience with relational and/or graph databases, and with biomedical ontologies
Contributions to open-source projects or a public portfolio (GitHub, blog posts, demos)
Prior experience in pharma, biotech, or academic genomics research

Resources Managed

This is an individual contributor role with no direct reports. The Applied Bioinformatics Engineer, Pipelines & AI will work closely with scientists, engineers, and external partners across Lilly Research Labs.

Lilly is dedicated to helping individuals with disabilities to actively engage in the workforce, ensuring equal opportunities when vying for positions. If you require accommodation to submit a resume for a position at Lilly, please complete the accommodation request form (https://careers.lilly.com/us/en/workplace-accommodation) for further assistance. Please note this is for individuals to request an accommodation as part of the application process and any other correspondence will not receive a response.

Lilly is proud to be an EEO Employer and does not discriminate on the basis of age, race, color, religion, gender identity, sex, gender expression, sexual orientation, genetic information, ancestry, national origin, protected veteran status, disability, or any other legally protected status.

Our employee resource groups (ERGs) offer strong support networks for their members and are open to all employees. Our current groups include: Africa, Middle East, Central Asia Network; Black Employees at Lilly; Chinese Culture Network; Japanese International Leadership Network (JILN); Lilly India Network; Organization of Latinx at Lilly (OLA); PRIDE (LGBTQ+ Allies); Veterans Leadership Network (VLN); Women’s Initiative for Leading at Lilly (WILL); enAble (for people with disabilities).

Actual compensation will depend on a candidate’s education, experience, skills, and geographic location. The anticipated wage for this position is $166,500 - $266,200.

Full-time equivalent employees also will be eligible for a company bonus (depending, in part, on company and individual performance). In addition, Lilly offers a comprehensive benefit program to eligible employees, including eligibility to participate in a company-sponsored 401(k); pension; vacation benefits; eligibility for medical, dental, vision and prescription drug benefits; flexible benefits (e.g., healthcare and/or dependent day care flexible spending accounts); life insurance and death benefits; certain time off and leave of absence benefits; and well-being benefits (e.g., employee assistance program, fitness benefits, and employee clubs and activities). Lilly reserves the right to amend, modify, or terminate its compensation and benefit programs in its sole discretion and Lilly’s compensation practices and guidelines will apply regarding the details of any promotion or transfer of Lilly employees.

#WeAreLilly
