Lilly

Data Analyst / Data Engineer

Hyderabad, India | Full time

At Lilly, we unite caring with discovery to make life better for people around the world. We are a global healthcare leader headquartered in Indianapolis, Indiana. Our employees around the world work to discover and bring life-changing medicines to those who need them, improve the understanding and management of disease, and give back to our communities through philanthropy and volunteerism. We give our best effort to our work, and we put people first. We’re looking for people who are determined to make life better for people around the world.

Position Summary

We are looking for a Data Analyst / Data Engineer to join our Manufacturing Quality Analytics team supporting External Manufacturing operations in the pharmaceutical industry.

In this role, you will design and manage data pipelines, analytics datasets, and reporting solutions that support quality oversight across global manufacturing partners. You will work with manufacturing quality data from systems such as MES, LIMS, QMS (Veeva Vault), ERP (SAP), and process historians, transforming raw data into analytics-ready datasets within the Azure Databricks Lakehouse platform.

The ideal candidate will combine data engineering capabilities with strong analytical skills to help quality teams, manufacturing scientists, and leadership gain actionable insights from complex manufacturing data.

Key Responsibilities

Data Engineering & Pipeline Development

  • Design, build, and maintain ETL/ELT data pipelines using Azure Databricks (PySpark, Spark SQL, Delta Live Tables).
  • Ingest and transform data from MES, LIMS, QMS (Veeva Vault), ERP (SAP), and manufacturing process historians.
  • Implement a medallion architecture on Delta Lake (Bronze → Silver → Gold layers) to organize and optimize manufacturing quality datasets.
  • Develop data transformation logic, schema management, and partitioning strategies to support large-scale manufacturing datasets.
  • Implement data validation and quality monitoring checks to ensure reliable and compliant datasets.
  • Build and maintain data workflows using Azure Data Factory (ADF) or Databricks Workflows for scheduling and orchestration.
  • Support ingestion of structured and semi-structured data including SQL databases, APIs, JSON, XML, and flat files.
  • Implement CI/CD pipelines for data workflows.
  • Use Git for version control of pipeline code and transformation logic.

Data Analysis & Reporting

  • Perform exploratory data analysis (EDA) and statistical analysis on manufacturing quality data including:
    • Batch manufacturing records
    • Deviation trends
    • CAPA performance metrics
    • Certificate of Analysis (CoA) data
    • Stability study results
  • Develop interactive dashboards and reports using Power BI or Databricks SQL.
  • Translate business questions from quality engineers, TSMS teams, and manufacturing stakeholders into data-driven insights.
  • Perform trend analysis and root cause analysis to identify quality risks and improvement opportunities.
  • Support supplier performance monitoring, quality scorecards, and periodic business reviews.
  • Provide traceable data extracts and analytics outputs for regulatory reporting and compliance activities.

Data Governance & Compliance (GxP)

  • Maintain data lineage, metadata, and data dictionaries using tools such as Microsoft Purview and Unity Catalog.
  • Ensure analytics solutions comply with GxP requirements including FDA 21 CFR Part 11 and ALCOA+ data integrity principles.
  • Participate in Computer System Validation (CSV) activities including documentation for new data pipelines and analytics tools.
  • Ensure audit readiness by maintaining clear documentation, transformation traceability, and reproducible data logic.

Collaboration & Agile Delivery

  • Collaborate with Quality Engineers, Manufacturing Scientists, Data Architects, and IT teams to deliver analytics solutions.
  • Align manufacturing data solutions with enterprise data architecture and Azure Lakehouse strategy.
  • Continuously improve data quality, pipeline performance, and analytics usability.
  • Participate in Agile delivery processes, including sprint planning, stand-ups, backlog grooming, and retrospectives.
  • Manage and track development tasks in JIRA.

Required Qualifications

  • Bachelor’s or Master’s degree in Data Science, Computer Science, Engineering, Statistics, or a related field.
  • 3–5 years of experience in Data Engineering, Data Analytics, or Data Platform development.
  • Hands-on experience with Azure Databricks, PySpark, Spark SQL, and Delta Lake.
  • Experience building data pipelines and data models for analytics platforms.
  • Strong experience with SQL and large-scale data transformation.
  • Experience building dashboards in Power BI or an equivalent BI tool.
  • Experience working with cloud platforms (Azure preferred).
  • Understanding of data governance and data quality principles.

Preferred Qualifications (Highly Valued in Pharma)

  • Experience working in pharmaceutical, biotech, or life sciences environments.
  • Familiarity with manufacturing and quality systems such as MES, LIMS, Veeva Vault QMS, and SAP.
  • Knowledge of GxP, GMP data governance, ALCOA+, and FDA 21 CFR Part 11.
  • Experience supporting manufacturing analytics, quality analytics, or regulatory reporting.
  • Experience with Azure Data Factory, Databricks Workflows, and data orchestration tools.
  • Exposure to process historian data (for example, OSI PI).

Key Skills

  • Azure Databricks
  • PySpark / Spark SQL
  • Delta Lake Architecture
  • Data Pipeline Development
  • Power BI / Data Visualization
  • Manufacturing & Quality Data Analytics
  • Data Governance & Compliance (GxP)
  • SQL & Data Modeling
  • Agile Delivery & JIRA

Lilly is dedicated to helping individuals with disabilities to actively engage in the workforce, ensuring equal opportunities when vying for positions. If you require accommodation to submit a resume for a position at Lilly, please complete the accommodation request form (https://careers.lilly.com/us/en/workplace-accommodation) for further assistance. Please note this is for individuals to request an accommodation as part of the application process and any other correspondence will not receive a response.

Lilly does not discriminate on the basis of age, race, color, religion, gender, sexual orientation, gender identity, gender expression, national origin, protected veteran status, disability or any other legally protected status.

#WeAreLilly