A healthier future. It’s what drives us to innovate. To continuously advance science and ensure everyone has access to the healthcare they need today and for generations to come. Creating a world where we all have more time with the people we love. That’s what makes us Roche.
Advances in AI, data, and computational sciences are transforming drug discovery and development. Roche’s Research and Early Development organisations at Genentech (gRED) and Pharma (pRED) have demonstrated how these technologies accelerate R&D, leveraging data and novel computational models to drive impact. Seamless data sharing and access to models across gRED and pRED are essential to maximising these opportunities. The new Computational Sciences Center of Excellence (CoE) is a strategic, unified group whose goal is to harness this transformative power of data and Artificial Intelligence (AI) to assist our scientists in both pRED and gRED to deliver more innovative and transformative medicines for patients worldwide.
The Opportunity
Within the CoE organisation, the Data and Digital Catalyst (DDC) organisation drives the modernisation of our computational and data ecosystems and integration of digital technologies across Research and Early Development to enable our stakeholders, power data-driven science and accelerate decision-making. As a Senior Machine Learning Scientist for the AI team within the Engineering - Lab Automation capability, you will be a key scientific leader, responsible for developing, deploying, and operationalizing the predictive models and optimization frameworks that power intelligent, closed-loop experimentation in production. You will solve complex technical challenges in multimodal data modeling and autonomous decision-making, ensuring our systems reliably translate the latest advances in AI into executable actions that power research lab automation. Your work will be vital in shaping our closed-loop strategy and enabling our scientists with robust, self-optimizing discovery engines to accelerate drug discovery.
In this role, you will:
Develop, deploy, and validate machine learning (ML) models using diverse, multimodal data (analytical data, images, and metadata) to accurately predict experimental outcomes in production.
Apply advanced statistical experimental design and optimization techniques to build and deploy the core algorithms for autonomous decision-making systems in automated workflows.
Develop and integrate ML models for intelligent quality control and experimental error modeling to provide real-time automated detection of anomalies, system drifts, or physical errors in lab workflows.
Drive the roadmap and technical implementation for using large language models (LLMs) to translate high-level scientific intent and assay protocols into validated, machine-executable automation programs.
Partner with scientific experts and engineers to lead exploratory data analysis (EDA) and data QC efforts, defining the data requirements and modeling approaches necessary for reliable deployment.
Act as an internal subject matter expert on cutting-edge AI/ML trends and ensure all models are production-ready, robustly documented, version-controlled, and meet high standards of scientific rigor.
Who you are
MS/BS in Computer Science, Statistics, or related field with 5+ years of industry experience in AI/ML, or a PhD with 2+ years of experience focused on deploying ML solutions in a production or translational research setting.
Proven record of impact as evidenced by publications, patents, or significant technical contributions to large-scale scientific or automation projects.
Hands-on experience developing and operationalizing complex ML models for unstructured and multimodal data (e.g., deep learning for imaging, time-series analysis for instrument data, and metadata fusion).
Expert proficiency in the Python data science stack and modern ML frameworks (e.g., PyTorch).
Passion for translating cutting-edge AI/ML research into high-impact, real-world scientific applications that accelerate therapeutic discovery.
A dynamic individual who excels in a rapidly evolving environment, taking initiative on projects and continuously learning and adapting to new challenges and opportunities.
Preferred
Practical experience in computational design of experiments (DoE) and optimization methods (e.g., Bayesian optimization, reinforcement learning) for closed-loop systems.
Practical experience with the application of large language models (LLMs) for scientific text abstraction.
Public portfolio of projects available on GitHub/GitLab demonstrating production or research capabilities.
Strong understanding of lab workflows and the drug discovery pipeline, ideally with foundational training in biology or chemistry.
Relocation benefits are NOT available for this job posting.
The expected salary range for this position, based on the location of California, is $147,500 - $273,900. Actual pay will be determined based on experience, qualifications, geographic location, and other job-related factors permitted by law. A discretionary annual bonus may be available based on individual and Company performance. This position also qualifies for the benefits detailed at the link provided below.
#ComputationCoE
#tech4lifeComputationalScience
Genentech is an equal opportunity employer. It is our policy and practice to employ, promote, and otherwise treat any and all employees and applicants on the basis of merit, qualifications, and competence. The company's policy prohibits unlawful discrimination, including but not limited to discrimination on the basis of protected veteran status or disability status, consistent with all federal, state, and local laws.
If you have a disability and need an accommodation in relation to the online application process, please contact us by completing the Accommodations for Applicants form.