Machine Learning Scientist / Sr Scientist, Federated Benchmarking & Validation Engineering

Eli Lilly, South San Francisco, CA, US

10 days ago

Job type

Full-time

Job description

At Lilly, we unite caring with discovery to make life better for people around the world. We are a global healthcare leader headquartered in Indianapolis, Indiana. Our employees around the world work to discover and bring life-changing medicines to those who need them, improve the understanding and management of disease, and give back to our communities through philanthropy and volunteerism. We give our best effort to our work, and we put people first. We’re looking for people who are determined to make life better for people around the world.

Purpose

Lilly TuneLab is an AI-powered drug discovery platform that provides biotech companies with access to machine learning models trained on Lilly's extensive proprietary pharmaceutical research data. Through federated learning, the platform enables Lilly to build models on broad, diverse datasets from across the biotech ecosystem while preserving partner data privacy and competitive advantages. This collaborative approach accelerates drug discovery by creating continuously improving AI models that benefit both Lilly and our biotech partners.

The Machine Learning Scientist / Sr Scientist, Federated Benchmarking & Validation Engineering plays an essential role within the TuneLab platform, responsible for identifying, assessing, and implementing cutting-edge algorithmic solutions that leverage diverse datasets while ensuring data privacy and security for our biotech partners. This position requires comprehensive knowledge in small molecule drug development, ADME/Tox, antibody engineering, and/or genetic medicine, combined with expertise in data science and statistical analysis to develop sophisticated models utilizing federated learning. This position will be instrumental in advancing both Lilly's pipeline and our partners' drug discovery efforts by designing critical algorithms and workflows that expedite the creation of transformative therapies.

This role centers on constructing robust validation frameworks for federated models, creating privacy-preserving test sets across partner datasets, establishing standardized benchmarks against public datasets, and ensuring model reproducibility and generalization in diverse deployment scenarios.

Key Responsibilities

Federated Test Set Design: Architect and implement privacy-preserving protocols for constructing representative test sets across distributed partner datasets, ensuring statistical validity while maintaining data isolation.
Benchmark Suite Development: Create comprehensive benchmark suites covering small molecules (ADMET, solubility, permeability), antibodies (affinity, stability, immunogenicity), and RNA therapeutics (stability, delivery, off-target effects).
Cross-Domain Validation: Develop validation strategies that assess model generalization across different experimental protocols, cell lines, species, and therapeutic indications while respecting partner data boundaries.
Public Dataset Integration: Systematically benchmark federated models against public datasets (ChEMBL, PubChem, PDB, Therapeutic Antibody Database) to establish performance baselines and identify gaps.
Validation Frameworks: Implement time-split or proper scaffold-split validation protocols that assess model performance on prospective data, simulating real-world deployment scenarios and detecting concept drift.
Reproducibility Infrastructure: Build robust MLOps pipelines ensuring complete reproducibility of federated experiments, including versioning of data snapshots, model checkpoints, and hyperparameter configurations.
Statistical Rigor: Design statistically powered validation studies accounting for multiple testing, hierarchical data structures, and non-independent observations common in drug discovery datasets.
Performance Profiling: Develop comprehensive performance profiling across diverse molecular scaffolds, target classes, and property ranges, identifying systematic biases and failure modes.
Platform Integration: Collaborate with engineering teams to integrate validation frameworks with the TuneLab federated learning platform built on NVIDIA FLARE, ensuring scalable and automated testing across partner networks.

Basic Qualifications

PhD in Computational Biology, Bioinformatics, Cheminformatics, Computer Science, Statistics, or related field from an accredited college or university

Minimum of 2 years of experience in the biopharmaceutical industry or related fields, with demonstrated expertise in drug discovery and early development

Strong foundation in experimental design, statistical validation, and hypothesis testing

Experience with ML model validation, cross-validation strategies, and performance metrics

Proficiency in data engineering, pipeline development, and automation

Additional Preferences

Experience with federated learning platforms and distributed computing

Knowledge of regulatory requirements for AI/ML in pharmaceutical development

Expertise in ADMET assay development and validation

Understanding of antibody engineering and characterization methods

Familiarity with RNA therapeutic design and delivery systems

Experience with clinical biomarker validation and translational research

Proficiency in workflow orchestration tools (Airflow, Kubeflow, Prefect)

Strong knowledge of containerization and cloud computing (Docker, Kubernetes)

Publications on model validation, benchmarking, or reproducibility

Experience with GxP compliance and quality management systems

Exceptional attention to detail and commitment to scientific rigor

Strong technical writing skills for regulatory documentation

Portfolio mindset balancing rigorous validation with rapid deployment for partner value

This role is based at a Lilly site in Indianapolis, South San Francisco, or Boston with up to 10% travel (attendance expected at key industry conferences). Relocation is provided.

Lilly is dedicated to helping individuals with disabilities to actively engage in the workforce, ensuring equal opportunities when vying for positions. If you require accommodation to submit a resume for a position at Lilly, please complete the accommodation request form for further assistance. Please note this is for individuals to request an accommodation as part of the application process and any other correspondence will not receive a response.

Lilly is proud to be an EEO Employer and does not discriminate on the basis of age, race, color, religion, gender identity, sex, gender expression, sexual orientation, genetic information, ancestry, national origin, protected veteran status, disability, or any other legally protected status.

Our employee resource groups (ERGs) offer strong support networks for their members and are open to all employees. Our current groups include: Africa, Middle East, Central Asia Network, Black Employees at Lilly, Chinese Culture Network, Japanese International Leadership Network (JILN), Lilly India Network, Organization of Latinx at Lilly (OLA), PRIDE (LGBTQ+ Allies), Veterans Leadership Network (VLN), Women’s Initiative for Leading at Lilly (WILL), enAble (for people with disabilities). Learn more about all of our groups.

Actual compensation will depend on a candidate’s education, experience, skills, and geographic location. The anticipated wage for this position is $151,500 - $244,200.

Full-time equivalent employees also will be eligible for a company bonus (depending, in part, on company and individual performance). In addition, Lilly offers a comprehensive benefit program to eligible employees, including eligibility to participate in a company-sponsored 401(k); pension; vacation benefits; eligibility for medical, dental, vision and prescription drug benefits; flexible benefits (e.g., healthcare and/or dependent day care flexible spending accounts); life insurance and death benefits; certain time off and leave of absence benefits; and well-being benefits (e.g., employee assistance program, fitness benefits, and employee clubs and activities). Lilly reserves the right to amend, modify, or terminate its compensation and benefit programs in its sole discretion and Lilly’s compensation practices and guidelines will apply regarding the details of any promotion or transfer of Lilly employees.

#WeAreLilly
