Position: Biostatistics / Data Science Lead (Can be remote)
Overview:
Our client is seeking a hands-on Biostatistics / Data Science leader to drive the design and implementation of scalable, compliant, and efficient statistical workflows supporting Gilead’s drug development programs. This role combines technical excellence in R and data engineering with leadership in cross-functional collaboration, reproducibility, and innovation within regulated environments.
Key Responsibilities:
- Lead the development of R-based statistical tools, pipelines, and reproducible workflows to support clinical trial analysis and reporting.
- Collaborate with statisticians to translate complex methodologies into robust, validated R code for simulations, adaptive designs, and advanced analyses.
- Oversee R package development, GitHub workflows, and CI/CD pipelines, ensuring code integrity, documentation, and compliance.
- Define solution architectures integrating open-source tools and cloud-based infrastructures for large-scale, compliant data management.
- Partner with IT, data engineering, and business teams to streamline processes, enhance automation, and maintain data governance standards.
- Guide standardization and best practices for code development, validation, and regulatory readiness.
- Provide technical leadership and mentorship, managing timelines, deliverables, and quality expectations.
Qualifications:
- PhD in Statistics, Biostatistics, Computer Science, or related field with 5+ years of experience (MS with 10+, BS with 12+).
- Deep expertise in R programming, R package development, and statistical implementation in clinical trials.
- Proven experience with GitHub version control, CI/CD, and collaborative coding workflows.
- Strong background in data engineering, automation, and reproducibility in a regulated (pharma / life sciences) environment.
- Familiarity with Pharmaverse packages and open-source software development practices.
- Excellent problem-solving, project management, and communication skills.
Preferred:
- Experience with AWS / cloud data pipelines, distributed computing, and additional programming languages (Python, SQL).
- Demonstrated ability to bridge between statisticians, data scientists, and engineers.