LLM Engineer - Transcriptome Analysis Platform

Ayass BioScience, LLC • Frisco, TX, US

1 day ago

Job type

Full-time

Job Description

We are seeking an innovative LLM Engineer to develop and optimize large language model systems for our transcriptome differentially expressed gene (DEG) analysis platform. This role is critical to building the reasoning foundation that will move biological data analysis from statistical correlation to mechanistic understanding. You will work at the intersection of advanced AI and precision medicine, creating systems that reason about complex biological relationships and generate actionable insights from petabytes of genomic data.

Key Responsibilities

Core LLM Development

Design and implement specialized LLM architectures for biological reasoning and causal inference
Fine-tune foundation models (GPT-4, Claude, Gemma, etc.) for domain-specific transcriptome analysis tasks
Develop custom prompting strategies that enable complex reasoning about gene regulatory networks
Create RAG (Retrieval-Augmented Generation) pipelines integrating scientific literature with experimental data
Implement chain-of-thought (CoT) and tree-of-thoughts (ToT) prompting for multi-step biological reasoning

Model Optimization & Scaling

Optimize LLM inference for production environments handling 20,000+ gene analyses

Implement distributed processing using Ray Serve or similar frameworks for sub-second response times

Design context compression techniques for handling large-scale genomic datasets

Develop model ensembling strategies to reduce output variability from 30% to

Create efficient token management strategies for processing lengthy biological contexts

Biological Domain Integration

Build knowledge graphs connecting genes, pathways, diseases, and literature findings

Implement causal reasoning capabilities for identifying driver vs. passenger gene mutations

Develop specialized embeddings for biological entities (genes, proteins, pathways)

Create explanation generation systems that produce clinician-friendly interpretations

Design validation frameworks ensuring biological accuracy of LLM outputs

Quality & Reliability

Implement uncertainty quantification for model predictions

Develop robust evaluation metrics beyond traditional NLP measures

Create testing frameworks for biological reasoning accuracy

Design fallback mechanisms for handling edge cases in genomic data

Build monitoring systems for production model performance

Required Qualifications

Technical Expertise

MS / PhD in Computer Science, AI, Computational Biology, or related field

3+ years of experience with LLM development and deployment

Expert proficiency in Python and ML frameworks (PyTorch, TensorFlow, Hugging Face)

Proven experience with prompt engineering and fine-tuning techniques

Strong understanding of transformer architectures and attention mechanisms

Experience with distributed computing frameworks (Ray, Dask, or similar)

Domain Knowledge

Understanding of biological terminology and genomics concepts

Experience with scientific text processing and literature mining

Familiarity with causal inference and reasoning frameworks

Knowledge of medical / clinical NLP applications is a plus

Production Experience

Track record of deploying LLM systems at scale

Experience with model optimization techniques (quantization, pruning, distillation)

Knowledge of MLOps practices and model versioning

Experience with API design for AI services

Preferred Qualifications

Experience with biomedical language models (BioBERT, PubMedBERT, BioGPT)

Knowledge of transcriptomics and differential expression analysis

Familiarity with clinical regulatory requirements (FDA / EMA)

Publications in NLP, computational biology, or related fields

Experience with multi-modal AI systems

Understanding of graph neural networks for biological applications

Key Performance Metrics

Achieve

Reduce LLM output variability to

Improve biological reasoning accuracy to >90% on benchmark datasets

Successfully integrate 1M+ scientific papers into knowledge base

Deploy production systems handling 10,000+ analyses per day

What We Offer

Opportunity to work on transformative AI technology with direct patient impact

Collaboration with leading scientists and AI researchers

Access to state-of-the-art computational resources and datasets

Comprehensive benefits and equity participation

Professional development and conference attendance support

Remote-first culture with flexible working arrangements

Integration with Team

You will work closely with:

Agentic AI Engineers to enable autonomous biological discovery systems

Software Engineers to build scalable, production-ready platforms

Bioinformaticians to ensure biological accuracy and relevance

Clinical researchers to translate findings into therapeutic insights
