Talent.com
LLM Engineer - Transcriptome Analysis Platform

Ayass BioScience, LLC • Frisco, TX, US
1 day ago
Job type
  • Full-time
Job description

We are seeking an innovative LLM Engineer to develop and optimize large language model systems for our cutting-edge transcriptome differentially expressed gene (DEG) analysis platform. This role is critical to building the reasoning foundation that will move biological data analysis from statistical correlation to mechanistic understanding. You will work at the intersection of advanced AI and precision medicine, creating systems that reason about complex biological relationships and generate actionable insights from petabytes of genomic data.

Key Responsibilities

Core LLM Development

  • Design and implement specialized LLM architectures for biological reasoning and causal inference
  • Fine-tune foundation models (GPT-4, Claude, Gemma, etc.) for domain-specific transcriptome analysis tasks
  • Develop custom prompting strategies that enable complex reasoning about gene regulatory networks
  • Create RAG (Retrieval-Augmented Generation) pipelines integrating scientific literature with experimental data
  • Implement chain-of-thought (CoT) and tree-of-thoughts (ToT) prompting for multi-step biological reasoning

Model Optimization & Scaling

  • Optimize LLM inference for production environments handling 20,000+ gene analyses
  • Implement distributed processing using Ray Serve or similar frameworks for sub-second response times
  • Design context compression techniques for handling large-scale genomic datasets
  • Develop model ensembling strategies to reduce output variability from 30% to
  • Create efficient token management strategies for processing lengthy biological contexts

Biological Domain Integration

  • Build knowledge graphs connecting genes, pathways, diseases, and literature findings
  • Implement causal reasoning capabilities for identifying driver vs. passenger gene mutations
  • Develop specialized embeddings for biological entities (genes, proteins, pathways)
  • Create explanation generation systems that produce clinician-friendly interpretations
  • Design validation frameworks ensuring biological accuracy of LLM outputs

Quality & Reliability

  • Implement uncertainty quantification for model predictions
  • Develop robust evaluation metrics beyond traditional NLP measures
  • Create testing frameworks for biological reasoning accuracy
  • Design fallback mechanisms for handling edge cases in genomic data
  • Build monitoring systems for production model performance

Required Qualifications

Technical Expertise

  • MS/PhD in Computer Science, AI, Computational Biology, or related field
  • 3+ years of experience with LLM development and deployment
  • Expert proficiency in Python and ML frameworks (PyTorch, TensorFlow, Hugging Face)
  • Proven experience with prompt engineering and fine-tuning techniques
  • Strong understanding of transformer architectures and attention mechanisms
  • Experience with distributed computing frameworks (Ray, Dask, or similar)

Domain Knowledge

  • Understanding of biological terminology and genomics concepts
  • Experience with scientific text processing and literature mining
  • Familiarity with causal inference and reasoning frameworks
  • Knowledge of medical/clinical NLP applications is a plus

Production Experience

  • Track record of deploying LLM systems at scale
  • Experience with model optimization techniques (quantization, pruning, distillation)
  • Knowledge of MLOps practices and model versioning
  • Experience with API design for AI services

Preferred Qualifications

  • Experience with biomedical language models (BioBERT, PubMedBERT, BioGPT)
  • Knowledge of transcriptomics and differential expression analysis
  • Familiarity with clinical regulatory requirements (FDA/EMA)
  • Publications in NLP, computational biology, or related fields
  • Experience with multi-modal AI systems
  • Understanding of graph neural networks for biological applications

Key Performance Metrics

  • Achieve
  • Reduce LLM output variability to
  • Improve biological reasoning accuracy to >90% on benchmark datasets
  • Successfully integrate 1M+ scientific papers into knowledge base
  • Deploy production systems handling 10,000+ analyses per day

What We Offer

  • Opportunity to work on transformative AI technology with direct patient impact
  • Collaboration with leading scientists and AI researchers
  • Access to state-of-the-art computational resources and datasets
  • Comprehensive benefits and equity participation
  • Professional development and conference attendance support
  • Remote-first culture with flexible working arrangements

Integration with Team

You will work closely with:

  • Agentic AI Engineers to enable autonomous biological discovery systems
  • Software Engineers to build scalable, production-ready platforms
  • Bioinformaticians to ensure biological accuracy and relevance
  • Clinical researchers to translate findings into therapeutic insights