Software Engineer - GenAI inference

Databricks Inc., San Francisco, CA, United States
2 days ago
Job type
  • Full-time
Job description

As a software engineer for GenAI inference, you will help design, develop, and optimize the inference engine that powers Databricks’ Foundation Model API. You’ll work at the intersection of research and production, ensuring our large language model (LLM) serving systems are fast, scalable, and efficient. Your work will touch the full GenAI inference stack — from kernels and runtimes to orchestration and memory management.

What You Will Do

  • Contribute to the design and implementation of the inference engine, and collaborate on a model-serving stack optimized for large-scale LLM inference
  • Collaborate with researchers to bring new model architectures or features (sparsity, activation compression, mixture-of-experts) into the engine
  • Optimize for latency, throughput, memory efficiency, and hardware utilization across GPUs and accelerators
  • Build and maintain instrumentation, profiling, and tracing tooling to uncover bottlenecks and guide optimizations
  • Develop and enhance scalable routing, batching, scheduling, memory management, and dynamic loading mechanisms for inference workloads
  • Support reliability, reproducibility, and fault tolerance in the inference pipelines, including A/B launches, rollback, and model versioning
  • Integrate with federated, distributed inference infrastructure: orchestrate across nodes, balance load, and handle communication overhead
  • Collaborate cross-functionally with platform engineers, cloud infrastructure, and security/compliance teams
  • Document and share learnings, contributing to internal best practices and open-source efforts when possible

What We Look For

  • BS/MS/PhD in Computer Science or a related field
  • Strong software engineering background (3+ years or equivalent) in performance-critical systems
  • Solid understanding of ML inference internals: attention, MLPs, recurrent modules, quantization, sparse operations, etc.
  • Hands-on experience with CUDA, GPU programming, and key libraries (cuBLAS, cuDNN, NCCL, etc.)
  • Comfortable designing and operating distributed systems, including RPC frameworks, queuing, RPC batching, sharding, and memory partitioning
  • Demonstrated ability to uncover and solve performance bottlenecks across layers (kernel, memory, networking, scheduler)
  • Experience building instrumentation, tracing, and profiling tools for ML models
  • Ability to work closely with ML researchers and translate novel model ideas into production systems
  • Ownership mindset and eagerness to dive deep into complex system challenges
  • Bonus: published research or open-source contributions in ML systems, inference optimization, or model serving

Pay Range Transparency

Local Pay Range

$142,200–$204,600 USD

About Databricks

Databricks is the data and AI company. More than 10,000 organizations worldwide, including Comcast, Condé Nast, Grammarly, and over 50% of the Fortune 500, rely on the Databricks Data Intelligence Platform to unify and democratize data, analytics and AI. Databricks is headquartered in San Francisco, with offices around the globe, and was founded by the original creators of Lakehouse, Apache Spark™, Delta Lake and MLflow. To learn more, follow Databricks on Twitter, LinkedIn and Facebook.

Benefits

At Databricks, we strive to provide comprehensive benefits and perks that meet the needs of all of our employees. For specific details on the benefits offered in your region, please visit https://www.mybenefitsnow.com/databricks.

Our Commitment to Diversity and Inclusion

At Databricks, we are committed to fostering a diverse and inclusive culture where everyone can excel. We take great care to ensure that our hiring practices are inclusive and meet equal employment opportunity standards. Individuals looking for employment at Databricks are considered without regard to age, color, disability, ethnicity, family or marital status, gender identity or expression, language, national origin, physical and mental ability, political affiliation, race, religion, sexual orientation, socio-economic status, veteran status, and other protected characteristics.

Compliance

If access to export-controlled technology or source code is required for performance of job duties, it is within Employer's discretion whether to apply for a U.S. government license for such positions, and Employer may decline to proceed with an applicant on this basis alone.

