Talent.com
Software Engineer - ML/LLM Inference
Alldus • San Francisco, CA, United States
30+ days ago
Job type
  • Full-time
Job description

My client is searching for a talented engineer to work on ML / LLM inference and serving. They specialize in developing next-gen LLM fine-tuning and inference engines.

We are seeking a talented and motivated Software Engineer specializing in Machine Learning (ML) and Large Language Model (LLM) inference to join our dynamic ML Inference team. In this role, you will bridge the gap between AI / ML research and systems programming to build and enhance our next-generation LLM Inference Engine. You will play a crucial role in optimizing the performance, scalability, and efficiency of our LLM serving systems.

Key Responsibilities:

Develop and Enhance Inference Engine:

  • Design, implement, and optimize the next-generation LLM Inference Engine.
  • Integrate the latest LLM inference techniques from research to reduce latency and increase throughput.

Performance Optimization:

  • Conduct deep performance optimizations across multiple layers of the technology stack, including PyTorch, C++, and CUDA.
  • Analyze and improve system performance to meet the demands of various use cases.
  • Work closely with customers to understand specific performance requirements and optimize solutions accordingly.
  • Provide technical expertise and support to ensure successful deployment and operation of inference systems.

Technical Leadership:

  • Define the roadmap and technical vision for the inference stack.
  • Lead initiatives to drive innovation and maintain the competitive edge of our inference technologies.

Infrastructure Development:

  • Collaborate with partner teams to build and maintain scalable, multi-replica serving infrastructure.
  • Ensure the reliability and scalability of LLM serving systems to handle increasing workloads.

Qualifications:

Technical Skills:

  • Proficiency in systems programming languages such as C++.
  • Strong experience with machine learning frameworks, particularly PyTorch.
  • Expertise in GPU programming and CUDA for performance optimization.
  • Solid understanding of AI / ML concepts, especially related to large language models.

Experience:

  • Proven experience in developing and optimizing ML / LLM inference systems.
  • Demonstrated ability to integrate research advancements into production systems.
  • Experience with performance tuning and profiling across various technology stacks.
  • Experience with vLLM.

Seniority level: Mid-Senior level
Employment type: Full-time
Industries: Staffing and Recruiting and Software Development
