Talent.com
Software Engineer - ML / LLM Inference

Alldus • San Francisco, CA, United States
Posted more than 30 days ago
Contract type
  • Full-time
Job description

Principal Recruitment Consultant | AI & Machine Learning | Co-organizer of the AI in Action Podcast

My client is searching for a talented engineer to work on ML / LLM inference and serving. They specialize in developing next-gen LLM fine-tuning and inference engines.

We are seeking a talented and motivated Software Engineer specializing in Machine Learning (ML) and Large Language Model (LLM) inference to join our dynamic ML Inference team. In this role, you will bridge the gap between AI / ML research and systems programming to build and enhance our next-generation LLM Inference Engine. You will play a crucial role in optimizing the performance, scalability, and efficiency of our LLM serving systems.

Key Responsibilities:

Develop and Enhance Inference Engine:

  • Design, implement, and optimize the next-generation LLM Inference Engine.
  • Integrate the latest LLM inference techniques from research to enhance latency and throughput.

Performance Optimization:

  • Conduct deep performance optimizations across multiple layers of the technology stack, including PyTorch, C++, and CUDA.
  • Analyze and improve system performance to meet the demands of various use cases.
  • Work closely with customers to understand specific performance requirements and optimize solutions accordingly.
  • Provide technical expertise and support to ensure successful deployment and operation of inference systems.

Technical Leadership:

  • Define the roadmap and technical vision for the inference stack.
  • Lead initiatives to drive innovation and maintain the competitive edge of our inference technologies.

Infrastructure Development:

  • Collaborate with partner teams to build and maintain scalable, multi-replica serving infrastructure.
  • Ensure the reliability and scalability of LLM serving systems to handle increasing workloads.

Qualifications:

Technical Skills:

  • Proficiency in systems programming languages such as C++.
  • Strong experience with machine learning frameworks, particularly PyTorch.
  • Expertise in GPU programming and CUDA for performance optimization.
  • Solid understanding of AI / ML concepts, especially related to large language models.

Experience:

  • Proven experience in developing and optimizing ML / LLM inference systems.
  • Demonstrated ability to integrate research advancements into production systems.
  • Experience with performance tuning and profiling across various technology stacks.
  • Experience with vLLM.

Seniority level: Mid-Senior level
Employment type: Full-time
Industries: Staffing and Recruiting and Software Development

Related jobs
  • Sr. Software Engineer, ML (promoted listing)
    Relyance AI • San Francisco, CA, United States
    Full-time
    NLP for information extraction from legal documents, ML / NLP for information extraction from code and general ML in code analysis, and overall AI backend initiatives. You will partner with cross-func...
    Last updated: 4 days ago
  • Software Engineer, AI / ML (promoted listing)
    Glu Mobile Inc. • San Francisco, CA, United States
    Full-time
    Glue is a well-funded startup working on the next generation of work communication tools. We believe that today’s work chat is noisy, unstructured, and not designed for productivity. We’re drawing fr...
    Last updated: more than 30 days ago
  • Applied ML / LLM Engineer (promoted listing)
    Pincites • San Francisco, CA, United States
    Full-time
    We’re looking for a sharp, ambitious. AI-native products; someone who knows how to turn messy real-world data into performant models, fine-tune and deploy LLMs, and design feedback loops that make ...
    Last updated: 4 days ago
  • Senior Software Engineer - AI / ML Infra (promoted listing)
    GEICO • Palo Alto, CA, United States
    Full-time
    At GEICO, we offer a rewarding career where your ambitions are met with endless possibilities. Every day we honor our iconic brand by offering quality coverage to millions of customers and being the...
    Last updated: 23 hours ago
  • AIML - Sr. Software Engineer, ML Platform Technologies (MLPT) (promoted listing)
    Apple Inc. • San Francisco, CA, United States
    Full-time
    Software Engineer, ML Platform Technologies (MLPT). San Francisco Bay Area, California, United States. Machine Learning and AI. Want to build the platform that enables the next generation of intellige...
    Last updated: 4 days ago
  • Senior Software Engineer - ML / LLM Serving (promoted listing)
    Alldus • San Jose, CA, United States
    Full-time
    Senior Software Engineer - ML / LLM Serving. This range is provided by Alldus. Your actual pay will be based on your skills and experience; talk with your recruiter to learn more. Direct message the jo...
    Last updated: more than 30 days ago
  • Software Engineer - ML Performance (promoted listing)
    Baseten • San Ramon, California, United States
    Full-time
    We’re a growing team of builders backed by top-tier investors, including. ML teams at enterprises and category-defining AI-native companies like. Baseten to power their core production workloads with...
    Last updated: more than 30 days ago
  • Distributed ML Systems Engineer - Inference (promoted listing)
    Together AI • San Francisco, CA, United States
    Full-time
    Together AI is seeking a Distributed ML Systems Engineer to design and build scalable machine learning systems that power our accelerated AI initiatives. This role involves developing large-scale, f...
    Last updated: more than 30 days ago
  • Remote Senior Software Engineer (LLM) (promoted listing, new)
    Turing • San Francisco, CA, United States
    Remote
    Full-time
    Remote Senior Software Engineer (LLM) - 34953. Turing is one of the world's fastest-growing AI companies, pushing the bo...
    Last updated: less than 1 hour ago
  • AIML - Senior Software Engineer, Machine Learning Platform Technologies (promoted listing, new)
    San Francisco Staffing • San Francisco, CA, United States
    Full-time
    We are looking for a Senior Software Engineer to shape how we measure the success and reliability of Apple Intelligence software features. This role is at the intersection of feature delivery, telem...
    Last updated: less than 1 hour ago
  • Senior Machine Learning Operations (MLOps) and Infrastructure Engineer (promoted listing)
    A • Sunnyvale, California, United States
    Full-time
    Our Wayfinder team is building scalable, certifiable autonomy systems to power the next generation of commercial aircraft. Our team of experts is driving the maturation of machine learning and other...
    Last updated: more than 30 days ago
  • ML Research Engineer, ML Systems (promoted listing)
    Scale AI, Inc. • San Francisco, CA, United States
    Full-time
    Scale's ML platform (RLXF) team builds our internal distributed framework for large language model training and inference. The platform has been powering MLEs, researchers, data scientists and opera...
    Last updated: more than 30 days ago
  • Applied ML Engineer (LLMs & RAG) (promoted listing)
    Alldus • San Francisco, CA, United States
    Full-time
    This role is a combination of research and engineering. We are looking for someone who's a talented software engineer at their core, but has contributed to AI research, especially in the field of RA...
    Last updated: more than 30 days ago
  • Software Engineer - ML & Platform (promoted listing)
    Rhizome • San Francisco, CA, United States
    Full-time
    A changing climate demands Resilience by Design. We like solving hard problems with creativity, tenacity, and empathy for our customers. At the same time, we believe that being better stewards in our...
    Last updated: 4 days ago
  • Software Engineer - ML Pricing (promoted listing)
    Pendo • San Francisco, CA, United States
    Full-time
    At Opendoor, pricing is at the core of our product: our models directly influence high-stakes decisions around real estate transactions across the country. We are looking for a mid-level. This is a ...
    Last updated: more than 30 days ago
  • Software Engineer - ML & Platform (promoted listing, new)
    Rhizome co • San Francisco, CA, United States
    Full-time
    A changing climate demands Resilience by Design. We like solving hard problems with creativity, tenacity, and empathy for our customers. At the same time, we believe that being better stewards in our...
    Last updated: less than 1 hour ago
  • Software Engineer, ML Infrastructure, Optimization (promoted listing, new)
    Nuro • Mountain View, CA, United States
    Full-time
    Nuro is a self-driving technology company on a mission to make autonomy accessible to all. Founded in 2016, Nuro is building the world's most scalable driver, combining cutting-edge AI with automoti...
    Last updated: 12 hours ago
  • Software Machine Learning Engineer (promoted listing)
    ServicePoint IT • Los Altos, CA, United States
    Full-time
    ServicePoint has a customer seeking a Software Machine Learning Engineer for a 3+ month opportunity located in Los Altos, CA. There is a possibility for extensions or even hire down the road. The Hou...
    Last updated: 4 days ago