ML Systems Engineer

Genmo, San Francisco, CA, United States
Posted 30+ days ago
Job type
  • Full-time
Job description

We are Genmo, a research lab dedicated to building open, state-of-the-art models for video generation towards unlocking the right brain of AGI. Join us in shaping the future of AI and pushing the boundaries of what's possible in video generation.

The Role

You'll own our model serving layer, implementing high-performance inference systems that can handle millions of requests daily. You'll work at the intersection of ML frameworks and cloud infrastructure, building automated pipelines for model optimization and deployment. Your work will directly impact the performance and scalability of our video generation models, ensuring sub-second latency at global scale.

Key Responsibilities

  • Design and implement high-performance model serving infrastructure supporting streaming, batching, and multi-modal inputs
  • Build automated model compilation and optimization pipelines using TensorRT, torch.compile, and other compilers
  • Optimize serving systems for throughput, latency, and GPU utilization across our H100 fleet
  • Develop monitoring and observability for model-specific metrics (quality, latency, throughput)
  • Collaborate with researchers to transition models from development to production
  • Implement A/B testing, canary deployments, and gradual rollout strategies for models
  • Integrate serving layer with platform infrastructure (load balancers, API gateways, queue systems)

Qualifications

  • Bachelor's or Master's degree in Computer Science or related field
  • 4+ years of ML engineering experience, including 2+ years focused on model serving
  • Production experience with high-performance model serving frameworks (vLLM, SGLang, TensorRT-LLM, or similar)
  • Strong Python proficiency and PyTorch experience
  • Experience with model compilation and optimization (TensorRT, ONNX, quantization)
  • Track record of building inference systems at scale (10K+ QPS)
  • Understanding of attention mechanisms and transformer architectures
  • Experience with containerized deployment and orchestration

We Value

  • Contributions to open-source serving frameworks
  • Experience with continuous batching and advanced serving optimizations
  • Knowledge of GPU architecture and memory management
  • Background at companies with large-scale ML serving
  • Experience with streaming/iterative generation patterns

Genmo is an Equal Opportunity Employer. Candidates are evaluated without regard to age, race, color, religion, sex, disability, national origin, sexual orientation, veteran status, or any other characteristic protected by federal or state law. Genmo, Inc. is an E-Verify company, and you may review the Notice of E-Verify Participation and the Right to Work posters in English and Spanish.
