ML Systems Engineer

Genmo, San Francisco, CA, United States

30+ days ago

Job type

Full-time

Job description

We are Genmo, a research lab dedicated to building open, state-of-the-art models for video generation, working toward unlocking the right brain of AGI. Join us in shaping the future of AI and pushing the boundaries of what's possible in video generation.

The Role

You'll own our model serving layer, implementing high-performance inference systems that can handle millions of requests daily. You'll work at the intersection of ML frameworks and cloud infrastructure, building automated pipelines for model optimization and deployment. Your work will directly impact the performance and scalability of our video generation models, ensuring sub-second latency at global scale.

Key Responsibilities

Design and implement high-performance model serving infrastructure supporting streaming, batching, and multi-modal inputs
Build automated model compilation and optimization pipelines using TensorRT, torch.compile, and other compilers
Optimize serving systems for throughput, latency, and GPU utilization across our H100 fleet
Develop monitoring and observability for model-specific metrics (quality, latency, throughput)
Collaborate with researchers to transition models from development to production
Implement A/B testing, canary deployments, and gradual rollout strategies for models
Integrate serving layer with platform infrastructure (load balancers, API gateways, queue systems)

Qualifications

Bachelor's or Master's degree in Computer Science or related field

4+ years of ML engineering experience, with 2+ years focused on model serving

Production experience with high-performance model serving frameworks (vLLM, SGLang, TensorRT-LLM, or similar)

Strong Python proficiency and PyTorch experience

Experience with model compilation and optimization (TensorRT, ONNX, quantization)

Track record of building inference systems at scale (10K+ QPS)

Understanding of attention mechanisms and transformer architectures

Experience with containerized deployment and orchestration

We Value

Contributions to open-source serving frameworks

Experience with continuous batching and advanced serving optimizations

Knowledge of GPU architecture and memory management

Background at companies with large-scale ML serving

Experience with streaming/iterative generation patterns

Genmo is an Equal Opportunity Employer. Candidates are evaluated without regard to age, race, color, religion, sex, disability, national origin, sexual orientation, veteran status, or any other characteristic protected by federal or state law. Genmo, Inc. is an E-Verify company and you may review the Notice of E-Verify Participation and the Right to Work posters in English and Spanish.
