Staff ML Infrastructure Engineer

Cubiq Recruitment, Hayward, CA, US

10 hours ago

Job type

Full-time

Job description

Staff / Lead ML Infrastructure Engineer

San Francisco, CA — Onsite

Salary: above market average, plus equity

We are building one of the world’s leading generative video and multimodal AI platforms, and we’re looking for a senior infrastructure engineer to drive the backbone that makes it possible. This role is ideal for an engineer from a top-tier tech company who has built cloud-scale systems, high-performance compute platforms, and battle-tested CI/CD pipelines that support complex ML workloads.

What You’ll Own

Core ML Platform Architecture: Design and evolve the infrastructure that supports large-scale generative video and multimodal model training, evaluation, and deployment.
High-Throughput Compute Systems: Build and optimize GPU/TPU clusters, distributed training systems, and orchestration layers tailored for video-heavy pipelines.
Production Reliability for Generative Models: Create the tooling and services needed to safely push frequent model updates while handling massive compute loads and long-running jobs.
End-to-End CI/CD for ML: Lead the development of automated pipelines for model training, validation, artifact management, and production rollout.
Multimodal Data Infrastructure: Build systems to ingest, version, transform, and serve large-scale video, audio, and text datasets with high reliability.
Internal Developer Experience: Partner with research, product, and applied ML teams to build intuitive internal tooling for experiment tracking, model lineage, and resource scheduling.
Technical Leadership: Mentor engineers, set platform standards, and influence long-term architectural direction.

What You’ve Done

Experience architecting and operating large-scale infrastructure at a cloud provider, hyperscaler, or leading AI company.

Built or owned mission-critical CI/CD systems, high-capacity compute platforms, or data infrastructure supporting ML teams.

Deep experience with distributed compute across GPUs/accelerators, Kubernetes, and cloud infrastructure (AWS/GCP/Azure).

Strong engineering fundamentals in Python, Go, or equivalent languages.

Previous exposure to ML training pipelines—especially systems that handle heavy video, multimodal, or high-dimensional data.

Demonstrated ability to lead complex cross-org initiatives and drive technical strategy.

Nice to Have

Experience with video processing systems, large-scale media pipelines, or streaming architectures.

Familiarity with modern multimodal or video-generation frameworks (PyTorch, JAX, diffusers, custom accelerators).

Experience with Ray, Triton, CUDA optimization, or specialized scheduling for ML workloads.

Background working in high-growth AI startups or research-focused environments.

Familiarity with security and compliance considerations for models that generate or process user content.

Why Join

Shape the underlying platform powering one of the most advanced generative video systems in the world.

Influence the future of multimodal AI by building infrastructure that directly accelerates research and product breakthroughs.

Work closely with experienced founding engineers, researchers, and platform builders from leading tech companies.

Highly competitive compensation, meaningful equity, and a strong in-person engineering culture in San Francisco.
