Inference Engineer, Video AI

Cantina, San Francisco, CA, United States

30+ days ago

Job type

Full-time

Job description

A bit about Cantina:

Cantina, founded by Sean Parker, is a new social platform with the most advanced AI character creator. Build, share, and interact with AI bots and your friends directly in the Cantina or across the internet.

Cantina bots are lifelike, social creatures, capable of interacting wherever humans go on the internet. Recreate yourself using powerful AI, imagine someone new, or choose from thousands of existing characters. Bots are a new media type that offers creators a way to share infinitely scalable, personalized content experiences, combined with seamless group chat across voice, video, and text.

If you're excited about the potential AI has to shape human creativity and social interactions, join us in building the future!

A bit about the role:

We're looking for an Inference Engineer who specializes in productionizing and hosting video AI models at scale. You'll be responsible for taking cutting-edge neural networks from research to production, building robust inference infrastructure, and optimizing model performance for real-time applications. This role focuses on the deployment and serving of large video models.

As an Inference Engineer, you will:

Deploy video AI models to production - Take research models and build production-ready inference endpoints with APIs, ensuring efficient operation across cloud infrastructure.
Maintain and optimize inference systems - Debug complex model serving issues, reduce latency, monitor system health, and ensure 99.9% uptime for AI-powered features.
Implement model optimizations - Work with neural network architectures including diffusion networks, VAEs, and transformers. Apply streaming optimizations and understand video model architectures to implement effective performance improvements.
Manage inference infrastructure - Leverage containerization with Docker, cloud storage solutions like S3, and cluster computing to build scalable model serving infrastructure.
Collaborate with research teams - Work closely with AI researchers to understand model requirements, architectural constraints, and optimization opportunities for new video generation models.

A bit about you:

2+ years of ML engineering experience with a focus on model inference and deployment

Strong understanding of neural network architectures, particularly diffusion networks, VAEs, and transformer models

Experience with video and image models - Understanding of how video/image generation models work, their architectures, and optimization strategies specific to video processing

Multi-GPU inference expertise - Experience running model components across multiple GPUs and implementing parallel processing strategies for large models

Production model hosting experience - Track record of deploying and maintaining ML models in production environments, including streaming and real-time inference

Experience with containerization (Docker), AWS, and cluster computing environments

Familiarity with machine learning frameworks (PyTorch, TensorFlow)

Experience with inference platforms and model serving solutions

Technical Stack You'll Work With:

Cloud: AWS (S3, DynamoDB), Kubernetes clusters

ML Infrastructure: Model serving platforms, Docker

Languages: Python

Frameworks: PyTorch, TensorFlow

Models: Video generation models, diffusion networks, VAEs, transformers

Optimization: Multi-GPU inference, real-time processing techniques

Pay Equity:

In compliance with Pay Transparency Laws, the base salary range for this role is $175,000 to $225,000 for those located in the San Francisco Bay Area, New York City, and Seattle, WA. When determining compensation, a number of factors will be considered, including skills, experience, job scope, location, and competitive compensation market data.

Benefits:

Health Care - Cantina pays 99% of premiums for medical, vision, and dental, plus a One Medical membership.

Monthly Wellness Stipend - $500/month to use on whatever you'd like!

Rest and Recharge - 15 PTO days per year, 10 sick days, all Federal holidays, and 2 floating holidays.

401(k) - Eligible to participate on day one of employment.

Parental Leave & Fertility Support

Competitive Salary & Equity

Lunch and snacks provided for in-office employees.

WFH equipment provided for full-time hybrid/remote employees.
