Inference Engineer, Video AI

Cantina - San Francisco, CA, United States
30+ days ago
Job type
  • Full-time
Job description

A bit about Cantina:

Cantina, founded by Sean Parker, is a new social platform with the most advanced AI character creator. Build, share, and interact with AI bots and your friends directly in the Cantina or across the internet.

Cantina bots are lifelike, social creatures, capable of interacting wherever humans go on the internet. Recreate yourself using powerful AI, imagine someone new, or choose from thousands of existing characters. Bots are a new media type that offers a way for creators to share infinitely scalable and personalized content experiences combined with seamless group chat across voice, video, and text.

If you're excited about the potential AI has to shape human creativity and social interactions, join us in building the future!

A bit about the role:

We're looking for an Inference Engineer who specializes in productionizing and hosting video AI models at scale. You'll be responsible for taking cutting-edge neural networks from research to production, building robust inference infrastructure, and optimizing model performance for real-time applications. This role focuses on the deployment and serving of large video models.

As an Inference Engineer, you will:

  • Deploy video AI models to production - Take research models and build production-ready inference endpoints with APIs, ensuring efficient operation across cloud infrastructure.
  • Maintain and optimize inference systems - Debug complex model serving issues, optimize latency performance, monitor system health, and ensure 99.9% uptime for AI-powered features.
  • Implement model optimizations - Work with neural network architectures including diffusion networks, VAEs, and transformers. Apply streaming optimizations and understand video model architectures to implement effective performance improvements.
  • Manage inference infrastructure - Leverage containerization with Docker, cloud storage solutions like S3, and cluster computing to build scalable model serving infrastructure.
  • Collaborate with research teams - Work closely with AI researchers to understand model requirements, architectural constraints, and optimization opportunities for new video generation models.

A bit about you:

  • 2+ years of ML engineering experience with a focus on model inference and deployment
  • Strong understanding of neural network architectures, particularly diffusion networks, VAEs, and transformer models
  • Experience with video and image models - Understanding of how video/image generation models work, their architectures, and optimization strategies specific to video processing
  • Multi-GPU inference expertise - Experience running model components across multiple GPUs, implementing parallel processing strategies for large models
  • Production model hosting experience - Track record of deploying and maintaining ML models in production environments, including streaming and real-time inference
  • Experience with containerization (Docker), AWS, and cluster computing environments
  • Familiarity with machine learning frameworks (PyTorch, TensorFlow)
  • Experience with inference platforms and model serving solutions

Technical Stack You'll Work With:

  • Cloud: AWS (S3, DynamoDB), Kubernetes clusters
  • ML Infrastructure: Model serving platforms, Docker
  • Languages: Python
  • Frameworks: PyTorch, TensorFlow
  • Models: Video generation models, diffusion networks, VAEs, transformers
  • Optimization: Multi-GPU inference, real-time processing techniques

Pay Equity:

In compliance with Pay Transparency Laws, the base salary range for this role is between $175,000-$225,000 for those located in the San Francisco Bay Area, New York City and Seattle, WA. When determining compensation, a number of factors will be considered, including skills, experience, job scope, location, and competitive compensation market data.

Benefits:

  • Health Care - Cantina pays 99% of premiums for medical, vision, and dental, plus a One Medical membership.
  • Monthly Wellness Stipend - $500/month to use on whatever you'd like!
  • Rest and Recharge - 15 PTO days per year, 10 sick days, all Federal holidays, and 2 floating holidays.
  • 401(K) - Eligible to participate on day one of employment.
  • Parental Leave & Fertility Support
  • Competitive Salary & Equity
  • Lunch and snacks provided for in-office employees.
  • WFH equipment provided for full-time hybrid/remote employees.