Inference Engineer

Cartesia, Inc. • San Francisco, CA, United States
4 days ago
Job type
  • Full-time
Job description

About Cartesia

Our mission is to build the next generation of AI: ubiquitous, interactive intelligence that runs wherever you are. Today, not even the best models can continuously process and reason over a year-long stream of audio, video and text (1B text tokens, 10B audio tokens and 1T video tokens), let alone do this on-device.

We're pioneering the model architectures that will make this possible. Our founding team met as PhDs at the Stanford AI Lab, where we invented State Space Models (SSMs), a new primitive for training efficient, large-scale foundation models. Our team combines deep expertise in model innovation and systems engineering with a design-minded product engineering team to build and ship cutting-edge models and experiences.

We're funded by leading investors at Index Ventures and Lightspeed Venture Partners, along with Factory, Conviction, A Star, General Catalyst, SV Angel, Databricks and others. We're fortunate to have the support of many amazing advisors, and 90+ angels across many industries, including the world's foremost experts in AI.

Role Responsibilities

We're hiring an Inference Engineer to advance our mission of building real-time multimodal intelligence.

What You'll Do

  • Design and build a low-latency, scalable, and reliable model inference and serving stack for our cutting-edge foundation models using Transformers, SSMs and hybrid models.
  • Work closely with our research team and product engineers to serve our suite of products in a fast, cost-effective, and reliable manner.
  • Design and build robust inference infrastructure and monitoring for our products.
  • Have significant autonomy to shape our products and directly impact how cutting-edge AI is applied across various devices and applications.

What You'll Bring

Given the scale and difficulty of problems we work on, we value strong engineering skills at Cartesia.

  • Strong engineering skills, comfortable navigating complex codebases and monorepos.
  • An eye for craft and writing clean and maintainable code.
  • Experience building large-scale distributed systems with high demands on performance, reliability, and observability.
  • Technical leadership with the ability to execute and deliver zero-to-one results amidst ambiguity.
  • Experience designing best practices and processes for monitoring and scaling large-scale production systems.
  • Background in or experience working on inference pipelines with machine learning and generative models.
  • Experience working in CUDA, Triton or similar.

Our culture

We're an in-person team based out of San Francisco. We love being in the office, hanging out together and learning from each other every day.

We ship fast. All of our work is novel and cutting edge, and execution speed is paramount. We have a high bar, and we don't sacrifice quality and design along the way.

We support each other. We have an open and inclusive culture that's focused on giving everyone the resources they need to succeed.

Our perks

  • Lunch, dinner and snacks at the office.
  • Fully covered medical, dental, and vision insurance for employees.
  • 401(k).
  • Relocation and immigration support.
  • Your own personal Yoshi.
