Inference Engineer

Cartesia, Inc.

San Francisco, CA, United States

4 days ago

Job type

Full-time

Job description

About Cartesia

Our mission is to build the next generation of AI: ubiquitous, interactive intelligence that runs wherever you are. Today, not even the best models can continuously process and reason over a year-long stream of audio, video and text (1B text tokens, 10B audio tokens and 1T video tokens), let alone do this on-device.

We're pioneering the model architectures that will make this possible. Our founding team met as PhDs at the Stanford AI Lab, where we invented State Space Models (SSMs), a new primitive for training efficient, large-scale foundation models. Our team combines deep expertise in model innovation and systems engineering with a design-minded product engineering team to build and ship cutting-edge models and experiences.

We're funded by leading investors at Index Ventures and Lightspeed Venture Partners, along with Factory, Conviction, A Star, General Catalyst, SV Angel, Databricks and others. We're fortunate to have the support of many amazing advisors and 90+ angels across many industries, including the world's foremost experts in AI.

Role Responsibilities

We're hiring an Inference Engineer to advance our mission of building real-time multimodal intelligence.

What You'll Do

Design and build a low-latency, scalable, and reliable model inference and serving stack for our cutting-edge foundation models built on Transformers, SSMs and hybrid architectures.
Work closely with our research team and product engineers to serve our suite of products in a fast, cost-effective, and reliable manner.
Design and build robust inference infrastructure and monitoring for our products.
Have significant autonomy to shape our products and directly impact how cutting-edge AI is applied across various devices and applications.

What You'll Bring

Given the scale and difficulty of problems we work on, we value strong engineering skills at Cartesia.

Strong engineering skills and comfort navigating complex codebases and monorepos.

An eye for craft and a habit of writing clean, maintainable code.

Experience building large-scale distributed systems with high demands on performance, reliability, and observability.

Technical leadership with the ability to execute and deliver zero-to-one results amidst ambiguity.

Experience designing best practices and processes for monitoring and scaling large-scale production systems.

Background in or experience working on inference pipelines with machine learning and generative models.

Experience working in CUDA, Triton or similar.

Our culture

We're an in-person team based out of San Francisco. We love being in the office, hanging out together and learning from each other every day.

We ship fast. All of our work is novel and cutting edge, and execution speed is paramount. We have a high bar, and we don't sacrifice quality and design along the way.

We support each other. We have an open and inclusive culture that's focused on giving everyone the resources they need to succeed.

Our perks

Lunch, dinner and snacks at the office.

Fully covered medical, dental, and vision insurance for employees.

401(k).

Relocation and immigration support.

Your own personal Yoshi.
