Software Engineer - Applied Inference

xAI • Palo Alto, CA, United States

22 hours ago

Job type

Full-time

Job description

About xAI

xAI's mission is to create AI systems that can accurately understand the universe and aid humanity in its pursuit of knowledge. Our team is small, highly motivated, and focused on engineering excellence. This organization is for individuals who appreciate challenging themselves and thrive on curiosity. We operate with a flat organizational structure. All employees are expected to be hands-on and to contribute directly to the company's mission. Leadership is given to those who show initiative and consistently deliver excellence. Work ethic and strong prioritization skills are important. All engineers are expected to have strong communication skills. They should be able to concisely and accurately share knowledge with their teammates.

Tech Stack

Kubernetes
Buildkite / ArgoCD
Prometheus / Grafana / PagerDuty
Pulumi / Terraform
SGLang: This team leads the development of SGLang, one of the most popular open-source inference engines. You will have the opportunity to work on open-source projects.
Custom debugging and tracing tools

Focus

Architect and implement scalable distributed infrastructure for model serving, such as load balancing, autoscaling, batch scheduling, and global KV cache systems.

Ensure the reliability of inference services, targeting 100% uptime, a 0% error rate, and good tail performance, through proactive monitoring, fault-tolerant designs, and rigorous testing.

Create custom tools to trace, replay, and fix issues or crashes across the entire stack, from cluster orchestration to GPU kernels.

Benchmark and fine-tune inference engines to deliver optimal performance under diverse production workloads.

Develop robust CI/CD infrastructure to enable seamless endpoint deployment, image publishing, feature rollouts, and inference engine updates.

Ideal Experiences

Worked on large-scale, high-concurrency production serving.

Worked on GPU inference engines.

Worked on testing, benchmarking, and the reliability of inference services.

Worked on designing and implementing CI/CD infrastructure.

Location

The role is based in the Bay Area (San Francisco and Palo Alto). Candidates are expected to be located near the Bay Area or open to relocation.

Interview Process

After submitting your application, the team reviews your CV and statement of exceptional work. If your application passes this stage, you will be invited to a 15-minute interview ("phone interview") during which a member of our team will ask some basic questions. If you clear the initial phone interview, you will enter the main process, which consists of four technical interviews:

Coding assessment in a language of your choice.

Systems hands-on: Demonstrate practical skills in a live problem-solving session.

Project deep-dive: Present your past exceptional work to a small audience.

Meet and greet with the wider team.

Our goal is to finish the main process within one week. All interviews will be conducted via Google Meet.

Annual Salary Range

$180,000 - $440,000 USD

Benefits

Base salary is just one part of our total rewards package at xAI, which also includes equity, comprehensive medical, vision, and dental coverage, access to a 401(k) retirement plan, short- and long-term disability insurance, life insurance, and various other discounts and perks.

xAI is an equal opportunity employer.

California Consumer Privacy Act (CCPA) Notice
