Talent.com
Software Engineer - Applied Inference
xAI • Palo Alto, CA, United States
22 hours ago
Job type
  • Full-time
Job description

About xAI

xAI's mission is to create AI systems that can accurately understand the universe and aid humanity in its pursuit of knowledge. Our team is small, highly motivated, and focused on engineering excellence. This organization is for individuals who appreciate challenging themselves and thrive on curiosity. We operate with a flat organizational structure. All employees are expected to be hands-on and to contribute directly to the company's mission. Leadership is given to those who show initiative and consistently deliver excellence. Work ethic and strong prioritization skills are important. All engineers are expected to have strong communication skills. They should be able to concisely and accurately share knowledge with their teammates.

Tech Stack

  • Kubernetes
  • Buildkite / ArgoCD
  • Prometheus / Grafana / PagerDuty
  • Pulumi / Terraform
  • SGLang: This team is leading the development of SGLang, one of the most popular open-source inference engines. You will have the opportunity to work on open-source projects.
  • Custom debugging and tracing tools

Focus

  • Architect and implement scalable distributed infrastructure for model serving, such as load balancing, autoscaling, batch scheduling, and global KV cache systems.
  • Ensure the reliability of inference services, targeting 100% uptime, a 0% error rate, and good tail performance, through proactive monitoring, fault-tolerant designs, and rigorous testing.
  • Create custom tools to trace, replay, and fix issues or crashes across the entire stack, from cluster orchestration to GPU kernels.
  • Benchmark and fine-tune inference engines to deliver optimal performance under diverse production workloads.
  • Develop robust CI/CD infrastructure to enable seamless endpoint deployment, image publishing, feature rollouts, and inference engine updates.

Ideal Experiences

  • Worked on large-scale, high-concurrency production serving.
  • Worked on GPU inference engines.
  • Worked on testing, benchmarking, and the reliability of inference services.
  • Worked on designing and implementing CI/CD infrastructure.

Location

The role is based in the Bay Area (San Francisco and Palo Alto). Candidates are expected to be located near the Bay Area or open to relocation.

Interview Process

After submitting your application, the team reviews your CV and statement of exceptional work. If your application passes this stage, you will be invited to a 15-minute interview ("phone interview") during which a member of our team will ask some basic questions. If you clear the initial phone interview, you will enter the main process, which consists of four technical interviews:

  • Coding assessment in a language of your choice.
  • Systems hands-on: Demonstrate practical skills in a live problem-solving session.
  • Project deep-dive: Present your past exceptional work to a small audience.
  • Meet and greet with the wider team.

Our goal is to finish the main process within one week. All interviews will be conducted via Google Meet.

Annual Salary Range

$180,000 - $440,000 USD

Benefits

Base salary is just one part of our total rewards package at xAI, which also includes equity; comprehensive medical, vision, and dental coverage; access to a 401(k) retirement plan; short- and long-term disability insurance; life insurance; and various other discounts and perks.

xAI is an equal opportunity employer.

California Consumer Privacy Act (CCPA) Notice
