Talent.com
Senior ML infrastructure engineer

Kuzco
San Francisco, CA, United States
23 hours ago
Job type
  • Full-time
Job description

Overview

Kuzco is seeking a Senior ML Infrastructure Engineer to join our team. This role involves developing large-scale, fault-tolerant systems that handle millions of large language model inference requests per day. If you are passionate about developing next-generation ML systems that operate at scale, we want to hear from you.

About Kuzco

We are building a distributed LLM inference network that combines idle GPU capacity from around the world into a single cohesive plane of compute for running large language models like Llama and Mistral. At any given moment, we have over 5,000 GPUs and hundreds of terabytes of VRAM connected to the network.

We are a small, well-funded team of staff-level engineers who work in-person in downtown San Francisco on difficult, high-impact engineering problems. Everyone on the team has been writing code for over 10 years, and has founded and run their own software companies. We are high-agency, adaptable, and collaborative. We value creativity alongside technical prowess and humility. We work hard, and deeply enjoy the work that we do; we are almost always online at least six days per week.

About the Role

You will be responsible for designing and implementing the core systems that power our globally distributed LLM inference network. You'll work on problems at the intersection of distributed systems, machine learning, and resource optimization.

Key Responsibilities

  • Design and implement scalable distributed systems for our inference network
  • Develop models for efficient resource allocation across a network of heterogeneous hardware with a rapidly changing topology
  • Optimize network latency, throughput, and availability
  • Build robust logging and metrics systems to monitor network health and performance
  • Conduct reviews of architecture and system design to ensure use of best practices
  • Collaborate with founders, engineers, and other stakeholders to improve our infrastructure and product offerings

What We're Looking For

  • Very strong problem-solving skills and ability to work in a startup environment
  • 5+ years of experience building high-performance systems
  • Strong programming skills in TypeScript, Python, and one of Go, Rust, or C++
  • Solid understanding of distributed systems concepts
  • Knowledge of orchestrators and schedulers like Kubernetes and Nomad
  • Use of AI tooling in development workflow (ChatGPT, Claude, Cursor, etc.)
  • Experience with LLM inference engines like vLLM or TensorRT-LLM is a plus
  • Experience with GPU programming and optimization (CUDA experience is a plus)

Compensation

We offer competitive compensation, equity in a high-growth startup, and comprehensive benefits. The base salary range for this role is $180,000 - $250,000, plus equity and benefits, depending on experience.

Equal Opportunity

Kuzco is an equal opportunity employer. We welcome applicants from all backgrounds and don't discriminate based on race, color, religion, gender, sexual orientation, national origin, genetics, disability, age, or veteran status.

If you're excited about building the future of developer-first AI infrastructure, we'd love to hear from you. Please send your resume, LinkedIn, and GitHub to sam@kuzco.xyz.
