Inference Software Engineer

ETCHED LLC • Cupertino, CA, United States
30+ days ago
Job type
  • Full-time
Job description

About Etched

Etched is building AI chips that are hard-coded for individual model architectures. Our first product (Sohu) only supports transformers, but has an order of magnitude more throughput and lower latency than a B200. With Etched ASICs, you can build products that would be impossible with GPUs, like real-time video generation models and extremely deep & parallel chain-of-thought reasoning agents.

Key responsibilities

  • Contribute to the architecture and design of the Sohu host software stack
  • Implement high-performance, modular code across the complete Etched software stack, which is a mix of Rust, C++, and Python
  • Interface with the firmware and driver teams to deliver the highest-performance HW / SW stack
  • Work with AI model researchers and product-facing teams building out the Etched serving front-end

Representative projects

  • Build scheduling logic for handling continuous batching and real-time inference
  • Implement inference-time acceleration techniques such as speculative decoding, tree search, KV cache sharing, etc.
  • Implement distributed networking primitives for efficient multi-server inference

You may be a good fit if you have

  • Experience with C++ and Python
  • Familiarity with transformer model architectures and inference serving stacks (vLLM, SGLang, etc.) or experience working in distributed inference / training environments
  • Experience working cross-functionally in large software and hardware organizations

Strong candidates may also have

  • Experience with Rust
  • Familiarity with GPU kernels, the CUDA compilation stack and related tools, or other hardware accelerators
  • Understanding of distributed systems, networking, and parallel programming

Benefits

  • Full medical, dental, and vision packages, with 100% of premiums covered
  • Housing subsidy of $2,000 / month for those living within walking distance of the office
  • Daily lunch and dinner in our office
  • Relocation support for those moving to Cupertino

How we're different

Etched believes in the Bitter Lesson. We think most of the progress in the AI field has come from using more FLOPs to train and run models, and the best way to get more FLOPs is to build model-specific hardware. Larger and larger training runs encourage companies to consolidate around fewer model architectures, which creates a market for single-model ASICs.

We are a fully in-person team in Cupertino, and greatly value engineering skills. We do not have boundaries between engineering and research, and we expect all of our technical staff to contribute to both as needed.
