Machine Learning Co-Design Researcher

ETCHED LLC • San Jose, CA, United States
30+ days ago
Job type
  • Full-time
Job description

About Etched

Etched is building the world's first AI inference system purpose-built for transformers, delivering over 10x higher performance and dramatically lower cost and latency than a B200. With Etched ASICs, you can build products that would be impossible with GPUs, like real-time video generation models and extremely deep and parallel chain-of-thought reasoning agents. Backed by hundreds of millions from top-tier investors and staffed by leading engineers, Etched is redefining the infrastructure layer for the fastest-growing industry in history.

Key responsibilities

  • Translate core mathematical operations from transformer models into optimized operation sequences for Sohu
  • Develop and leverage a deep understanding of Sohu to co-design both HW instructions and model architecture operations to maximize model performance
  • Implement high-performance software components for the Model Toolkit
  • Collaborate with hardware engineers to maximize chip utilization and minimize latency
  • Implement efficient batching strategies and execution plans for inference workloads
  • Design and implement cutting-edge inference-time compute scaling methods
  • Alter and fine-tune model architectures or inference-time compute algorithms
  • Contribute to the evolution of our system architecture and programming model

Representative projects

  • Optimize operation sequences to maximize utilization of Sohu's computational resources for specific transformer architectures such as Llama 4
  • Research and implement efficient memory management for KV cache sharing and prefix optimization
  • Develop algorithms for continuous batching and batch interleaving to improve throughput and/or latency
  • Research and implement model-specific inference-time acceleration algorithms, such as speculative decoding, tree search, KV cache sharing, and priority scheduling, by interacting with the rest of the inference serving stack
  • Research and implement structured decoding and novel sampling algorithms for reasoning models

You may be a good fit if you have

  • Co-design expertise across both SW and HW domains
  • Strong software engineering skills with systems programming experience
  • Deep knowledge of transformer model architectures and/or inference serving stacks (vLLM, SGLang, etc.)
  • Strong mathematical skills, esp. in linear algebra
  • Ability to reason about performance bottlenecks and optimization opportunities
  • Experience working cross-functionally in diverse software and hardware organizations

Strong candidates may also have experience with

  • Experience with hardware accelerators, ASICs, or FPGAs
  • Experience with Rust programming language
  • Deep expertise in ML systems engineering and hardware/software co-design with demonstrated impact (contributions to open-source projects or published papers)
  • Track record of optimizing large co-designed SW/HW systems

Benefits

  • Full medical, dental, and vision packages, with generous premium coverage
  • Housing subsidy of $2,000/month for those living within walking distance of the office
  • Daily lunch and dinner in our office
  • Relocation support for those moving to West San Jose

Compensation range

  • $150,000 - $275,000

How we're different

Etched believes in the Bitter Lesson. We think most of the progress in the AI field has come from using more FLOPs to train and run models, and the best way to get more FLOPs is to build model-specific hardware. Larger and larger training runs encourage companies to consolidate around fewer model architectures, which creates a market for single-model ASICs.

We are a fully in-person team in West San Jose, and greatly value engineering skills. We do not have boundaries between engineering and research, and we expect all of our technical staff to contribute to both as needed.
