Talent.com
Inference Software Engineer

Etched • San Jose, California, United States
30+ days ago
Job type
  • Full-time
Job description

About Etched

Etched is building AI chips that are hard-coded for individual model architectures. Our first product (Sohu) only supports transformers, but has an order of magnitude more throughput and lower latency than a B200. With Etched ASICs, you can build products that would be impossible with GPUs, like real-time video generation models and extremely deep & parallel chain-of-thought reasoning agents.

Job Summary

Etched’s Inference Software team enables the optimal mapping of models to Sohu’s dataflow architecture and the serving of requests across multiple chips, hosts, and racks. We are seeking a highly skilled and motivated engineer to join our team as we work towards enabling Mixture-of-Experts (MoE) architectures on Sohu systems. You’ll build software that enables frontier inference performance to satisfy exponentially growing serving demand.

This role is for a generalist, who will be expected to contribute to all parts of our stack. We also have more specialized roles for this team posted on the site.

Key responsibilities

  • Support porting state-of-the-art models to our architecture, and help build programming abstractions and testing capabilities to rapidly iterate on model porting
  • Scale and enhance Sohu’s runtime, including multi-node inference, intra-node execution, state management, and robust error handling
  • Optimize routing and communication layers using Sohu’s collectives
  • Develop tools for performance profiling and debugging, identifying bottlenecks and correctness issues

You may be a good fit if you have

  • Proficiency in Rust and/or C++
  • Good familiarity with PyTorch and/or JAX
  • Good familiarity with transformer architectures
  • Experience porting applications to non-standard or accelerator hardware platforms
  • Solid systems knowledge, including Linux internals, accelerator architectures (e.g., GPUs, TPUs), and high-speed interconnects (e.g., NVLink, InfiniBand)

Strong candidates may also have experience with

  • Developing low-latency, high-performance applications using both kernel-level and user-space networking stacks
  • Distributed systems concepts, algorithms, and challenges, including consensus protocols, consistency models, and communication patterns
  • Large language model architectures, particularly Mixture-of-Experts (MoE)
  • Analyzing performance traces and logs from distributed systems and ML workloads
  • Building applications with extensive SIMD (Single Instruction, Multiple Data) optimizations for performance-critical paths
  • Cluster orchestration tools (e.g., Kubernetes, Slurm) and ML platforms (e.g., Ray, Kubeflow)
  • Designing and implementing CI/CD pipelines for MLOps workflows

Benefits

  • Full medical, dental, and vision packages, with generous premium coverage
  • Housing subsidy of $2,000/month for those living within walking distance of the office
  • Daily lunch and dinner in our office
  • Relocation support for those moving to West San Jose

Compensation Range

$175,000 - $275,000

How we’re different

Etched believes in the Bitter Lesson. We think most of the progress in the AI field has come from using more FLOPs to train and run models, and the best way to get more FLOPs is to build model-specific hardware. Larger and larger training runs encourage companies to consolidate around fewer model architectures, which creates a market for single-model ASICs.

We are a fully in-person team in West San Jose, and greatly value engineering skills. We do not have boundaries between engineering and research, and we expect all of our technical staff to contribute to both as needed.
