Talent.com

Performance engineer Jobs in Berkeley, CA

Last updated: 2 hours ago
Machine Learning Engineer - Model Performance
Inference
San Francisco, California, United States
Full-time
Machine Learning Engineer to join our team, focusing on optimizing the performance of our cutting-edge AI inference systems. This role involves working with state-of-the-art large language models an...
Last updated: 29 days ago
Senior AI Performance Engineer
Genmo
San Francisco, California, United States
Full-time
We are Genmo, a research lab dedicated to building open, state-of-the-art models for video generation towards unlocking the right brain of AGI. Join us in shaping the future of AI and pushing the bo...
Last updated: 30+ days ago
  • Promoted
Senior Propulsion and Aircraft Performance Engineer
Pyka
Alameda, CA, US
Full-time
Senior Propulsion And Aircraft Performance Engineer. Pyka is looking for a highly skilled Senior Propulsion and Aircraft Performance Engineer to lead the design, testing, and optimization of our pro...
Last updated: 13 days ago
  • Promoted
AI Agent Software Engineer - Agent Performance Engineering
Assembled
San Francisco, CA, US
Full-time
Agent Performance Engineering Role. As part of the Agent Performance Engineering team, you'll be working on the core systems that make our AI agents smarter, more accurate, and more capable of handl...
Last updated: 13 days ago
  • Promoted
Performance Engineer
Anthropic
San Francisco, CA, US
Full-time
Research Engineer, Frontier Red Team (Rsp Evaluations). San Francisco, CA | Seattle, WA. Research Scientist, Frontier Red Team (Autonomy). Remote-Friendly (Travel-Required) | San Francisco, CA | Seatt...
Last updated: 16 days ago
Senior HPC Performance Engineer
NVIDIA
Remote, CA, US
Remote
Full-time
As a member of our team in NVIDIA's NVHPC compilers & tools group, you will analyze and run High Performance Computing (HPC) applications on HPC servers and systems to gain insight into the per...
Last updated: 30+ days ago
Sr Building Performance Engineer
HGA
San Francisco, CA, US
Full-time
HGA is an award winning architectural, engineering and planning firm with a full-time opportunity for a talented, ambitious. HGA's Building Performance presence in the West Coast. We define a buildin...
Last updated: 30+ days ago
  • Promoted
Principal AI Performance Engineer
Epoch Biodesign
San Francisco, CA, United States
Full-time
Crusoe is building the World’s Favorite AI-first Cloud infrastructure company. We’re pioneering vertically integrated, purpose-built AI infrastructure solutions trusted by Fortune 500 companies to p...
Last updated: 4 days ago
  • Promoted
Performance Analyst
Callan
San Francisco, CA, US
Full-time
Callan is seeking a new team member to join our Client Report Services (CRS) team in the following Callan locations: Portland, OR, or San Francisco headquarters. As a Performance Analyst in the Clie...
Last updated: 13 days ago
  • Promoted
Principal AI Performance Engineer
Crusoe
San Francisco, CA, US
Full-time
Crusoe is building the World's Favorite AI-first Cloud infrastructure company. We're pioneering vertically integrated, purpose-built AI infrastructure solutions trusted by Fortune 500 companies to po...
Last updated: 17 days ago
QE Lead Performance Engineer
US012 Marsh & McLennan Agency LLC
California, San Francisco
Full-time
Award-winning, inclusive, Top Workplace culture doesn’t happen overnight. It’s a result of hard work by extraordinary people. The industry’s brightest talent drive our efforts to deliver purposeful w...
Last updated: 30+ days ago
  • Promoted
  • New!
Lead Performance Tester / Engineer
Diverse Lynx
San Francisco, CA, US
Full-time
Design and implement comprehensive performance testing strategies for web, mobile, and backend systems. Develop, maintain, and execute performance test scripts using LoadRunner and NeoLoad and famil...
Last updated: 2 hours ago
  • Promoted
Lead CPU Performance Engineer
VirtualVocations
Oakland, California, United States
Full-time
A company is looking for a Lead CPU Performance Analysis Engineer. Key Responsibilities: Conduct performance analysis of CPU architectures and designs. Develop and implement performance modeling to...
Last updated: 3 days ago
  • Promoted
Machine Learning Engineer - Model Performance
SOLANA FOUNDATION
San Francisco, CA, United States
Full-time
Machine Learning Engineer to join our team, focusing on optimizing the performance of our cutting-edge AI inference systems. This role involves working with state-of-the-art large language models an...
Last updated: 12 days ago
  • Promoted
Sr. Performance Engineer San Francisco, California
Databricks Inc.
San Francisco, CA, US
Full-time
At Databricks, we are passionate about enabling data teams to solve the world's toughest problems. We do this by building and running the world's best data and AI infrastructure platform so our cust...
Last updated: 30+ days ago
Sr. Software Engineer - Performance
Databricks
San Francisco, California
Full-time
At Databricks, we are passionate about enabling data teams to solve the world's toughest problems. We do this by building and running the world's best data and AI infrastructure platform so our cust...
Last updated: 30+ days ago
  • Promoted
Performance engineer
Writer
San Francisco, CA, US
Full-time
Writer is seeking a highly skilled and motivated Principal Performance Engineer to lead the performance optimization of our cutting-edge Generative AI technology stack. This role is critical in ensu...
Last updated: 17 days ago
  • Promoted
Performance Engineer
OpenAI
San Francisco, CA, US
Full-time
OpenAI is looking for an experienced Performance Engineer to help us scale the performance, reliability, and efficiency of our systems. In this role, you'll apply deep technical expertise to optimiz...
Last updated: 16 days ago
  • Promoted
Performance engineer
writer.com
San Francisco, CA, US
Full-time
Writer is seeking a highly skilled and motivated Principal performance engineer to lead the performance optimization of our cutting-edge Generative AI technology stack. This role is critical in ensu...
Last updated: 30+ days ago
  • Promoted
performance Engineer / tester
Omega Solutions Inc.
San Francisco, CA, US
Full-time
This is Ashok from Omega Solutions. This is regarding an immediate opening for Performance Engineer. Please find the below description and let me know your interest. Designs, configures and runs perfo...
Last updated: 30+ days ago
Machine Learning Engineer - Model Performance
Inference
San Francisco, California, United States
29 days ago
Job type
  • Full-time
Job description

Inference.net is seeking a Machine Learning Engineer to join our team, focusing on optimizing the performance of our cutting-edge AI inference systems. This role involves working with state-of-the-art large language models and ensuring they run efficiently at scale. You will be responsible for deploying these models in production and performing optimizations to increase throughput and enable new features. This position offers the chance to collaborate closely with our engineering team and make significant contributions to open source projects such as SGLang and vLLM.

About Inference.net

We are building a distributed LLM inference network that combines idle GPU capacity from around the world into a single cohesive plane of compute that can be used for running large language models like DeepSeek and Llama 4. At any given moment, we have over 5,000 GPUs and hundreds of terabytes of VRAM connected to the network.

We are a small, well-funded team working on difficult, high-impact problems at the intersection of AI and distributed systems. We primarily work in person from our office in downtown San Francisco. Our investors include a16z CSX and Multicoin. We are high-agency, adaptable, and collaborative. We value creativity alongside technical prowess and humility. We work hard and deeply enjoy the work that we do.

Responsibilities

Design and implement optimization techniques to increase model throughput and reduce latency across our suite of models

Deploy and maintain large language models at scale in production environments

Deploy new models as they are released by frontier labs

Implement techniques like quantization, speculative decoding, and KV cache reuse

Contribute regularly to open source projects such as SGLang and vLLM

Deep dive into underlying codebases of TensorRT, PyTorch, TensorRT-LLM, vLLM, SGLang, CUDA, and other libraries to debug ML performance issues

Collaborate with the engineering team to bring new features and capabilities to our inference platform

Develop robust and scalable infrastructure for AI model serving

Create and maintain technical documentation for inference systems

Requirements

3+ years of experience writing high-performance, production-quality code

Strong proficiency with Python and deep learning frameworks, particularly PyTorch

Demonstrated experience with LLM inference optimization techniques

Hands-on experience with SGLang and vLLM, with contributions to these projects strongly preferred

Familiarity with Docker and Kubernetes for containerized deployments

Experience with CUDA programming and GPU optimization

Strong understanding of distributed systems and scalability challenges

Proven track record of optimizing AI models for production environments

Nice to Have

Familiarity with TensorRT and TensorRT-LLM

Knowledge of vision models and multimodal AI systems

Experience implementing techniques like quantization and speculative decoding

Contributions to open source machine learning projects

Experience with large-scale distributed computing

Compensation

We offer competitive compensation, equity in a high-growth startup, and comprehensive benefits. The base salary range for this role is $180,000 - $250,000, plus competitive equity and benefits including:

Full healthcare coverage

Quarterly offsites

Flexible PTO

Equal Opportunity

Inference.net is an equal opportunity employer. We welcome applicants from all backgrounds and don't discriminate based on race, color, religion, gender, sexual orientation, national origin, genetics, disability, age, or veteran status.

If you're passionate about building the next generation of high-performance systems that push the boundaries of what's possible with large language models, we want to hear from you!