Machine Learning Jobs in Berkeley, CA
Machine learning • Berkeley, CA
- Machine Learning Engineer | Scouto AI | San Francisco, California, United States (Promoted)
- Machine Learning Engineer | Scribd | San Francisco, California, United States (Promoted)
- Machine Learning Engineer | Goodfire | San Francisco, California, United States (Promoted)
- Machine Learning Engineer | Avatar Robotics | San Francisco, California, United States (Promoted)
- Machine Learning Engineer | Willing Tech | San Francisco, California, United States (Promoted)
- Machine Learning Engineer | Happy Elements | San Francisco, California, United States (Promoted)
- Machine Learning Engineer | Two Dots Inc | San Francisco, CA, United States (Promoted)
- Machine Learning Engineer | Monarch Recruiters | San Francisco, California, United States (Promoted)
- Machine Learning Engineer | Akkodis | San Francisco, California, United States (Promoted)
- Machine Learning Researcher | Antler | San Francisco, CA, United States (Promoted)
- Machine Learning Engineer | Baselayer | San Francisco, California, United States (Promoted)
- Machine Learning | Vewotechnologies | San Francisco, CA, United States (Promoted)
- Machine Learning Engineer | Aquabyte | San Francisco, California, United States (Promoted)
- Machine Learning Engineer | Middesk | San Francisco, California, United States (Promoted)
- Machine Learning Engineer | VirtualVocations | San Francisco, California, United States (Promoted)
- Machine Learning | Pantera Capital | San Francisco, CA, United States (Promoted)
- Machine Learning Engineer | Bland.ai, Inc. | San Francisco, CA, United States (Promoted)
- Machine Learning Engineer | Reveal Health Tech | San Francisco, CA, US
- Machine Learning Researcher | Alljoined, Inc. | San Francisco, CA, United States (Promoted)

Machine Learning Engineer
Scouto AI | San Francisco, California, United States | Full-time
Overview
We are building a distributed LLM inference network that combines idle GPU capacity from around the world into a single cohesive plane of compute for running large language models like DeepSeek and Llama 4. At any given moment, we have over 5,000 GPUs and hundreds of terabytes of VRAM connected to the network. We are a small, well-funded team working on difficult, high-impact problems at the intersection of AI and distributed systems. We primarily work in person from our office in downtown San Francisco.
Responsibilities
- Design and implement optimization techniques to increase model throughput and reduce latency across our suite of models
- Deploy and maintain large language models at scale in production environments
- Deploy new models as they are released by frontier labs
- Implement techniques like quantization, speculative decoding, and KV cache reuse (see the sketch after this list)
- Contribute regularly to open source projects such as SGLang and vLLM
- Dive deep into the underlying codebases of TensorRT, PyTorch, TensorRT-LLM, vLLM, SGLang, CUDA, and other libraries to debug ML performance issues
- Collaborate with the engineering team to bring new features and capabilities to our inference platform
- Develop robust and scalable infrastructure for AI model serving
- Create and maintain technical documentation for inference systems
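For context on the serving and optimization work described above, the minimal sketch below shows one way such a deployment might look with vLLM, enabling AWQ quantization and prefix caching (one form of KV cache reuse). The model name and every parameter value here are illustrative assumptions, not details from the role.

```python
# Illustrative sketch only (assumed model and settings): serve an LLM with vLLM,
# enabling weight quantization and prefix caching (KV cache reuse).
from vllm import LLM, SamplingParams

llm = LLM(
    model="TheBloke/Llama-2-13B-AWQ",  # placeholder AWQ-quantized checkpoint
    quantization="awq",                # load 4-bit AWQ weights
    tensor_parallel_size=2,            # shard across 2 GPUs (assumed topology)
    gpu_memory_utilization=0.90,       # leave most VRAM available for the KV cache
    enable_prefix_caching=True,        # reuse KV cache entries for shared prompt prefixes
)

sampling = SamplingParams(temperature=0.7, top_p=0.95, max_tokens=256)

prompts = [
    "Explain KV cache reuse to a new engineer.",
    "Summarize speculative decoding in one paragraph.",
]
for output in llm.generate(prompts, sampling):
    print(output.outputs[0].text)
```

Speculative decoding and more aggressive cache-reuse strategies would layer on top of a setup like this; the exact flags vary between vLLM and SGLang releases, so the current documentation is the authority.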
Requirements
- 3+ years of experience writing high-performance, production-quality code
- Strong proficiency with Python and deep learning frameworks, particularly PyTorch
- Demonstrated experience with LLM inference optimization techniques
- Hands-on experience with SGLang and vLLM; contributions to these projects strongly preferred
- Familiarity with Docker and Kubernetes for containerized deployments
- Experience with CUDA programming and GPU optimization (a short profiling sketch follows this list)
- Strong understanding of distributed systems and scalability challenges
- Proven track record of optimizing AI models for production environments
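For the GPU optimization and performance-debugging expectations above, a first step often looks like the sketch below: a toy PyTorch model profiled with torch.profiler to see which CUDA kernels dominate. The model, shapes, and batch size are assumptions for illustration, and a CUDA-capable GPU is required.

```python
# Illustrative sketch only: profile a toy forward pass to find GPU hotspots.
import torch
from torch.profiler import profile, ProfilerActivity

model = torch.nn.Sequential(
    torch.nn.Linear(4096, 4096),
    torch.nn.GELU(),
    torch.nn.Linear(4096, 4096),
).cuda().half()

x = torch.randn(64, 4096, device="cuda", dtype=torch.half)

# Warm up so kernel launches and allocator behavior are steady-state.
for _ in range(3):
    model(x)
torch.cuda.synchronize()

with profile(activities=[ProfilerActivity.CPU, ProfilerActivity.CUDA]) as prof:
    model(x)
    torch.cuda.synchronize()

# Sort by CUDA time to see where the GPU budget actually goes.
print(prof.key_averages().table(sort_by="cuda_time_total", row_limit=10))
```

From a table like this, the next steps are usually the ones the posting lists: replacing or fusing hot kernels, quantizing weights, or adjusting serving parameters, then re-measuring.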
Nice to Have
- Familiarity with TensorRT and TensorRT-LLM
- Knowledge of vision models and multimodal AI systems
- Experience implementing techniques like quantization and speculative decoding
- Contributions to open source machine learning projects
- Experience with large-scale distributed computing
Compensation
We offer competitive compensation, equity in a high-growth startup, and comprehensive benefits. The base salary range for this role is $180,000 - $250,000, plus equity and benefits including:
- Full healthcare coverage
- Quarterly offsites
- Flexible PTO
Skills: PyTorch, GPU optimization, deep learning frameworks, SGLang, vLLM, CUDA programming, machine learning, Python, LLM