Talent.com
GPU Fleet Engineer – Hyperscale Infra, Kubernetes & AI

OpenAI • San Francisco, CA, United States
12 hours ago
Job type
  • Full-time
Job description

Join a forward-thinking company as an engineer in the fleet infrastructure team, where you'll design and operate systems for one of the largest GPU fleets globally. This role offers the chance to work in a hybrid setting while contributing to cutting-edge AI capabilities. Your expertise in hyperscale compute systems and programming will be crucial in shaping infrastructure that supports model deployment and training. Collaborate with diverse teams to ensure high reliability and utilization, all while being part of a mission-driven organization that values safety and human needs in AI development.

Related jobs
GPU Infra Performance Engineer - Benchmarks & Trials

Hyperbolic Labs • San Francisco, CA, United States
Full-time
A tech-driven AI company in San Francisco is seeking an individual to manage customer performance trials, running benchmarks and optimizing configurations. The ideal candidate has strong GPU cloud i...
Last updated: 3 days ago • Promoted
Senior GPU HPC Platform Reliability Engineer

OpenAI • San Francisco, CA, United States
Full-time
A leading AI research company in San Francisco is seeking a software engineer for its Fleet High Performance Computing team. In this role, you'll ensure the reliability and uptime of the compute fle...
Last updated: 12 hours ago • Promoted • New!
Software Engineer, GPU Infrastructure - HPC

OpenAI • San Francisco, CA, United States
Full-time
The Fleet team at OpenAI supports the computing environment that powers our cutting-edge research and product development. We oversee large-scale systems that span data centers, GPUs, networking, an...
Last updated: 30+ days ago • Promoted
Senior GPU Fabric Networking Engineer (Remote) – Equity

NVIDIA Corporation • Santa Clara, CA, United States
Remote
Full-time
A leading AI technology firm in California is seeking a Senior Software Engineer for its GPU Fabric Networking team. The role involves designing and maintaining system software enabling GPU communic...
Last updated: 1 day ago • Promoted
Software Engineer, GPU Infrastructure

OpenAI • San Francisco, CA, United States
Full-time
This role will support the fleet infrastructure team at OpenAI. The fleet team focuses on running the world's largest, most reliable, and frictionless GPU fleet to support OpenAI's general purpose m...
Last updated: 30+ days ago • Promoted
C++ Systems Engineer — GPU Virtualization, On-Site SF

Recruiting From Scratch • San Francisco, CA, United States
Full-time
A leading talent firm is seeking a Software Engineer (C++ Systems) based in San Francisco, CA. The role involves building and optimizing a high-performance C++ GPU virtualization library and debuggi...
Last updated: 2 days ago • Promoted
AI Infrastructure Engineer, Model Serving Platform

Scale AI, Inc. • San Francisco, CA, United States
Full-time
As a Software Engineer on the ML Infrastructure team, you will design and build platforms for scalable, reliable, and efficient serving of LLMs. Our platform powers cutting-edge research and product...
Last updated: 30+ days ago • Promoted
AI Engineer & Researcher - GPU Kernel

xAI • Palo Alto, California, United States
Full-time
xAI’s mission is to create AI systems that can accurately understand the universe and aid humanity in its pursuit of knowledge. Our team is small, highly motivated, and focused on engineering excelle...
Last updated: 30+ days ago • Promoted
GPU Engineer, Platform Architecture

Apple • Cupertino, CA, United States
Full-time
Imagine what you could do here! At Apple, new ideas have a way of becoming extraordinary products, services, and customer experiences very quickly. Bring passion and dedication to your job and there...
Last updated: 2 days ago • Promoted
Software Engineer, GPU Inference

OpenAI • San Francisco, CA, United States
Full-time
The Sora team is pioneering multimodal capabilities for OpenAI's foundation models. We're a hybrid research and product team focused on integrating multimodal functionalities into our AI products, e...
Last updated: 30+ days ago • Promoted
AI Infra Engineer : Scale ML Clusters Kubernetes & Slurm

Perplexity AI Inc. • San Francisco, CA, United States
Full-time
A leading AI solutions provider in San Francisco is seeking an AI Infra Engineer to design and manage scalable Kubernetes clusters and optimize AI training infrastructure. The ideal candidate will h...
Last updated: 1 day ago • Promoted
Senior GPU Communications and Networking Engineer Equity

NVIDIA • Santa Clara, CA, United States
Full-time
A leading technology company in California is seeking a highly motivated Senior Software Engineer to join their communication libraries and network software team. This role involves designing and ma...
Last updated: 4 days ago • Promoted
GPU Kernel Compiler Engineer, AI Inference

NVIDIA • Santa Clara, CA, United States
Full-time
NVIDIA's AI and GPU software is at the forefront of computing, fueling breakthroughs across deep learning, LLMs, and intelligent applications. Our team is building solutions for rapid development and...
Last updated: 1 day ago • Promoted
AI Forward Deployment Engineer — GPU & Customer Impact

CareerArc • Santa Clara, CA, United States
Full-time
A leading technology company is seeking a Forward Deployment Software Engineer to turn cutting-edge AI technology into tangible business solutions. The ideal candidate will possess strong programmin...
Last updated: 12 hours ago • Promoted • New!
Datacenter Modeling Engineer for AI / HPC Platforms (Equity)

NVIDIA Corporation • Santa Clara, CA, United States
Full-time
A leading technology company is seeking visionary software engineers to design and develop models for the next generation of GPU-accelerated datacenters. This role demands experience in Python-based...
Last updated: 1 day ago • Promoted
Multimodal Inference Engineer — Scale GPU AI Models

OpenAI • San Francisco, CA, United States
Full-time
An innovative company is seeking a talented software engineer to join their dynamic Inference team. This role involves designing and implementing infrastructure for large-scale multimodal models, fo...
Last updated: 12 hours ago • Promoted • New!
AI Infra Engineer : Kubernetes & Slurm (Hybrid, Equity)

Pantera Capital • San Francisco, CA, United States
Full-time
A leading investment firm in San Francisco is seeking an AI Infrastructure Engineer to design and optimize AI training and inference clusters using Kubernetes and Slurm. The role requires strong pro...
Last updated: 12 hours ago • Promoted • New!
Distinguished AI Engineer (Agentic AI Platform Infrastructure)

Capital One • San Francisco, CA, United States
Full-time +1
At Capital One, we are creating responsible and reliable AI systems, changing banking for good. For years, Capital One has been an indu...
Last updated: 30+ days ago • Promoted