GPU Fleet Engineer – Hyperscale Infra, Kubernetes & AI

OpenAI • San Francisco, CA, United States

12 hours ago

Job type

Full-time

Job description

Join a forward-thinking company as an engineer in the fleet infrastructure team, where you'll design and operate systems for one of the largest GPU fleets globally. This role offers the chance to work in a hybrid setting while contributing to cutting-edge AI capabilities. Your expertise in hyperscale compute systems and programming will be crucial in shaping infrastructure that supports model deployment and training. Collaborate with diverse teams to ensure high reliability and utilization, all while being part of a mission-driven organization that values safety and human needs in AI development.

Related jobs

GPU Infra Performance Engineer - Benchmarks & Trials

Hyperbolic Labs • San Francisco, CA, United States

Full-time

A tech-driven AI company in San Francisco is seeking an individual to manage customer performance trials, running benchmarks and optimizing configurations. The ideal candidate has strong GPU cloud i...

Last updated: 3 days ago • Promoted

Senior GPU HPC Platform Reliability Engineer

OpenAI • San Francisco, CA, United States

Full-time

A leading AI research company in San Francisco is seeking a software engineer for its Fleet High Performance Computing team. In this role, you'll ensure the reliability and uptime of the compute fle...

Last updated: 12 hours ago • Promoted • New!

Software Engineer, GPU Infrastructure - HPC

OpenAI • San Francisco, CA, United States

Full-time

The Fleet team at OpenAI supports the computing environment that powers our cutting-edge research and product development. We oversee large-scale systems that span data centers, GPUs, networking, an...

Last updated: 30+ days ago • Promoted

Senior GPU Fabric Networking Engineer (Remote) – Equity

NVIDIA Corporation • Santa Clara, CA, United States

Remote

Full-time

A leading AI technology firm in California is seeking a Senior Software Engineer for its GPU Fabric Networking team. The role involves designing and maintaining system software enabling GPU communic...

Last updated: 1 day ago • Promoted

Software Engineer, GPU Infrastructure

OpenAI • San Francisco, CA, United States

Full-time

This role will support the fleet infrastructure team at OpenAI. The fleet team focuses on running the world's largest, most reliable, and frictionless GPU fleet to support OpenAI's general purpose m...

Last updated: 30+ days ago • Promoted

C++ Systems Engineer — GPU Virtualization, On-Site SF

Recruiting From Scratch • San Francisco, CA, United States

Full-time

A leading talent firm is seeking a Software Engineer (C++ Systems) based in San Francisco, CA. The role involves building and optimizing a high-performance C++ GPU virtualization library and debuggi...

Last updated: 2 days ago • Promoted

AI Infrastructure Engineer, Model Serving Platform

Scale AI, Inc. • San Francisco, CA, United States

Full-time

As a Software Engineer on the ML Infrastructure team, you will design and build platforms for scalable, reliable, and efficient serving of LLMs. Our platform powers cutting-edge research and product...

Last updated: 30+ days ago • Promoted

AI Engineer & Researcher - GPU Kernel

Xai • Palo Alto, California, United States

Full-time

xAI’s mission is to create AI systems that can accurately understand the universe and aid humanity in its pursuit of knowledge. Our team is small, highly motivated, and focused on engineering excelle...

Last updated: 30+ days ago • Promoted

GPU Engineer, Platform Architecture

Apple • Cupertino, CA, United States

Full-time

Imagine what you could do here! At Apple, new ideas have a way of becoming extraordinary products, services, and customer experiences very quickly. Bring passion and dedication to your job and there...

Last updated: 2 days ago • Promoted

Software Engineer, GPU Inference

OpenAI • San Francisco, CA, United States

Full-time

The Sora team is pioneering multimodal capabilities for OpenAI's foundation models. We're a hybrid research and product team focused on integrating multimodal functionalities into our AI products, e...

Last updated: 30+ days ago • Promoted

AI Infra Engineer : Scale ML Clusters Kubernetes & Slurm

Perplexity AI Inc. • San Francisco, CA, United States

Full-time

A leading AI solutions provider in San Francisco is seeking an AI Infra Engineer to design and manage scalable Kubernetes clusters and optimize AI training infrastructure. The ideal candidate will h...

Last updated: 1 day ago • Promoted

Senior GPU Communications and Networking Engineer Equity

NVIDIA • Santa Clara, CA, United States

Full-time

A leading technology company in California is seeking a highly motivated Senior Software Engineer to join their communication libraries and network software team. This role involves designing and ma...

Last updated: 4 days ago • Promoted

GPU Kernel Compiler Engineer, AI Inference

NVIDIA • Santa Clara, CA, United States

Full-time

NVIDIA's AI and GPU software is at the forefront of computing, fueling breakthroughs across deep learning, LLMs, and intelligent applications. Our team is building solutions for rapid development and...

Last updated: 1 day ago • Promoted

AI Forward Deployment Engineer — GPU & Customer Impact

CareerArc • Santa Clara, CA, United States

Full-time

A leading technology company is seeking a Forward Deployment Software Engineer to turn cutting-edge AI technology into tangible business solutions. The ideal candidate will possess strong programmin...

Last updated: 12 hours ago • Promoted • New!

Datacenter Modeling Engineer for AI / HPC Platforms (Equity)

NVIDIA Corporation • Santa Clara, CA, United States

Full-time

A leading technology company is seeking visionary software engineers to design and develop models for the next generation of GPU-accelerated datacenters. This role demands experience in Python-based...

Last updated: 1 day ago • Promoted

Multimodal Inference Engineer — Scale GPU AI Models

OpenAI • San Francisco, CA, United States

Full-time

An innovative company is seeking a talented software engineer to join their dynamic Inference team. This role involves designing and implementing infrastructure for large-scale multimodal models, fo...

Last updated: 12 hours ago • Promoted • New!

AI Infra Engineer : Kubernetes & Slurm (Hybrid, Equity)

Pantera Capital • San Francisco, CA, United States

Full-time

A leading investment firm in San Francisco is seeking an AI Infrastructure Engineer to design and optimize AI training and inference clusters using Kubernetes and Slurm. The role requires strong pro...

Last updated: 12 hours ago • Promoted • New!

Distinguished AI Engineer (Agentic AI Platform Infrastructure)

Capital One • San Francisco, CA, United States

Full-time +1

Distinguished AI Engineer (Agentic AI Platform Infrastructure). At Capital One, we are creating responsible and reliable AI systems, changing banking for good. For years, Capital One has been an indu...

Last updated: 30+ days ago • Promoted