Software Engineer, GPU Inference

OpenAI

San Francisco, CA, United States

30+ days ago

Job type

Full-time

Job description

About the Team

The Sora team is pioneering multimodal capabilities for OpenAI's foundation models. We're a hybrid research and product team focused on integrating multimodal functionalities into our AI products, ensuring they are reliable, user-friendly, and aligned with our mission of broad societal benefit.

About the Role

We're looking for a GPU Inference Engineer to improve model serving efficiency for Sora. This is a high-impact role where you'll drive initiatives to optimize inference performance and scalability. You'll also take part in model design, helping our researchers develop inference-friendly models.

This role is critical to the team's broader goals: by building a stronger technical foundation, it will directly enable leadership to focus on higher-leverage initiatives.

In this role, you will:

Carry out engineering work focused on improving model serving, inference performance, and system efficiency
Drive optimizations from a kernel and data movement perspective to improve system throughput and reliability
Partner closely with research and product teams to ensure our models perform effectively at scale
Design, build, and improve critical serving infrastructure to support Sora's growth and reliability needs

You might thrive in this role if you:

Have deep expertise in model performance optimization, particularly at the inference layer

Have a strong background in kernel-level systems, data movement, and low-level performance tuning

Are excited about scaling high-performing AI systems that serve real-world, multimodal workloads

Can navigate ambiguity, set technical direction, and drive complex initiatives to completion

This role is based in San Francisco, CA. We use a hybrid work model of 3 days in the office per week and offer relocation assistance to new employees.

About OpenAI

OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. We push the boundaries of the capabilities of AI systems and seek to safely deploy them to the world through our products. AI is an extremely powerful tool that must be created with safety and human needs at its core, and to achieve our mission, we must encompass and value the many different perspectives, voices, and experiences that form the full spectrum of humanity.

We are an equal opportunity employer, and we do not discriminate on the basis of race, religion, color, national origin, sex, sexual orientation, age, veteran status, disability, genetic information, or other applicable legally protected characteristic.

For additional information, please see OpenAI's Affirmative Action and Equal Employment Opportunity Policy Statement.

Background checks for applicants will be administered in accordance with applicable law, and qualified applicants with arrest or conviction records will be considered for employment consistent with those laws, including the San Francisco Fair Chance Ordinance, the Los Angeles County Fair Chance Ordinance for Employers, and the California Fair Chance Act, for US-based candidates. For unincorporated Los Angeles County workers: we reasonably believe that criminal history may have a direct, adverse and negative relationship with the following job duties, potentially resulting in the withdrawal of a conditional offer of employment: protect computer hardware entrusted to you from theft, loss or damage; return all computer hardware in your possession (including the data contained therein) upon termination of employment or end of assignment; and maintain the confidentiality of proprietary, confidential, and non-public information. In addition, job duties require access to secure and protected information technology systems and related data security obligations.

To notify OpenAI that you believe this job posting is non-compliant, please submit a report through this form. No response will be provided to inquiries unrelated to job posting compliance.

We are committed to providing reasonable accommodations to applicants with disabilities, and requests can be made via this link.

OpenAI Global Applicant Privacy Policy

At OpenAI, we believe artificial intelligence has the potential to help people solve immense global challenges, and we want the upside of AI to be widely shared. Join us in shaping the future of technology.
