Talent.com
Software Engineer, GPU Inference

OpenAI - San Francisco, CA, United States
30+ days ago
Job type
  • Full-time
Job description

About the Team

The Sora team is pioneering multimodal capabilities for OpenAI's foundation models. We're a hybrid research and product team focused on integrating multimodal functionalities into our AI products, ensuring they are reliable, user-friendly, and aligned with our mission of broad societal benefit.

About the Role

We're looking for a GPU Inference Engineer to improve model serving efficiency for Sora. This is a high-impact role where you'll drive initiatives to optimize inference performance and scalability. You'll also take part in model design, helping our researchers develop inference-friendly models.

This role is critical to scaling the team's broader goals - it will directly enable leadership to focus on higher-leverage initiatives by building a stronger technical foundation.

In this role you will:

  • Carry out engineering work to improve model serving, inference performance, and system efficiency
  • Drive optimizations from a kernel and data movement perspective to improve system throughput and reliability
  • Partner closely with research and product teams to ensure our models perform effectively at scale
  • Design, build, and improve critical serving infrastructure to support Sora's growth and reliability needs

You might thrive in this role if you:

  • Have deep expertise in model performance optimization, particularly at the inference layer
  • Have a strong background in kernel-level systems, data movement, and low-level performance tuning
  • Are excited about scaling high-performing AI systems that serve real-world, multimodal workloads
  • Can navigate ambiguity, set technical direction, and drive complex initiatives to completion

This role is based in San Francisco, CA. We use a hybrid work model of 3 days in the office per week and offer relocation assistance to new employees.

    About OpenAI

    OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. We push the boundaries of the capabilities of AI systems and seek to safely deploy them to the world through our products. AI is an extremely powerful tool that must be created with safety and human needs at its core, and to achieve our mission, we must encompass and value the many different perspectives, voices, and experiences that form the full spectrum of humanity.

    We are an equal opportunity employer, and we do not discriminate on the basis of race, religion, color, national origin, sex, sexual orientation, age, veteran status, disability, genetic information, or other applicable legally protected characteristic.

    For additional information, please see OpenAI's Affirmative Action and Equal Employment Opportunity Policy Statement.

    Background checks for applicants will be administered in accordance with applicable law, and qualified applicants with arrest or conviction records will be considered for employment consistent with those laws, including the San Francisco Fair Chance Ordinance, the Los Angeles County Fair Chance Ordinance for Employers, and the California Fair Chance Act, for US-based candidates. For unincorporated Los Angeles County workers: we reasonably believe that criminal history may have a direct, adverse and negative relationship with the following job duties, potentially resulting in the withdrawal of a conditional offer of employment: protect computer hardware entrusted to you from theft, loss or damage; return all computer hardware in your possession (including the data contained therein) upon termination of employment or end of assignment; and maintain the confidentiality of proprietary, confidential, and non-public information. In addition, job duties require access to secure and protected information technology systems and related data security obligations.

    To notify OpenAI that you believe this job posting is non-compliant, please submit a report through this form. No response will be provided to inquiries unrelated to job posting compliance.

    We are committed to providing reasonable accommodations to applicants with disabilities, and requests can be made via this link.

    OpenAI Global Applicant Privacy Policy

    At OpenAI, we believe artificial intelligence has the potential to help people solve immense global challenges, and we want the upside of AI to be widely shared. Join us in shaping the future of technology.

