Talent.com
Software Engineer, Data Infrastructure - Research

OpenAI, San Francisco, CA, United States
12 days ago
Job type
  • Full-time
Job description

About the Team

The Workload team is responsible for designing and running OpenAI’s LLM training and inference infrastructure that powers frontier models at massive scale. Our systems unify how researchers train and serve models, abstracting away the complexity of performance, parallelism, and execution across vast GPU/accelerator fleets. By providing this foundation, the Workload team ensures that researchers can focus on advancing model capabilities while we handle the scale, efficiency, and reliability required to bring those models to life.

About the Role

We are looking for an engineer to design and implement the dataset infrastructure that powers OpenAI’s next-generation training stack. You will be responsible for building standardized dataset interfaces, scaling pipelines across thousands of GPUs, and proactively testing for performance bottlenecks. In this role, you will collaborate closely with multimodal researchers and other infrastructure groups to ensure datasets are unified, efficient, and easy to consume.

In this role, you will:

Design and maintain standardized dataset APIs, including for multimodal (MM) data that cannot fit in memory.

Build proactive testing and scale validation pipelines for dataset loading at GPU scale.

Collaborate with teammates to integrate datasets seamlessly into training and inference pipelines, ensuring smooth adoption and a great user experience.

Document and maintain dataset interfaces so they are discoverable, consistent, and easy for other teams to adopt.

Establish safeguards and validation systems to ensure datasets remain reproducible and unchanged once standardized.

Debug and resolve performance bottlenecks in distributed dataset loading (e.g., straggler systems slowing global training).

Provide visualization and inspection tools to surface errors, bugs, or bottlenecks in datasets.

You might thrive in this role if you:

Have strong engineering fundamentals with experience in distributed systems, data pipelines, or infrastructure.

Have experience building APIs, modular code, and scalable abstractions, while recognizing that abstractions ultimately serve their users and that UX is an important part of abstraction design.

Are comfortable debugging bottlenecks across large fleets of machines.

Take pride in building infrastructure that “just works,” and find joy in being the guardian of reliability and scale.

Are collaborative, humble, and excited to own a foundational (if not glamorous) part of the ML stack.

Bonus points if you:

Have background knowledge in data math, probability, or distributed data theory.

Have worked with GPU-scale distributed systems or dataset scaling for real-time data.

About OpenAI

OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. We push the boundaries of the capabilities of AI systems and seek to safely deploy them to the world through our products. AI is an extremely powerful tool that must be created with safety and human needs at its core, and to achieve our mission, we must encompass and value the many different perspectives, voices, and experiences that form the full spectrum of humanity.

We are an equal opportunity employer, and we do not discriminate on the basis of race, religion, color, national origin, sex, sexual orientation, age, veteran status, disability, genetic information, or other applicable legally protected characteristic.

For additional information, please see OpenAI’s Affirmative Action and Equal Employment Opportunity Policy Statement.

Qualified applicants with arrest or conviction records will be considered for employment in accordance with applicable law, including the San Francisco Fair Chance Ordinance, the Los Angeles County Fair Chance Ordinance for Employers, and the California Fair Chance Act. For unincorporated Los Angeles County workers: we reasonably believe that criminal history may have a direct, adverse and negative relationship with the following job duties, potentially resulting in the withdrawal of a conditional offer of employment: protect computer hardware entrusted to you from theft, loss or damage; return all computer hardware in your possession (including the data contained therein) upon termination of employment or end of assignment; and maintain the confidentiality of proprietary, confidential, and non-public information. In addition, job duties require access to secure and protected information technology systems and related data security obligations.

To notify OpenAI that you believe this job posting is non-compliant, please submit a report through this form. No response will be provided to inquiries unrelated to job posting compliance.

We are committed to providing reasonable accommodations to applicants with disabilities, and requests can be made via this link.

OpenAI Global Applicant Privacy Policy

At OpenAI, we believe artificial intelligence has the potential to help people solve immense global challenges, and we want the upside of AI to be widely shared. Join us in shaping the future of technology.
