Hardware Development Infrastructure Engineer

OpenAI • San Francisco
30+ days ago
Job type
  • Full-time
Job description

About the Team

OpenAI’s Hardware organization develops silicon and system-level solutions designed for the unique demands of advanced AI workloads. The team is responsible for building the next generation of AI-native silicon while working closely with software and research partners to co-design hardware tightly integrated with AI models. In addition to delivering production-grade silicon for OpenAI’s supercomputing infrastructure, the team also creates custom design tools and methodologies that accelerate innovation and enable hardware optimized specifically for AI.

About the Role

We’re looking for a Hardware Development Infrastructure Engineer to build and run the infrastructure that powers OpenAI’s hardware development lifecycle. You’ll work closely with hardware teams to translate their workflows into scalable, observable, and automated systems, and then own the platforms that support them over time.

This role sits at the intersection of hardware, cloud, HPC, DevOps, and data. You’ll design regression systems, CI/CD pipelines, cloud and cluster platforms, and the data foundations that make development efficiency visible and measurable.

In this role, you will:

  • Partner with hardware teams on workflows and tooling: Embed with teams across DV, PD, emulation, formal, and software to understand development flows, identify failure modes, and deliver tooling (CLIs, services, APIs) that reduces manual work and accelerates iteration.

  • Build and operate regression systems at scale: Own regressions end-to-end—from definition and scheduling to execution, results ingestion, triage, and reporting—while improving throughput, reproducibility, and flake reduction.

  • Own CI/CD for infrastructure and tooling: Design and operate pipelines for infrastructure-as-code, services, images, and cluster configuration changes, including testing, gated deploys, staged rollouts, and safe rollback.

  • Run cloud and HPC platforms: Design, provision, and operate cloud infrastructure (Azure preferred) and HPC/HTC clusters (e.g., Slurm), tuning scheduling policies, autoscaling, node lifecycles, and cost-performance tradeoffs.

  • Build data foundations and visibility: Develop ETL pipelines to ingest metrics, logs, and results; operate databases for workflow metadata and outcomes; and build dashboards that surface efficiency, utilization, and reliability trends.

  • Drive operational excellence: Establish monitoring and alerting, lead incident response and postmortems, maintain runbooks, and produce clear, durable documentation.

You might thrive in this role if you have:

  • Familiarity with chip development workflows and at least one deep EDA domain (e.g., DV, PD, emulation, or formal verification).

  • Strong infrastructure fundamentals, including cloud platforms, networking, security, performance, and automation.

  • Experience operating cloud environments (Azure preferred; AWS, GCP, or OCI acceptable) with strong infrastructure-as-code practices (e.g., Terraform, Bicep; configuration management tools a plus).

  • Strong programming skills (Python preferred) and solid software engineering and scripting practices.

  • Experience building and operating CI/CD systems (e.g., Jenkins, Buildkite, GitHub Actions), including testing and release workflows.

  • Database experience (e.g., Postgres or MySQL), including schema design, migrations, indexing, and operational safety.

  • Clear communication and strong judgment: the ability to explain tradeoffs, propose pragmatic solutions, and articulate a realistic vision for scalable infrastructure.

Preferred Qualifications

  • Experience operating Slurm or other large-scale cluster schedulers.

  • Experience with enterprise authentication and directory services (e.g., Entra ID, LDAP, FreeIPA, SSSD).

  • Experience building or operating backend and middleware systems such as message queues, caches, artifact stores, or internal service platforms.

  • Familiarity with high-performance storage architectures and data movement optimization.

  • Experience running and monitoring license servers for expensive or capacity-constrained toolchains.

To comply with U.S. export control laws and regulations, candidates for this role may need to meet certain legal status requirements as provided in those laws and regulations.
