Hardware Development Infrastructure Engineer

OpenAI • San Francisco
Posted more than 30 days ago
Contract type
  • Full-time
Job description

About the Team

OpenAI’s Hardware organization develops silicon and system-level solutions designed for the unique demands of advanced AI workloads. The team is responsible for building the next generation of AI-native silicon while working closely with software and research partners to co-design hardware tightly integrated with AI models. In addition to delivering production-grade silicon for OpenAI’s supercomputing infrastructure, the team also creates custom design tools and methodologies that accelerate innovation and enable hardware optimized specifically for AI.

About the Role

We’re looking for a Hardware Development Infrastructure Engineer to build and run the infrastructure that powers OpenAI’s hardware development lifecycle. You’ll work closely with hardware teams to translate their workflows into scalable, observable, and automated systems, and then own the platforms that support them over time.

This role sits at the intersection of hardware, cloud, HPC, DevOps, and data. You’ll design regression systems, CI/CD pipelines, cloud and cluster platforms, and the data foundations that make development efficiency visible and measurable.

In this role, you will:

  • Partner with hardware teams on workflows and tooling: Embed with teams across DV, PD, emulation, formal, and software to understand development flows, identify failure modes, and deliver tooling (CLIs, services, APIs) that reduces manual work and accelerates iteration.

  • Build and operate regression systems at scale: Own regressions end to end, from definition and scheduling through execution, results ingestion, triage, and reporting, while improving throughput and reproducibility and reducing flakiness.

  • Own CI/CD for infrastructure and tooling: Design and operate pipelines for infrastructure-as-code, services, images, and cluster configuration changes, including testing, gated deploys, staged rollouts, and safe rollback.

  • Run cloud and HPC platforms: Design, provision, and operate cloud infrastructure (Azure preferred) and HPC/HTC clusters (e.g., Slurm), tuning scheduling policies, autoscaling, node lifecycles, and cost-performance tradeoffs.

  • Build data foundations and visibility: Develop ETL pipelines to ingest metrics, logs, and results; operate databases for workflow metadata and outcomes; and build dashboards that surface efficiency, utilization, and reliability trends.

  • Drive operational excellence: Establish monitoring and alerting, lead incident response and postmortems, maintain runbooks, and produce clear, durable documentation.

You might thrive in this role if you have:

  • Familiarity with chip development workflows and at least one deep EDA domain (e.g., DV, PD, emulation, or formal verification).

  • Strong infrastructure fundamentals, including cloud platforms, networking, security, performance, and automation.

  • Experience operating cloud environments (Azure preferred; AWS, GCP, or OCI acceptable) with strong infrastructure-as-code practices (e.g., Terraform, Bicep; configuration management tools a plus).

  • Strong programming skills (Python preferred) and solid software engineering and scripting practices.

  • Experience building and operating CI/CD systems (e.g., Jenkins, Buildkite, GitHub Actions), including testing and release workflows.

  • Database experience (e.g., Postgres or MySQL), including schema design, migrations, indexing, and operational safety.

  • Clear communication and strong judgment: the ability to explain tradeoffs, propose pragmatic solutions, and articulate a realistic vision for scalable infrastructure.

Preferred Qualifications

  • Experience operating Slurm or other large-scale cluster schedulers.

  • Experience with enterprise authentication and directory services (e.g., Entra ID, LDAP, FreeIPA, SSSD).

  • Experience building or operating backend and middleware systems such as message queues, caches, artifact stores, or internal service platforms.

  • Familiarity with high-performance storage architectures and data movement optimization.

  • Experience running and monitoring license servers for expensive or capacity-constrained toolchains.

To comply with U.S. export control laws and regulations, candidates for this role may need to meet certain legal status requirements as provided in those laws and regulations.
