No longer accepting applications

Research Engineer, Training Infrastructure

Goodfire • San Francisco, CA, United States

1 day ago

Job type

Full-time

Job description

About Goodfire

Behind our name: Like fire, AI holds the potential for both immense benefit and significant risk. Just as mastering fire transformed human history, we believe the safe and intentional development of AI will shape the future of our species. Our goal is to tame this new fire.

Goodfire is an AI interpretability research company focused on understanding and intentionally designing advanced AI systems. We believe advances in interpretability will unlock the next frontier of safe and powerful foundation models and that deep research breakthroughs are necessary to make this possible.

Everything we do is in service of that mission. We move fast, take ownership, and constantly push to improve. We believe in acting today rather than tomorrow. We care deeply about the success of the organization and put the team above ourselves.

Goodfire is a public benefit corporation headquartered in San Francisco with a team of the world's top interpretability researchers and engineers from organizations like OpenAI and DeepMind. We've raised $57M from investors like Menlo, Lightspeed and Anthropic and work with customers including Arc Institute, Mayo Clinic, and Rakuten.

The role

We're seeking a research engineer to lead the development of our model training infrastructure. You'll own the critical systems that transform pre-trained models into safe, capable, and reliable AI systems through fine-tuning, RLVR, and other post-training techniques.

Key responsibilities

Design and implement scalable and customizable post-training pipelines (SFT, RLVR, DPO)
Develop suitable evaluation frameworks
Optimize inference-time interventions and model serving for post-trained models
Collaborate with research teams to rapidly prototype and validate new techniques

What you are

Goodfire is looking for experienced individuals who embody our values and share our deep commitment to making interpretability accessible. We care deeply about building a team that shares our values:

Put mission and team first

All we do is in service of our mission. We trust each other, deeply care about the success of the organization, and choose to put our team above ourselves.

Improve constantly

We are constantly looking to improve every piece of the business. We proactively critique ourselves and others in a kind and thoughtful way that translates to practical improvements in the organization. We are pragmatic and consistently implement the obvious fixes that work.

Take ownership and initiative

There are no bystanders here. We proactively identify problems and take full responsibility for getting a strong result. We are self-driven, own our mistakes, and feel deep responsibility for what we're building.

Action today

We have a small amount of time to do something incredibly hard and meaningful. The pace and intensity of the organization are high. If we can take action today or tomorrow, we will choose to do it today.

If you share our values and have at least two years of relevant experience, we encourage you to apply and join us in shaping the future of how we design AI systems.

What we are looking for

Required experience

4+ years of experience in ML engineering, with at least 2 years focused on LLMs or foundation models

Deep expertise in fine-tuning, RLVR, and modern post-training techniques

Production experience deploying and maintaining language models at scale

Technical proficiency in Python, PyTorch/JAX, and distributed training frameworks

Mission alignment with building safe and powerful AI systems

Core competencies

Post-training excellence

Expert understanding of supervised fine-tuning, RLVR, DPO

Experience with preference modeling and reward model training

Hands-on experience with parameter-efficient fine-tuning (e.g., LoRA, QLoRA)

Infrastructure and scale

Building systems that handle diverse training workflows efficiently

Optimizing large-model training for compute efficiency

Multi-node distributed training

Accelerating research

Rapid prototyping of novel post-training techniques

Building flexible infrastructure that supports multiple research directions

Quickly adapting to new research ideas

Preferred qualifications

Experience with RL and policy gradient methods

Published work on model alignment and/or post-training techniques

Familiarity with interpretability tools and research on the mechanistic understanding of model behavior

Compensation & benefits

This role offers a market-competitive salary, equity, and competitive benefits. More importantly, you'll have the opportunity to work on groundbreaking technology with a world-class team on the critical path to ensuring a safe and beneficial future for humanity.

The expected salary range for this position is $180,000-$350,000 USD.
