Talent.com
No longer accepting applications
Research Engineer, Training Infrastructure

Goodfire • San Francisco, CA, US
2 days ago
Job type
  • Full-time
Job description

About Goodfire

Behind our name: Like fire, AI holds the potential for both immense benefit and significant risk. Just as mastering fire transformed human history, we believe the safe and intentional development of AI will shape the future of our species. Our goal is to tame this new fire.

Goodfire is an AI interpretability research company focused on understanding and intentionally designing advanced AI systems. We believe advances in interpretability will unlock the next frontier of safe and powerful foundation models and that deep research breakthroughs are necessary to make this possible.

Everything we do is in service of that mission. We move fast, take ownership, and constantly push to improve. We believe in acting today rather than tomorrow. We care deeply about the success of the organization and put the team above ourselves.

Goodfire is a public benefit corporation headquartered in San Francisco with a team of the world's top interpretability researchers and engineers from organizations like OpenAI and DeepMind. We've raised $57M from investors like Menlo, Lightspeed and Anthropic and work with customers including Arc Institute, Mayo Clinic, and Rakuten.

The role

We're seeking a research engineer to lead the development of our model training infrastructure. You'll own the critical systems that transform pre-trained models into safe, capable, and reliable AI systems through fine-tuning, RLVR, and other post-training techniques.

Key responsibilities

  • Design and implement scalable and customizable post-training pipelines (SFT, RLVR, DPO)
  • Develop suitable evaluation frameworks
  • Optimize inference-time interventions and model serving for post-trained models
  • Collaborate with research teams to rapidly prototype and validate new techniques

What you are

Goodfire is looking for experienced individuals who embody our values and share our deep commitment to making interpretability accessible. We care deeply about building a team that shares our values:

Put mission and team first

All we do is in service of our mission. We trust each other, deeply care about the success of the organization, and choose to put our team above ourselves.

Improve constantly

We are constantly looking to improve every piece of the business. We proactively critique ourselves and others in a kind and thoughtful way that translates to practical improvements in the organization. We are pragmatic and consistently implement the obvious fixes that work.

Take ownership and initiative

There are no bystanders here. We proactively identify problems and take full responsibility over getting a strong result. We are self-driven, own our mistakes, and feel deep responsibility over what we're building.

Action today

We have a small amount of time to do something incredibly hard and meaningful. The pace and intensity of the organization are high. If we can take action today or tomorrow, we will choose to do it today.

If you share our values and have at least two years of relevant experience, we encourage you to apply and join us in shaping the future of how we design AI systems.

What we are looking for

Required experience

  • 4+ years of experience in ML engineering, with at least 2 years focused on LLMs or foundation models
  • Deep expertise in fine-tuning, RLVR, and modern post-training techniques
  • Production experience deploying and maintaining language models at scale
  • Technical proficiency in Python, PyTorch / JAX, and distributed training frameworks
  • Mission alignment with building safe and powerful AI systems

Core competencies

Post-training excellence

  • Expert understanding of supervised fine-tuning, RLVR, DPO
  • Experience with preference modeling and reward model training
  • Hands-on experience with parameter-efficient fine-tuning (e.g., LoRA, QLoRA)

Infrastructure and scale

  • Building systems that handle diverse training workflows efficiently
  • Optimizing training for large models for compute efficiency
  • Multi-node distributed training

Accelerating research

  • Rapid prototyping of novel post-training techniques
  • Building flexible infrastructure that supports multiple research directions
  • Quickly adapting to new research ideas

Preferred qualifications

  • Experience with RL and policy gradient methods
  • Published work on model alignment and / or post-training techniques
  • Familiarity with interpretability tools and research on the mechanistic understanding of model behavior

Compensation & benefits

This role offers market-competitive salary, equity, and competitive benefits. More importantly, you'll have the opportunity to work on groundbreaking technology with a world-class team on the critical path to ensuring a safe and beneficial future for humanity.

The expected salary range for this position is $180,000-$350,000 USD.
