Talent.com
Research Engineer - Distributed Training
No longer accepting applications
Prime Intellect • San Francisco, CA, United States
15 days ago
Job type
  • Full-time
Job description

Building Open Superintelligence Infrastructure

Prime Intellect is building the open superintelligence stack, from frontier agentic models to the infrastructure that enables anyone to create, train, and deploy them. We aggregate and orchestrate global compute into a single control plane and pair it with the full RL post-training stack: environments, secure sandboxes, verifiable evals, and our async RL trainer. We enable researchers, startups, and enterprises to run end-to-end reinforcement learning at frontier scale, adapting models to real tools, workflows, and deployment contexts.

As a Research Engineer working on Distributed Training, you'll play a crucial role in shaping our technological direction, focusing on our decentralized AI training stack. If you love scaling things and maximizing training efficiency, this role is for you.

Responsibilities

Lead and participate in novel research to build a massive-scale, highly reliable, and secure decentralized training orchestration solution.

Optimize the performance, cost, and resource utilization of AI workloads by applying the most recent advances in compute and memory optimization.

Contribute to the development of our open-source libraries and frameworks for distributed model training.

Publish research in top-tier AI conferences such as ICML & NeurIPS.

Distill highly technical project outcomes into approachable technical blog posts for our customers and developers.

Stay up to date with the latest advancements in AI/ML infrastructure and tools and in decentralized training research, and proactively identify opportunities to enhance our platform's capabilities and user experience.

Requirements

Strong background in AI/ML engineering, with extensive experience in designing and implementing end-to-end pipelines for training and deploying large-scale AI models.

Deep expertise in distributed training techniques, frameworks (e.g., PyTorch Distributed, DeepSpeed, MosaicML’s LLM Foundry), and tools (e.g., Ray) for optimizing the performance and scalability of AI workloads.

Experience in large-scale model training, including distributed training techniques such as data, tensor, and pipeline parallelism.

Solid understanding of MLOps best practices, including model versioning, experiment tracking, and continuous integration/deployment (CI/CD) pipelines.

Passion for advancing the state-of-the-art in decentralized AI model training and democratizing access to AI capabilities for researchers, developers, and businesses worldwide.

If you're not familiar with these but feel that you can contribute to our mission and you're a high-energy person, get familiar with these resources (here, here and here) and please reach out!

Benefits & Perks

Competitive compensation, including equity incentives, aligning your success with the growth and impact of Prime Intellect.

Flexible work arrangements, with the option to work remotely or in-person at our offices in San Francisco.

Visa sponsorship and relocation assistance for international candidates.

Quarterly team off-sites, hackathons, conferences and learning opportunities.

Opportunity to work with a talented, hard-working and mission-driven team, united by a shared passion for leveraging technology to accelerate science and AI.

We recently raised $15mm in funding (total of $20mm raised) led by Founders Fund, with participation from Menlo Ventures and prominent angels including Andrej Karpathy (Eureka AI, Tesla, OpenAI), Tri Dao (Chief Scientific Officer of Together AI), Dylan Patel (SemiAnalysis), Clem Delangue (Hugging Face), Emad Mostaque (Stability AI), and many others.

If you're excited about the opportunity to build the foundation for the future of decentralized AI and create a platform that empowers developers and researchers to push the boundaries of what's possible, we'd love to hear from you.
