Senior Research Engineer, Model Evaluation

Cohere, San Francisco, CA, United States

4 days ago

Job type

Full-time

Job description

Overview

Our mission is to scale intelligence to serve humanity. We’re training and deploying frontier models for developers and enterprises who are building AI systems to power magical experiences like content generation, semantic search, RAG, and agents.

Cohere is a team of researchers, engineers, designers, and more, who are passionate about their craft. We believe that a diverse range of perspectives is a requirement for building great products. We obsess over what we build and strive to increase the capabilities and value of our models for our customers.

Join us on our mission and shape the future!

Why this role?

Evaluation is critical to making progress in scaling intelligence. As models continue to become superhuman in many real-world use cases, we must continue to develop new techniques to accurately measure our models' performance on frontier capabilities. In this role, you are responsible for creating next-generation evaluation methods and scalable infrastructure to measure LLM progress.

As a Senior Research Engineer, Model Evaluation, You Will

Develop evaluation benchmarks, datasets, and environments for measuring the bleeding edge of model capabilities
Conduct research to push the state of the art in LLM evaluation methods, including training LLM judges, improving evaluation efficiency, and scalably building high-quality datasets
Build scalable tools for investigating and understanding evaluation results that are used by all members of technical staff at Cohere, as well as leadership and our CEO
Learn from and work with the best researchers and engineers in the field

You May Be a Good Fit If

You enjoy pushing the limits of what LLMs are capable of, and you have built high-quality evaluation resources to measure those capabilities (datasets, simulators, environments, etc.)

You have a track record of developing new methods and/or data to evaluate LLMs, e.g., publications at top-tier conferences or popular benchmarks

You have deep experience building with and around LLMs, and you have built tools for analyzing and understanding their performance

You have strong software engineering skills

If some of the above doesn’t line up perfectly with your experience, we still encourage you to apply. If you want to work really hard on a glorious mission with teammates who want the same thing, Cohere is the place for you.

We value and celebrate diversity and strive to create an inclusive work environment for all. We welcome applicants from all backgrounds and are committed to providing equal opportunities. Should you require any accommodations during the recruitment process, please submit an Accommodations Request Form, and we will work together to meet your needs.

Benefits

An open and inclusive culture and work environment

Work closely with a team on the cutting edge of AI research

Weekly lunch stipend, in-office lunches & snacks

Full health and dental benefits, including a separate budget to take care of your mental health

100% Parental Leave top-up for 6 months for employees based in Canada, the US, and the UK

Personal enrichment benefits towards arts and culture, fitness and well-being, quality time, and workspace improvement

Remote-flexible, with offices in Toronto, New York, San Francisco, and London, plus a co-working stipend

6 weeks of vacation

Note: This post is co-authored by both Cohere humans and Cohere technology.
