Talent.com
Senior Research Engineer, Model Evaluation

Cohere
San Francisco, CA, United States
4 days ago
Job type
  • Full-time
Job description

Overview

Our mission is to scale intelligence to serve humanity. We’re training and deploying frontier models for developers and enterprises who are building AI systems to power magical experiences like content generation, semantic search, RAG, and agents.

Cohere is a team of researchers, engineers, designers, and more, who are passionate about their craft. We believe that a diverse range of perspectives is a requirement for building great products. We obsess over what we build and strive to increase the capabilities and value of our models for our customers.

Join us on our mission and shape the future!

Why this role?

Evaluation is critical to making progress in scaling intelligence. As models continue to become superhuman in many real-world use cases, we must continue to develop new techniques to accurately measure our models’ performance on frontier capabilities. In this role, you are responsible for creating next-generation evaluation methods and scalable infrastructure to measure LLM progress.

As a Senior Research Engineer, Model Evaluation, You Will

  • Develop evaluation benchmarks, datasets, and environments for measuring the bleeding edge of model capabilities
  • Conduct research to push the state of the art in LLM evaluation methods, including training LLM judges, improving evaluation efficiency, and building high-quality datasets at scale
  • Build scalable tools for investigating and understanding evaluation results that are used by all members of technical staff at Cohere, as well as leadership and our CEO
  • Learn from and work with the best researchers and engineers in the field

You May Be a Good Fit If

  • You enjoy pushing the limits of what LLMs are capable of, and you have built high-quality evaluation resources to measure those capabilities (datasets, simulators, environments, etc.)
  • You have a track record of developing new methods and/or data to evaluate LLMs, e.g., publications at top-tier conferences, popular benchmarks, etc.
  • You have deep experience building with and around LLMs, and you have built tools for analyzing and understanding their performance
  • You have strong software engineering skills

If some of the above doesn’t line up perfectly with your experience, we still encourage you to apply. If you want to work really hard on a glorious mission with teammates that want the same thing, Cohere is the place for you.

We value and celebrate diversity and strive to create an inclusive work environment for all. We welcome applicants from all backgrounds and are committed to providing equal opportunities. Should you require any accommodations during the recruitment process, please submit an Accommodations Request Form, and we will work together to meet your needs.

Benefits

  • An open and inclusive culture and work environment
  • Work closely with a team on the cutting edge of AI research
  • Weekly lunch stipend, in-office lunches & snacks
  • Full health and dental benefits, including a separate budget to take care of your mental health
  • 100% Parental Leave top-up for 6 months for employees based in Canada, the US, and the UK
  • Personal enrichment benefits towards arts and culture, fitness and well-being, quality time, and workspace improvement
  • Remote-flexible, with offices in Toronto, New York, San Francisco, and London, and a co-working stipend
  • 6 weeks of vacation

Note: This post is co-authored by both Cohere humans and Cohere technology.
