Talent.com
Head of Evaluation and Oversight Research
No longer accepting applications
Scale AI • New York, NY, United States
30+ days ago
Job type
  • Full-time
Job description

Scale is the leading data and evaluation partner for frontier AI companies, playing an integral role in advancing the science of evaluating and characterizing large language models (LLMs). Our research focuses on tackling the hardest problems in scalable oversight and the evaluation of advanced AI capabilities. We collaborate broadly across industry and academia and regularly publish our findings.

Our Research team is shaping the next generation of evaluation science for frontier AI models and works at the leading edge of model assessment and oversight. Some of our current research includes:

  • Developing AI-assisted evaluation pipelines, where models help critique, grade, and explain outputs (RLAIF, model-judging-model).
  • Advancing scalable oversight methods, such as rubric-guided evaluations, recursive oversight, and weak-to-strong generalization.
  • Designing benchmarks for frontier capabilities (reasoning, coding, multi-modal, and agentic tasks), inspired by efforts such as MMMU, GPQA, and SWE-Bench.
  • Building evaluation frameworks for agentic systems, measuring multi-step workflows and real-world task success.

You will

  • Lead a team of research scientists and engineers on foundational work in evaluation and oversight.
  • Drive research initiatives on frameworks and benchmarks for frontier AI models, spanning reasoning, coding, multi-modal, and agentic behaviors.
  • Design and advance scalable oversight methods, leveraging model-assisted evaluation, rubric-guided judgments, and recursive oversight.
  • Collaborate with leading research labs across industry and academia.
  • Publish research at top-tier venues and contribute to open-source benchmarking initiatives.
  • Remain deeply engaged with the research community, both understanding trends and setting them.

Ideal background

  • Track record of impactful research in machine learning, especially in generative AI, evaluation, or oversight.
  • Significant experience leading ML research in academia or industry.
  • Strong written and verbal communication skills for cross-functional collaboration.
  • Experience building and mentoring teams of research scientists and engineers.
  • Publications at major ML/AI conferences (e.g., NeurIPS, ICML, ICLR, ACL, EMNLP, CVPR) and/or journals.

Compensation, location, and how to apply

    Compensation packages for eligible roles include base salary, equity, and benefits. The base salary range for this full-time position in locations such as San Francisco, New York, and Seattle is $260,000 - $350,000 USD. The range reflects the minimum and maximum targets for new hires and may vary by location, skills, experience, and other factors. Equity grants may be available subject to Board approval. Benefits include health, dental and vision coverage, retirement benefits, a learning and development stipend, PTO, and potential commuter benefits.

    Our research interviews are crafted to assess candidates' skills in practical ML prototyping and debugging, their grasp of research concepts, and their alignment with our organizational culture. We do not ask LeetCode-style questions.

    About Scale: At Scale, we believe the transition from traditional software to AI is a major shift. Our mission is to accelerate this transition across industries by powering the development and deployment of AI applications.

    EEO statement: We are an inclusive and equal opportunity workplace. We do not discriminate on the basis of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability status, gender identity, or veteran status. We provide reasonable accommodations to applicants with disabilities; to request one for the application process, contact accommodations@scale.com. Please see our privacy policy for additional information.
