No longer accepting applications

Staff AI Research Scientist - Evaluation, Handshake AI

Handshake • San Francisco, CA, US

2 days ago

Job type

Full-time

Job description

About Handshake AI

Handshake is building the career network for the AI economy. Our three-sided marketplace connects 18 million students and alumni, 1,500+ academic institutions across the U.S. and Europe, and 1 million employers to power how the next generation explores careers, builds skills, and gets hired.

Handshake AI is a human data labeling business that leverages the scale of the largest early career network. We work directly with the world's leading AI research labs to build a new generation of human data products. From PhDs in physics to undergrads fluent in LLMs, Handshake AI is the trusted partner for domain-specific data and evaluation at scale.

This is a unique opportunity to join a fast-growing team shaping the future of AI through better data, better tools, and better systems—for experts, by experts.

Now's a great time to join Handshake. Here's why:

Leading the AI Career Revolution: Be part of the team redefining work in the AI economy for millions worldwide.

Proven Market Demand: Deep employer partnerships across Fortune 500s and the world's leading AI research labs.

World-Class Team: Leadership from Scale AI, Meta, xAI, Notion, Coinbase, and Palantir, just to name a few.

Capitalized & Scaling: $3.5B valuation from top investors including Kleiner Perkins, True Ventures, Notable Capital, and more.

About the Role

As a Staff Research Scientist, you will drive frontier research on how we define the intelligence of frontier models: developing benchmarks and measurements that help the research community understand how large language models (LLMs) understand, reason, and interact with human knowledge. You will:

Lead teams of researchers to produce original research in LLM evaluation methodologies, interpretability, and human-AI knowledge alignment.

Develop novel frameworks and assessment techniques that reveal deep insights into model capabilities, limitations, and emergent behaviors.

Collaborate with engineers to translate research breakthroughs into scalable benchmarks, evaluation systems, and standards.

Pioneer new approaches to measuring reasoning, alignment, and trustworthiness in frontier AI systems.

Author high-quality code to enable large-scale experimentation, reproducible evaluation, and knowledge assessment workflows.

Publish in top-tier conferences and journals, establishing new directions in the science of AI evaluation.

Work cross-functionally with leadership, engineers, and external partners to set industry standards for responsible AI evaluation and alignment.

Desired Capabilities

PhD or equivalent research experience in machine learning, computer science, cognitive science, or related fields with focus on AI evaluation, interpretability, or model understanding.

6+ years of postdoctoral academic or industry experience in a research-first environment.

Strong background in LLM research, evaluation methodologies, and/or foundational AI assessment techniques.

Proven ability to independently design, lead, and execute evaluation research programs with novel data types end-to-end.

Deep proficiency in Python and PyTorch for large-scale model analysis, benchmarking, and evaluation.

Experience building or leading novel benchmark development, systematic model assessment, or interpretability studies.

Strong publication record in post-training, evaluation, or interpretability that demonstrates field-defining contributions.

Ability to clearly communicate complex insights and influence both technical and non-technical stakeholders.

Extra Credit

Experience with RLHF, agent modeling, or AI alignment research.

Familiarity with data-centric AI approaches, synthetic data generation, or human-in-the-loop systems.

Understanding of challenges in scaling foundation models (training stability, safety, inference efficiency).

Contributions to open-source libraries or research tooling.

Interest in the societal impact, deployment ethics, and governance of frontier AI systems.

Perks

Handshake delivers benefits that help you feel supported—and thrive at work and in life.

The benefits below are for full-time US employees.

Ownership: Equity in a fast-growing company

Financial Wellness: 401(k) match, competitive compensation, financial coaching

Family Support: Paid parental leave, fertility benefits, parental coaching

Wellbeing: Medical, dental, and vision, mental health support, $500 wellness stipend

Growth: $2,000 learning stipend, ongoing development

Remote & Office: Stipends for home office setup, internet, commuting, and free lunch/gym in our SF office

Time Off: Flexible PTO, 15 holidays + 2 flex days, winter #ShakeBreak where our whole office closes for a week!

Connection: Team outings & referral bonuses

joinhandshake.com/careers
