Talent.com
Evaluation Scenario Writer – AI Agent Testing Specialist

Mindrift • Jackson, MS, United States
30+ days ago
Job type
  • Full-time
Job description

This opportunity is only for candidates currently residing in the specified country. Your location may affect eligibility and rates. Please submit your resume in English and indicate your level of English.

At Mindrift, innovation meets opportunity. We believe in using the power of collective human intelligence to ethically shape the future of AI.

What We Do

The Mindrift platform, launched and powered by Toloka, connects domain experts with cutting-edge AI projects from innovative tech clients. Our mission is to unlock the potential of GenAI by tapping into real-world expertise from across the globe.

About The Role

We’re looking for someone who can design realistic and structured evaluation scenarios for LLM-based agents. You’ll create test cases that simulate human-performed tasks and define gold-standard behavior to compare agent actions against. You’ll work to ensure each scenario is clearly defined, well-scored, and easy to execute and reuse. You’ll need a sharp analytical mindset, attention to detail, and an interest in how AI agents make decisions.

Although every project is unique, you might typically:

Design structured test scenarios based on real-world tasks

Define the golden path and acceptable agent behavior

Annotate task steps, expected outputs, and edge cases

Work with devs to test your scenarios and improve clarity

Review agent outputs and adapt tests accordingly

How To Get Started

Simply apply to this post, qualify, and get the chance to contribute to projects aligned with your skills, on your own schedule. From creating training prompts to refining model responses, you’ll help shape the future of AI while ensuring technology benefits everyone.

Requirements

Bachelor’s and/or Master’s degree in Computer Science, Software Engineering, Data Science/Data Analytics, Artificial Intelligence/Machine Learning, Computational Linguistics/Natural Language Processing (NLP), Information Systems, or a related field

Background in QA, software testing, data analysis, or NLP annotation

Good understanding of test design principles (e.g., reproducibility, coverage, edge cases)

Strong written communication skills in English

Comfortable with structured formats such as JSON or YAML for describing scenarios

Ability to define expected agent behaviors (gold paths) and scoring logic

Basic experience with Python and JS

Curious and open to working with AI-generated content, agent logs, and prompt-based behavior

Ready to learn new methods, able to switch quickly between tasks and topics, and willing to work with challenging, complex guidelines at times

Our freelance role is fully remote, so you just need a laptop, an internet connection, available time, and enthusiasm to take on a challenge

Nice to Have

Experience in writing manual or automated test cases

Familiarity with LLM capabilities and typical failure modes

Understanding of scoring metrics (precision, recall, coverage, reward functions)

Benefits

Get paid for your expertise, with rates of up to $60/hour depending on your skills, experience, and project needs

Take part in a flexible, remote, freelance project that fits around your primary professional or academic commitments

Participate in an advanced AI project and gain valuable experience to enhance your portfolio

Influence how future AI models understand and communicate in your field of expertise


