Product Manager - Agent Evaluation Platform (Ona)

Hyperskill, Winston-Salem, NC, US

2 days ago

Job type

Full-time

Job description

Product Manager For Agent Evaluation Platform

Everyone's building AI agents now, but here's the problem: nobody really knows if their agents are actually working well. Sure, you can see that your agent completed a task, but did it solve the user's actual problem? Did it deliver real business value, or just go through the motions? Right now, most people test their agents manually, which doesn't scale and isn't reliable.

The Agent Evaluation Platform (name to be defined) will automatically evaluate agent performance: not just "did it finish the task" but "did it achieve the outcome the user actually wanted." Think of it like Langfuse, but instead of testing individual prompts, we're evaluating entire agent workflows, complex chains of actions, and multi-agent systems.

This is especially important as companies start paying for agents based on outcomes rather than just usage. You need to know your agent is actually delivering value, not just burning through API calls.

What You'll Do

As a Product Manager, you'll figure out how to turn agent evaluation research into a real product. That means understanding how companies currently test their AI systems, what they're missing, and how our platform can fill that gap. You'll work with our engineering team to build evaluation pipelines that can assess everything from simple chatbots to complex multi-step agent workflows. You'll also need to find early customers: companies building AI agents who are frustrated with current testing methods.

Build Product Strategy from Scratch: Define how we position and sell this technology, whether as an embeddable SaaS for AI platforms, a standalone evaluation service, a developer tool integration, or something else.
Execute Go-to-Market: Find first customers among companies building AI agents, build sales processes, and determine pricing models for an emerging evaluation category.
Own Technical Product Vision: Work with engineering on evaluation pipelines, understand testing approaches, and translate assessment methodologies into clear business value.

Who You Are

You have AI product experience: you've built AI features and faced challenges evaluating their real-world quality and business impact.

You understand evaluation methodologies: you've created evals, built testing datasets, and can tell different evaluation approaches apart.

You've worked with AI agents: you haven't just used them, you understand their workflows, failure modes, and performance measurement challenges.

You think in outcomes: you distinguish between "agent completed the task" and "agent solved the user's problem effectively."

You thrive in ambiguity: you'll need to build a lot, figure things out on the go, experiment constantly, and handle tasks across several areas at once.

What We Offer

Contractor agreement with a US-registered legal entity.

100% remote work from anywhere in the world.

Competitive salary in USD plus equity in the product you're working on: we focus on market rates and are ready to hear your expectations and prepare an offer matching your expertise.

Resource budget for tools, learning, and whatever else you need to succeed.

Fast-moving environment: we ship fast, learn fast, and iterate based on real customer feedback.

Upload your resume and tell us a few words about yourself. We'd love to hear from you!
