Member of Technical Staff, Evaluation
Boson AI
Santa Clara, CA, US
Full-time
Boson AI is an early-stage startup building large language tools for everyone to use. Our founders (Alex Smola, Mu Li), and a team of Deep Learning, Optimization, NLP, AutoML and Statistics scientist...
Last updated: 30+ days ago
Member of Technical Staff, Model Evaluation
xAI
Palo Alto, CA, US
Full-time
xAI's mission is to create AI systems that can accurately understand the universe and aid humanity in its pursuit of knowledge. Our team is small, highly motivated, and focused on engineering exc...
Last updated: 30+ days ago
Product Manager, Evaluation & Data Generation
Hippocratic AI
Palo Alto, CA, United States
Full-time
Hippocratic AI is seeking a PM to lead the development of our model evaluation and data generation platform. In this role, you'll drive the creation of high-quality training and test datasets that i...
Last updated: 12 days ago
Program Director
William and Flora Hewlett Foundation
Menlo Park, CA, United States
Full-time
Career Opportunities with The William and Flora Hewlett Foundation. A great place to work. Careers at The William and Flora Hewlett Foundation: current job opportunities are posted here as they become...
Last updated: 2 days ago
Program Manager
ABM
Mountain View, California, USA
Full-time
Establish and maintain strong client relationships with assigned account(s). Identify areas of opportunity and lead the team to implement process changes in a positive and effective manner. Promote a...
Last updated: 4 days ago
Program Manager
LHH US
San Jose, CA, US
Full-time
Program Manager (Temp-to-Hire) - San Jose, CA - $30-$40 / hr. A confidential technology client in San Jose is seeking a. Lead cross-functional standups and governance routines; remove blockers. Define s...
Last updated: 7 days ago
Responsible AI ML Engineer – Safety & Evaluation
Apple Inc.
Cupertino, CA, United States
Full-time
A leading technology company in Cupertino seeks a Machine Learning Engineer focused on Responsible AI. You'll work on developing evaluations for safety and fairness in generative AI applications, co...
Last updated: 30+ days ago
Evaluation Consultant
Paradise Architectural Panels and Steel
San Jose, CA, United States
Full-time
About the job: Evaluation Consultant. Paradise Architectural PANELS & STEEL is a leading manufacturer of high-quality architectural panels and steel products. We are committed to providing our clients...
Last updated: 11 days ago
Licensed Lawyer for AI Evaluation
VirtualVocations
Sunnyvale, California, United States
Full-time
A company is looking for Lawyers to support AI research through flexible, hourly contract work. Key Responsibilities: Evaluate AI-generated content for legal accuracy and sound reasoning; Design jo...
Last updated: 16 hours ago
Senior AI Data and Evaluation Engineer
Stryker
Menlo Park, California, USA
Full-time
We are looking for an experienced and highly skilled Senior AI Data and Validation Engineer. A successful candidate will be responsible for both dry and wet lab experiments for AI functionality acqu...
Last updated: 20 days ago
Senior Planner Evaluation Engineer Hybrid
Waymo
Mountain View, CA, United States
Full-time
A leading autonomous driving technology firm is seeking experienced data-minded software engineers to join their Planner Evaluation team. You will develop signals to measure performance and driving ...
Last updated: 9 days ago
AI Evaluation Engineer (QA) Manager
PwC
San Jose, CA, United States
Full-time
At PwC, we are at the forefront of data and analytics engineering, leveraging advanced technologies to create robust data solutions. We empower businesses to transform raw data into actionable insig...
Last updated: 3 days ago
Staff Software Engineer, Autonomy Evaluation
GM
Sunnyvale, California, USA
Full-time
General Motors is a global leader in advanced driver assistance. With Super Cruise hands-free technology in more than 500,000 Super Cruise-equipped vehicles on the road and over 700 million handsfree...
Last updated: 16 hours ago
Program Analyst
US Government Jobs
Menlo Park, CA, US
Full-time
This position is within the Program Evaluation and Resource Center (PERC) and is under the direct supervision of the Director of PERC. PERC conducts nationwide evaluations of VA treatment methods, s...
Last updated: 2 days ago
Perceptual Audio Evaluation Manager
META
Sunnyvale, CA, United States
Full-time
The Wearables Audio Technology Team (WATT) is looking for an experienced leader to manage a team of researchers and engineers in Perceptual Audio Evaluation (PAE) within the Audio Experience Team (AXT)...
Last updated: 2 days ago
Lead APP Pre Anesthesia Evaluation
Stanford Health Care - ValleyCare
Palo Alto, CA, United States
Full-time
If you're ready to be part of our legacy of hope and innovation, we encourage you to take the first step and explore our current job openings. Your best is waiting to be discovered. Day - 08 Hour (Un...
Last updated: 2 days ago
Wireless Technologies Evaluation Engineer
Tata Consultancy Services
Cupertino, CA
Full-time
In this role, you will be part of the Product RF Definition team and support the evaluation and characterization of various technologies from an RF perspective. You will work independently under Product RF...
Last updated: 30+ days ago
We are seeking an LLM Evaluation Engineer to join a forward-thinking team responsible for developing a sophisticated voice assistant platform. This isn’t your typical QA role: it’s a blend of technical engineering, machine learning evaluation, and data analysis. You’ll work closely with conversational AI technology, designing evaluation frameworks, building custom scripts, and creating data visualizations to assess platform performance.
Key Responsibilities:
Design and implement evaluation strategies for voice and language models, including automated testing approaches.
Analyze unstructured data from log store systems to identify performance gaps and optimize user experiences.
Build and maintain custom Python scripts to streamline data processing and generate actionable insights.
Develop visual reports to communicate findings and drive continuous improvement.
Collaborate with cross-functional teams globally to identify and address pain points in conversational AI performance.
Use prompt engineering techniques to refine LLM outputs and report on system health.
Ideal Candidate:
3+ years of experience in machine learning evaluation, data analysis, or related technical roles.
Intermediate to advanced Python scripting, including log parsing and API testing.
Familiarity with GenAI and LLMs, including automated workflows and API integrations.
Strong analytical mindset, capable of working independently and identifying innovative solutions.
Excellent communication skills, able to present complex findings clearly to both technical and non-technical stakeholders.