AI Model Evaluation Specialist

Inizio Partners, New York, NY, United States

30+ days ago

Job type

Full-time

Job description

Key Responsibilities:

Perform scoring and qualitative evaluations of LLM-generated responses across multiple use cases.
Develop and maintain scoring guidelines and rubrics to ensure consistency and objectivity.
Collaborate with data scientists, product managers, and engineering teams to align scoring with project goals.
Assist in the creation and labeling of high-quality evaluation datasets for prompt tuning or model fine-tuning.
Use NLP-based metrics and tools (e.g., ROUGE, BLEU, cosine similarity) to support automated scoring.
Document scoring patterns, common model errors, and improvement opportunities.
Contribute to prompt experimentation and help compare the effectiveness of different prompt strategies.
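
As an illustration of the automated scoring support mentioned above, here is a minimal sketch of two simple reference-based scores: a ROUGE-1-style unigram overlap F1 and a bag-of-words cosine similarity. The function names and the tokenizer are assumptions made for this example and do not describe the team's actual tooling. Full ROUGE and BLEU implementations add steps such as stemming, higher-order n-grams, and a brevity penalty, so day-to-day work would more likely rely on an established evaluation library.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into alphanumeric tokens (deliberately simple)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def unigram_overlap_f1(reference, candidate):
    """ROUGE-1-style F1 based on clipped unigram overlap."""
    ref = Counter(tokenize(reference))
    cand = Counter(tokenize(candidate))
    # Counter intersection keeps the minimum count per token (clipping).
    overlap = sum((ref & cand).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


def cosine_similarity(reference, candidate):
    """Cosine similarity between bag-of-words count vectors."""
    ref = Counter(tokenize(reference))
    cand = Counter(tokenize(candidate))
    dot = sum(ref[token] * cand[token] for token in ref)
    norm = math.sqrt(sum(v * v for v in ref.values())) * math.sqrt(
        sum(v * v for v in cand.values())
    )
    return dot / norm if norm else 0.0


if __name__ == "__main__":
    reference = "The cat sat on the mat."
    candidate = "The cat lay on the mat."
    print(f"unigram F1: {unigram_overlap_f1(reference, candidate):.3f}")
    print(f"cosine:     {cosine_similarity(reference, candidate):.3f}")
```

Scores like these only measure surface word overlap, which is why the role pairs them with rubric-based human evaluation.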

Qualifications:

Prior experience with LLMs (e.g., GPT, Claude, LLaMA) or AI / NLP projects is highly preferred.

Strong analytical skills and attention to detail, especially in assessing language quality.

Familiarity with prompt engineering, generative AI, or conversational AI tools is a plus.

Hands-on experience with Python, Jupyter, or evaluation libraries (optional but desirable).

Experience working with evaluation frameworks or annotation tools (Label Studio, Prodigy, etc.) is a bonus.

Excellent written and verbal communication skills.
