Test Engineer-AI / LLM

OPPO US Research Center • Palo Alto, CA, US

30+ days ago

Job type

Full-time

Job description

OPPO US Research Center is seeking a meticulous and innovative full-time AI / LLM Test Engineer to join our cutting-edge AI team. In this critical role, you will evaluate the performance, reliability, and safety of Large Language Models (LLMs) in real-world product scenarios and test end-to-end generative AI solutions. Your work will directly shape how users experience AI-powered features by ensuring robustness, accuracy, and alignment with product goals. This is a unique opportunity to pioneer testing methodologies for next-generation AI systems.

We are also seeking a contract-based LLM Evaluation & QA Engineer to support the testing and validation of large language model (LLM)-powered applications. You will help implement test strategies, execute evaluation workflows, and assist in model performance validation across diverse generative AI use cases.

This contract role is ideal for someone with hands-on experience in AI / ML evaluation, QA engineering, or data analysis who wants to deepen their exposure to generative AI systems.

Requirements

Full-time position requirements :

Core Testing & Evaluation

Design and execute performance tests for LLMs across diverse product use cases (e.g., chatbots, content generation).
Develop automated test frameworks to evaluate LLM outputs for accuracy, bias, safety, and coherence.
Conduct end-to-end testing of integrated generative AI solutions, including APIs, data pipelines, and user interfaces.

Optimization & Validation

Collaborate with ML engineers to validate fine-tuned models and optimize prompts for target scenarios.

Analyze model failures, edge cases, and adversarial inputs to identify risks and improvement areas.

Benchmark LLM performance against industry standards and product-specific KPIs.

Collaboration & Quality Assurance

Partner with product, engineering, and research teams to define test requirements and acceptance criteria.

Document defects, performance metrics, and test results to drive data-driven improvements.

Advocate for AI ethics and safety through rigorous testing of fairness, bias mitigation, and content moderation.

Innovation & Tooling

Build scalable tools for synthetic test data generation, prompt variation testing, and automated evaluation workflows.

Stay current with advancements in generative AI testing, including red-teaming techniques and evaluation frameworks (e.g., HELM, Dynabench).

Propose novel testing strategies for emerging challenges (e.g., hallucinations, context drift).

Basic Qualifications :

Bachelor’s degree in Computer Science, Data Science, Engineering, or a related technical field, or equivalent practical experience.

1+ years of experience in software testing, data science, or ML validation, with exposure to AI / ML systems.

Proficiency in Python and testing frameworks (e.g., PyTest, Selenium).

Hands-on experience evaluating LLMs (e.g., GPT, Claude, Llama, Gemini) in production environments.

Strong analytical skills for dissecting model behavior, statistical performance, and failure modes.

Familiarity with cloud platforms (GCP, Azure, or AWS) and MLOps tooling (e.g., MLflow, Weights & Biases).

Experience with version control (Git) and agile development methodologies.

Preferred Qualifications :

Master’s degree in AI, Machine Learning, or a related field.

Expertise in prompt engineering, LLM fine-tuning (e.g., LoRA, RLHF), or optimization techniques.

Experience with automated evaluation tools (e.g., LangChain, TruLens) or LLM-specific test suites.

Knowledge of data pipelines, SQL / NoSQL databases, and API testing (e.g., Postman).

Background in statistics, quantitative analysis, or data visualization for test insights.

Contributions to AI safety / ethics initiatives or open-source LLM evaluation projects.

Experience testing mobile-integrated AI solutions (Android / iOS).

Contractor position requirements :

Testing & Evaluation Support :

Execute pre-defined performance tests for LLMs across various tasks (e.g., summarization, Q&A, chatbot flows).

Run scripted evaluations to assess outputs for factuality, coherence, and safety.

Perform manual and automated test execution on APIs and LLM-integrated user interfaces.

Prompt & Model Validation :

Assist ML engineers in evaluating prompt variations and prompt-tuning outcomes.

Log and analyze failure cases, anomalies, and edge cases based on provided guidelines.

Collaboration & Documentation :

Work with QA leads, product managers, and ML engineers to understand test goals and criteria.

Report defects, compile evaluation summaries, and maintain testing logs.

Tooling & Automation :

Use existing internal tools or frameworks to automate test runs and result collection.

Contribute to prompt generation, input templating, or result tagging processes.

Basic Qualifications :

Bachelor's degree or equivalent work experience in a technical field (e.g., Computer Science, Engineering, Data Science).

6+ months of experience in software QA, data labeling, LLM evaluation, or ML testing projects.

Basic Python proficiency, especially for data processing and automation tasks.

Familiarity with LLMs (e.g., GPT, Claude, Gemini) and prompt-based outputs.

Comfortable working with tools like Jupyter, Postman, or testing dashboards.

Detail-oriented with good documentation habits.

Contractor Details :

Duration : Long term

Rate : Commensurate with experience

Conversion Opportunity : High-performing contractors may be considered for full-time roles

Benefits

OPPO is proud to be an equal opportunity workplace. We are committed to equal employment opportunity regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity or Veteran status. We also consider qualified applicants regardless of criminal histories, consistent with legal requirements.

The US base salary range for this full-time position is $100,000-$200,000 + bonus + long-term incentive benefits. Our salary ranges are determined by role, level, and location.
