Software Engineer - GenAI Evaluations, AiDP

Apple

San Francisco, CA, United States

1 day ago

Job type

Full-time

Job description

Weekly Hours: 40

Role Number: 200616065-3401

Summary

We are seeking a driven and analytical Software Engineer to join Apple’s Generative AI Evaluations team. In this role, you will help define how we measure, monitor, and improve the performance of AI systems that power next-generation user experiences. You will design robust evaluation frameworks, translate cutting-edge research into practical tooling, and collaborate closely with cross-functional teams to ensure our GenAI solutions are trustworthy, efficient, and high-quality. This is a unique opportunity to influence both the inner workings of Apple’s AI platforms and the broader standard for evaluating generative applications at scale.

Description

As a Software Development Engineer on the Evaluations Team, you’ll join a phenomenal team of hardworking engineers and will be entrusted with a range of responsibilities. Your tasks will include:

Designing and developing platform features that help solution developers experiment with and identify optimal configurations for delivering high-quality GenAI applications.

Evaluating and analyzing the performance of GenAI applications, and actively collaborating with the team to drive performance improvements.

Translating the latest research into reliable and scalable evaluations that can deliver high-quality experiences for our users.

Actively engaging in all aspects of feature development, from ideation and experimentation to deployment and maintenance.

Communicating complex technical topics effectively to a diverse audience.

Minimum Qualifications

Bachelor’s degree in Computer Science, Artificial Intelligence, Machine Learning, or a related field, or equivalent experience

2+ years of software engineering experience

Programming skills in Python

Experience developing scalable and robust services with FastAPI or similar frameworks

Experience in Machine Learning, with a particular emphasis on Large Language Models (LLMs), Retrieval-Augmented Generation (RAG), or GenAI Agents

Experience with evaluating and optimizing Generative AI platforms or applications

Preferred Qualifications

Experience with GenAI RAG and Agent evaluation frameworks such as RAGAS, DeepEval, OpenEvals, AgentEvals, or OpenAI Evals

Familiarity with LLM observability techniques and best practices

Proven ability to comprehend, interpret, and translate cutting-edge research into tangible applications

Proven problem-solving and leadership abilities, with the capacity to steer the team's research and build practical applications in a collaborative and fast-paced environment

Customer-focused, with strong business acumen, the ability to translate business needs into impactful technical solutions, and a proven history of shipping products that drive significant outcomes

Experience with cloud platforms like AWS, GCP, or Azure

Knowledge of containerization and orchestration tools like Docker and Kubernetes

Creative, collaborative, and project-focused, with the ability to work hands-on in multi-functional teams

Excellent communication skills and the ability to engage effectively with all stakeholders, including senior leadership

Master’s degree in Computer Science, Artificial Intelligence, Machine Learning, or a related field

Apple is an equal opportunity employer that is committed to inclusion and diversity. We seek to promote equal opportunity for all applicants without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, Veteran status, or other legally protected characteristics. Learn more about your EEO rights as an applicant.
