2 months ago

Human Evaluation Program Manager

Netflix

Hybrid

Full Time

$285,000

Job Overview

Job Title: Human Evaluation Program Manager

Job Type: Full Time

Offered Salary: $285,000

Location: Hybrid

Job Description

At Netflix, our mission is to entertain the world. Together, we are writing the next episode: pushing the boundaries of storytelling and global fandom, and making the unimaginable a reality. We are a dream team obsessed with the uncomfortable excitement of discovering what happens when you merge creativity, intuition, and cutting-edge technology. Come be a part of what’s next.

Now, as Netflix explores a broader world of entertainment, expanding into Games, Ads, and AI, we are looking for an experienced Human Evaluation Program Manager to help drive the growth of a new business unit focused on how we evaluate and train ML and Generative AI models for use in our in-product experience.

About The Role

Netflix is building toward more intelligent and responsive systems, and thoughtful, high-quality evaluation is essential to making sure we’re moving in the right direction. Join a team that is creating the frameworks, tools, and workflows that ensure human judgment is applied with consistency, clarity, and care, whether we’re evaluating helpfulness, tone, safety, relevance, or creative quality.

You’ll not only shape how human and AI-driven evaluations are designed—but also own the day-to-day execution of these efforts. From scoping and planning, to rater onboarding and calibration, you’ll be accountable for driving delivery from start to finish. Just as critically, you’ll act as a thought partner and influencer—bringing stakeholders along as you introduce new ways of working, build alignment across teams, and establish a shared language around quality. Your work will help ensure that AI features at Netflix are not only high-performing, but also aligned with our values, our users, and the creative integrity that defines our brand.

You’ll work in a small team to ensure that evaluation designs are not only rigorous and aligned—but also effectively resourced, scoped, and executed at scale.

The Opportunity

This is a rare opportunity to get in on the ground floor of a function that will shape how we measure and guide the performance of AI systems at Netflix. In this role, you’ll partner across research, product, UX, and engineering to develop frameworks, rubrics, and workflows that enable rigorous, scalable human evaluation. But beyond shaping the “what” and “how,” you’ll also lead the “when” and “done.” You’ll be responsible for keeping evaluation projects on track—ensuring consistent execution, timely delivery, and high rater alignment. If you're excited to bring structure to ambiguity and influence how Netflix develops responsible AI—while being accountable for tangible delivery—this is your chance to create meaningful impact from day one.

The Ideal Candidate

The ideal candidate brings a rare combination of structure and flexibility. You know how to create evaluation frameworks that are rigorous and scalable—and you’re also a driver who gets them out the door. You’re skilled at translating vision into workflows, defining milestones, and delivering consistent results in a dynamic environment. You can steer teams across functions, keep timelines on track, and ensure rater quality without micromanaging. You thrive in spaces where there’s no roadmap, and you take pride in making things real, not just possible.

Responsibilities

Lead end-to-end execution of human evaluation and data operations initiatives—from intake and scoping to delivery
Develop and operationalize frameworks for evaluating GenAI and ML outputs
Collaborate across research, product, UX, and engineering to embed evaluation into model development cycles
Build and maintain project timelines, proactively manage blockers, and ensure timely execution
Develop clear, scalable guidelines and scoring rubrics to ensure consistent rater judgment
Oversee rater onboarding, calibration, and QA workflows
Define and monitor success metrics such as speed to inter-rater reliability (IRR), throughput, and task effectiveness
Pilot and refine evaluation tasks to improve clarity, inter-rater reliability, and feedback quality
Build foundational documentation and drive adoption of best practices across teams
Track evaluation health and communicate progress to stakeholders clearly and proactively
Anticipate and resolve bottlenecks and blockers
Act as the connective tissue across multiple partners to ensure alignment and effective execution of evaluations at scale

Qualifications

4+ years of experience working in human evaluations, data collection, labeling, or annotation operations in GenAI/ML environments
Track record of implementing process improvements or quality control systems for data collection needs
Prior experience managing human annotation vendors, raters, or data labeling teams
Strong understanding of evaluation design, including guidelines, rubrics, and scoring protocols
Proven ability to manage complex, cross-functional programs end to end, with strong program management skills and clear accountability for successful delivery
Experience with human labeling platforms
Excellent written and verbal communication skills
Ability to synthesize feedback into clear recommendations and process improvements
Familiarity with responsible AI principles and how to embed them into evaluation design
Strong organizational skills and executional focus; ability to track details while seeing the bigger picture

Key skills/competencies

Human Evaluation
Program Management
ML Models
Generative AI
Data Operations
Evaluation Frameworks
Rater Management
Cross-functional Collaboration
Quality Control
Responsible AI

Tags: Human Evaluation Program Manager, human evaluation, program management, ML evaluation, Generative AI, data operations, evaluation frameworks, cross-functional collaboration, quality control, stakeholder management, process improvement, Machine Learning, AI models, data labeling platforms, evaluation design, workflow automation, data annotation, model development, Responsible AI, quality systems

How to Get Hired at Netflix

Research Netflix's culture: Study their mission, values, recent news, and employee testimonials on LinkedIn and Glassdoor.
Customize your resume: Tailor your Human Evaluation Program Manager resume to highlight experience in GenAI/ML evaluation and cross-functional program management, using keywords from the job description.
Showcase relevant experience: Prepare to discuss specific projects where you led data operations, developed evaluation frameworks, and managed rater teams in AI/ML environments.
Demonstrate program management skills: Emphasize your ability to drive end-to-end execution, manage timelines, and ensure stakeholder alignment for complex initiatives.
Prepare for behavioral questions: Be ready to articulate how you handle ambiguity, foster collaboration, and embed responsible AI principles in your work, aligning with Netflix's unique environment.

This job post expired on March 17, 2026
