AI Applied Scientist Intern, Evaluation Systems and Metrics
Zillow
Job Overview
Job Description
About The Team
Are you passionate about building rigorous evaluation frameworks that advance AI systems? The Zillow AI Applied Science team develops next-generation evaluation methodologies for generative AI, computer vision, and agentic systems. We work at the intersection of research and production, designing evaluation frameworks that assess current AI capabilities and adapt as technology advances.
About The Role
We are seeking remote PhD interns for Summer 2026!
As an AI Applied Scientist Intern, Evaluation Systems and Metrics, you will help develop cutting-edge evaluation methodologies for AI systems. Your research will focus on creating robust, scalable metrics and frameworks to assess the quality, consistency, and performance of generative models across multiple modalities. You may contribute in one or more of the following areas:
- Novel Evaluation Metrics: Develop innovative assessment methodologies for emerging AI capabilities, focusing on consistency and quality across complex multi-modal outputs
- Self-Improving Assessment: Design evaluation systems that learn and adapt from feedback, automatically discovering new evaluation criteria and improving assessment quality over time
- Privacy-Preserving Evaluation: Design frameworks that incorporate domain-specific implementations of differential privacy to protect sensitive user information while maintaining utility for model training and assessment
- Ethical Fair Housing Evaluation: Develop scalable methodologies for assessing agentic systems, ensuring compliance with fair housing standards and promoting ethical, responsible AI deployment
This role has been categorized as a Remote position. “Remote” employees do not have a permanent corporate office workplace and, instead, work from a physical location of their choice, which must be identified to the Company. U.S. employees may live in any of the 50 United States, with limited exceptions.
Who you are
- Currently enrolled as a PhD student in computer science, machine learning, computer vision, or a related field, with a strong publication record
- Candidates should have a background in one or more of the following areas:
  - Evaluation methodologies for AI/ML systems
  - Computer vision metrics and 3D consistency assessment
  - Generative model evaluation (text, image, video, 3D)
  - Multi-modal assessment and automated feedback systems
  - Data privacy methods (e.g., differential privacy, federated learning, secure ML) and their application
  - Single-agent or multi-agent system evaluation
- Familiarity with modern deep learning frameworks (e.g., PyTorch, Hugging Face Transformers)
- Strong research mindset, with motivation to publish
- Interest in applying AI to complex, multi-stakeholder domains
- A record of publication in conferences, workshops, or journals is a plus
Here at Zillow, we value the experience and perspective of candidates with non-traditional backgrounds. We encourage you to apply if you have transferable skills or related experience.
Key skills/competencies
- AI Evaluation
- Machine Learning
- Generative Models
- Computer Vision
- Multi-modal Assessment
- Differential Privacy
- Agentic Systems
- Ethical AI
- PyTorch
- Hugging Face
How to Get Hired at Zillow
- Research Zillow's culture: Study their mission, values, recent news, and employee testimonials on LinkedIn and Glassdoor.
- Tailor your resume: Highlight your PhD research, AI evaluation expertise, machine learning projects, and any relevant publications.
- Showcase relevant projects: Detail academic or personal projects in generative AI, computer vision, or agentic systems evaluation.
- Prepare for technical deep dives: Brush up on AI/ML evaluation methodologies, deep learning frameworks like PyTorch, and data privacy methods.
- Emphasize ethical AI understanding: Discuss your perspectives on fairness, consistency, and privacy in AI deployment and assessment.