AI Model Evaluator
Mercor

Job Description
About Mercor
Mercor connects elite creative and technical talent with leading AI research labs. Headquartered in San Francisco, Mercor is backed by investors including Benchmark, General Catalyst, Peter Thiel, Adam D'Angelo, Larry Summers, and Jack Dorsey.
The AI Model Evaluator Role
As an AI Model Evaluator, you will assess the quality and effectiveness of large language model (LLM) responses. This is a contract position that can be full-time or part-time and pays $45/hour. It is open only to candidates located in the US, UK, or Canada.
Role Responsibilities
- Evaluate LLM-generated responses for their ability to effectively answer user queries.
- Conduct thorough fact-checking using trusted public sources and external tools.
- Generate high-quality human evaluation data by annotating response strengths, areas for improvement, and factual inaccuracies.
- Assess reasoning quality, clarity, tone, and completeness of AI model responses.
- Ensure model responses adhere to expected conversational behavior and system guidelines.
- Apply consistent annotations by diligently following clear taxonomies, benchmarks, and detailed evaluation guidelines.
Qualifications
Must-Have
- Bachelor’s degree.
- Significant experience utilizing large language models (LLMs).
- Excellent writing skills.
- Strong attention to detail.
- Adaptable and comfortable moving across diverse topics, domains, and customer requirements.
- Background or experience in fields requiring structured analytical thinking.
- Excellent college-level mathematics skills.
Preferred
- Prior experience with Reinforcement Learning from Human Feedback (RLHF), model evaluation, or data annotation work.
- Experience writing or editing high-quality written content.
- Experience comparing multiple outputs and making fine-grained qualitative judgments.
- Familiarity with evaluation rubrics, benchmarks, or quality scoring systems.
Application Process
The application process takes approximately 20-30 minutes to complete and involves three key steps:
- Upload your resume.
- Complete an AI interview tailored to your resume.
- Submit the application form.
Our team reviews applications daily. Please ensure you complete all steps for consideration. For detailed information regarding the interview process and platform, please visit: https://talent.docs.mercor.com/welcome/welcome. For support, reach out to: support@mercor.com.
Key skills/competencies
- LLM Evaluation
- Fact-Checking
- Data Annotation
- Analytical Thinking
- Writing Skills
- Attention to Detail
- AI Models
- Quality Assurance
- Taxonomy Adherence
- RLHF
How to Get Hired at Mercor
- Research Mercor's culture: Study their mission, values, recent news, and investor backing (Benchmark, Peter Thiel) on LinkedIn and Glassdoor.
- Tailor your resume for AI evaluation: Highlight experience with LLMs, data annotation, quality assurance, and analytical thinking for the AI Model Evaluator role.
- Excel in the AI interview: Prepare for questions on LLM interaction, critical analysis, and data interpretation, demonstrating strong communication skills.
- Showcase attention to detail: Emphasize your precision in fact-checking, adherence to guidelines, and ability to make fine-grained qualitative judgments.
- Connect with Mercor professionals: Network on LinkedIn with current employees or hiring managers to gain insights and express your interest.