Job Overview

Job Description
About the Role
This role exists to support a leading AI research organization in improving the quality, usefulness, and reliability of general-purpose conversational AI systems. The focus is on ensuring AI systems demonstrate rigorous formal reasoning, conceptual clarity, and mathematical correctness in mathematics-related use cases. You will be instrumental in evaluating and enhancing how models reason about mathematical problems, explanations, and proofs.
What You’ll Do
- Write and refine prompts to guide AI model behavior in mathematical contexts.
- Evaluate large language model (LLM) responses to mathematics-related queries for correctness, rigor, and logical coherence.
- Verify mathematical claims, derivations, and proofs using deep domain expertise.
- Conduct fact-checking using authoritative public sources and mathematical knowledge.
- Annotate model responses by identifying strengths, weaknesses, and factual or conceptual errors.
- Assess clarity, structure, and suitability of explanations for different audience levels.
- Ensure outputs align with expected conversational standards and evaluation guidelines.
- Apply consistent evaluation frameworks, benchmarks, and taxonomies across tasks.
Who You Are
- PhD in Mathematics or a closely related field.
- Demonstrated expertise in Probability & Statistics, with experience in one or more of the following areas: Algebra & Number Theory, Calculus & Analysis, Geometry & Topology, Discrete Mathematics, Logic & Computation.
- Comfortable working with large language models and understanding real-world usage patterns.
- Excellent written communication skills with the ability to explain complex concepts clearly.
- Strong attention to detail and ability to identify subtle logical or conceptual issues.
- Experience reviewing, editing, or evaluating technical or academic writing.
Nice-to-Have Specialties
- Prior experience with RLHF, model evaluation, or data annotation.
- Teaching, mentoring, or explaining mathematics to non-expert audiences.
- Familiarity with structured evaluation rubrics, benchmarks, or review frameworks.
What Success Looks Like
- You consistently identify inaccuracies or weak reasoning in mathematical AI outputs.
- Your feedback improves rigor, correctness, and clarity of model responses.
- You deliver reliable, reproducible evaluation artifacts that strengthen AI performance.
- The client’s AI systems are trusted in mathematical contexts because of your rigorous evaluations.
Why Join This Project
This opportunity allows mathematicians to apply deep theoretical expertise to the evaluation and improvement of cutting-edge AI systems. The role is fully remote and flexible, and it lets you directly influence how mathematical reasoning is represented and communicated at scale.
Contract and Payment Terms
- Engagement as an independent contractor.
- Fully remote with flexible scheduling.
- Projects may be extended, shortened, or concluded based on performance and project needs.
- Work will not involve access to confidential or proprietary information from any employer, client, or institution.
- Payments are issued weekly via Stripe or Wise based on services rendered.
- H-1B and STEM OPT candidates cannot be supported at this time.
Key Skills and Competencies
- Mathematics
- AI Evaluation
- Large Language Models (LLMs)
- Prompt Engineering
- Reasoning
- Accuracy
- Mathematical Proofs
- Data Annotation
- Technical Writing
- Problem Solving
How to Get Hired at Taskify AI
- Tailor your resume: Highlight your PhD in Mathematics, expertise in Probability & Statistics, and any experience with Algebra, Calculus, or Logic.
- Showcase LLM experience: Detail any work with large language models, prompt engineering, or data annotation in your application.
- Emphasize communication skills: Provide examples of explaining complex mathematical concepts clearly in writing.
- Prepare for technical questions: Be ready to discuss your approach to verifying mathematical proofs and identifying logical errors.
- Highlight attention to detail: Showcase your ability to spot subtle inaccuracies in technical content.