AI Agent Testing Specialist
Braintrust
Job Overview
Job Description
About the Role
At Mindrift, innovation meets opportunity. As an AI Agent Testing Specialist, you will design realistic, structured evaluation scenarios for LLM-based agents. You will create test cases that simulate tasks people perform, define gold-standard behavior, annotate task steps and expected outputs, and work closely with developers to make the scenarios clearer.
Responsibilities
- Design structured test scenarios from real-world tasks.
- Define the golden path and acceptable agent behavior.
- Annotate task steps, expected outputs, and edge cases.
- Collaborate with developers to test and refine scenarios.
- Review agent outputs and adapt tests accordingly.
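To make the responsibilities above concrete, here is a minimal sketch of what a structured test scenario with a golden path might look like. The field names (`golden_path`, `acceptable_extra_steps`, `expected_output`), the step names, and the helper `follows_golden_path` are all hypothetical and chosen for illustration; the posting does not specify the actual format used on the project.

```python
import json

# Hypothetical scenario format; every field name here is an illustrative assumption.
scenario = {
    "id": "book-meeting-001",
    "task": "Schedule a 30-minute meeting with Alex next Tuesday.",
    "golden_path": ["open_calendar", "find_free_slot", "create_event", "send_invite"],
    "acceptable_extra_steps": ["check_timezone"],
    "expected_output": {"event_created": True, "duration_minutes": 30},
}


def follows_golden_path(scenario, agent_steps):
    """Return True if the agent performed the golden-path steps in order,
    with only explicitly accepted extra steps in between."""
    allowed_extras = set(scenario["acceptable_extra_steps"])
    # Drop accepted extra steps, then require an exact match with the golden path.
    core_steps = [step for step in agent_steps if step not in allowed_extras]
    return core_steps == scenario["golden_path"]


# The scenario is plain data, so it serializes directly to JSON.
serialized = json.dumps(scenario, indent=2)

good_run = ["open_calendar", "check_timezone", "find_free_slot", "create_event", "send_invite"]
bad_run = ["open_calendar", "create_event", "send_invite"]  # skips find_free_slot
```

With this sketch, `follows_golden_path(scenario, good_run)` passes because the only deviation is an accepted extra step, while `bad_run` fails because it skips a required step. A real scenario would likely need looser matching, for example when several step orders are equally valid.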
How To Get Started
Simply apply to this post, qualify, and contribute to projects matched with your skills on your own schedule. Participate in creating training prompts, refining model responses, and shaping the future of AI.
Requirements
- Bachelor's and/or Master's degree in a related technical field.
- Background in QA, software testing, data analysis, or NLP annotation.
- Good understanding of test design principles and structured formats (JSON/YAML).
- Ability to define expected agent behaviors and scoring logic.
- Basic programming experience in Python and JS.
- Strong written communication in English.
Nice to Have
- Experience in writing manual or automated test cases.
- Familiarity with LLM capabilities and typical failure modes.
- Understanding of scoring metrics like precision and recall.
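As a quick refresher on the scoring metrics mentioned above, the sketch below computes precision and recall for a set of items an agent returned against the set it was expected to return. The function name `precision_recall` and the example field names are assumptions made for illustration, not part of any project tooling.

```python
def precision_recall(expected, predicted):
    """Compute (precision, recall) for two collections of hashable items.

    precision = correct items returned / all items returned
    recall    = correct items returned / all items expected
    """
    expected, predicted = set(expected), set(predicted)
    true_positives = len(expected & predicted)
    # Guard against division by zero when either set is empty.
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(expected) if expected else 0.0
    return precision, recall


# Hypothetical example: fields an agent should extract versus fields it extracted.
expected_fields = {"name", "date", "amount", "currency"}
predicted_fields = {"name", "date", "total"}

precision, recall = precision_recall(expected_fields, predicted_fields)
```

Here two of the three returned fields are correct, so precision is 2/3, and two of the four expected fields were found, so recall is 0.5.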
Benefits
This is a freelance, fully remote role that offers flexibility, the opportunity to work on advanced AI projects, and the chance to build a valuable portfolio by influencing how future AI models understand and communicate.
Key Skills/Competencies
- AI Testing
- QA
- Test Design
- LLM
- NLP
- JSON
- Python
- JavaScript
- Data Analysis
- Attention to Detail
How to Get Hired at Braintrust
- Customize your resume: Highlight relevant QA and testing skills.
- Showcase technical expertise: Emphasize Python, JS, and test design proficiency.
- Prepare for interviews: Research Mindrift and their AI projects.
- Connect online: Leverage LinkedIn for company culture insights.