AI Agent Testing Specialist
Braintrust
Job Overview

Job Description
About Braintrust's AI Agent Testing Specialist Role
Please submit your resume in English and indicate your level of English.
At Mindrift, innovation meets opportunity. We believe in using the power of collective human intelligence to ethically shape the future of AI. The platform connects domain experts with cutting-edge AI projects from innovative tech clients, unlocking the potential of GenAI through real-world expertise.
Role Overview
As the AI Agent Testing Specialist, you will design realistic and structured evaluation scenarios for LLM-based agents. You will create test cases that simulate human-performed tasks, define the golden path for expected outcomes, annotate task steps, and collaborate with developers to review outputs and refine tests.
Main Responsibilities
- Design structured test scenarios based on real-world tasks.
- Define the golden path and acceptable agent behaviors.
- Annotate task steps, expected outputs, and edge cases.
- Collaborate with developers to test and clarify scenarios.
- Review agent outputs and adjust tests accordingly.
How To Get Started
Simply apply to this post, qualify, and contribute to projects aligned with your skills on your own schedule. From creating training prompts to refining model responses, your work will help shape the future of AI.
Requirements
- Bachelor's and/or Master's degree in Computer Science, Software Engineering, Data Science, AI/ML, Computational Linguistics, or related fields.
- Background in QA, software testing, data analysis, or NLP annotation.
- Good understanding of test design principles including reproducibility, coverage, and handling edge cases.
- Strong written communication skills in English, with the ability to work in structured formats such as JSON and YAML.
- Basic experience with Python and JavaScript.
- Curiosity and open-mindedness to work with AI-generated content and complex guidelines.
Nice to Have
- Experience in writing both manual and automated test cases.
- Familiarity with LLM capabilities and typical failure modes.
- Understanding of scoring metrics like precision, recall, and reward functions.
Benefits
This fully remote freelance opportunity allows you to work on your own schedule from anywhere. You will gain valuable experience, enhance your portfolio, and directly influence how future AI models interpret and communicate.
Key Skills/Competencies
- AI Testing
- LLM Evaluation
- Test Design
- NLP Annotation
- QA
- Python
- JavaScript
- JSON
- Analytical Thinking
- Attention to Detail
How to Get Hired at Braintrust
- Customize Your Resume: Highlight relevant AI testing and IT expertise.
- Emphasize Structured Testing: Detail testing scenarios and QA experience.
- Showcase Communication: Demonstrate strong written English skills.
- Prepare for Technical Interviews: Review Python, JS, and testing principles.