AI Agent Testing Specialist
Braintrust
Job Overview
Job Description
About the Role
The AI Agent Testing Specialist at Braintrust designs realistic, structured evaluation scenarios for LLM-based agents. You will create test cases that simulate tasks normally performed by humans, define gold-standard behaviors, and work with developers to refine the scenarios.
Responsibilities
- Design structured test scenarios based on real-world tasks.
- Define the golden path and the bounds of acceptable agent behavior.
- Annotate task steps, expected outputs, and edge cases.
- Collaborate with developers to test and clarify scenarios.
- Review agent outputs and adapt tests accordingly.
How to Get Started
Apply to this post with your resume in English and indicate your level of English proficiency. Enjoy a flexible, remote, freelance schedule that fits your current commitments.
Requirements
- Bachelor's or Master's degree in Computer Science, Software Engineering, Data Science, AI/ML, or a related field.
- Background in QA, software testing, data analysis, or NLP annotation.
- Good understanding of test design principles, including reproducibility and edge-case coverage.
- Strong written communication skills in English.
- Familiarity with structured formats like JSON/YAML.
- Basic experience with Python and JavaScript.
Nice to Have
- Experience in manual or automated test case writing.
- Familiarity with LLM capabilities and common failure modes.
- Understanding of scoring metrics such as precision and recall.
Benefits
This freelance, remote role allows you to work on your own schedule, be part of an advanced AI project, and gain valuable experience that will enhance your portfolio.
Key Skills/Competencies
- Test Design
- Scenario Development
- LLM Evaluation
- NLP Annotation
- QA Methodologies
- Python
- JavaScript
- Analytical Thinking
- Attention to Detail
- Remote Collaboration
How to Get Hired at Braintrust
- Customize Your Resume: Tailor your skills to AI testing and QA.
- Highlight Technical Skills: Emphasize Python, JavaScript, and JSON/YAML experience.
- Prepare Scenario Examples: Showcase your test design expertise.
- Research Braintrust: Understand their projects and remote culture.