13 days ago

AI Agent Testing Specialist

Braintrust

Hybrid
Full Time
$120,000

Job Overview

Job Title: AI Agent Testing Specialist
Job Type: Full Time
Category: Commerce
Experience: 5 Years
Degree: Master
Offered Salary: $120,000
Location: Hybrid

Job Description

About Braintrust's AI Agent Testing Specialist Role

Please submit your resume in English and indicate your level of English.

At Mindrift, innovation meets opportunity. We believe in using the power of collective human intelligence to ethically shape the future of AI. The platform connects domain experts with cutting-edge AI projects from innovative tech clients, unlocking the potential of GenAI through real-world expertise.

Role Overview

As the AI Agent Testing Specialist, you will design realistic and structured evaluation scenarios for LLM-based agents. You will create test cases that simulate human-performed tasks, define the golden path for expected outcomes, annotate task steps, and collaborate with developers to review outputs and refine tests.

Main Responsibilities

  • Design structured test scenarios based on real-world tasks.
  • Define the golden path and acceptable agent behaviors.
  • Annotate task steps, expected outputs, and edge cases.
  • Collaborate with developers to test and clarify scenarios.
  • Review agent outputs and adjust tests accordingly.

How To Get Started

Simply apply to this post, qualify, and contribute to projects aligned with your skills on your own schedule. From creating training prompts to refining model responses, your work will help shape the future of AI.

Requirements

  • Bachelor's and/or Master’s degree in Computer Science, Software Engineering, Data Science, AI/ML, Computational Linguistics, or related fields.
  • Background in QA, software testing, data analysis, or NLP annotation.
  • Good understanding of test design principles, including reproducibility, coverage, and edge-case handling.
  • Strong written communication skills in English and the ability to work in structured formats like JSON/YAML.
  • Basic experience with Python and JavaScript.
  • Curiosity and open-mindedness to work with AI-generated content and complex guidelines.

Nice to Have

  • Experience in writing both manual and automated test cases.
  • Familiarity with LLM capabilities and typical failure modes.
  • Understanding of scoring metrics like precision, recall, and reward functions.

Benefits

This fully remote freelance opportunity allows you to work on your own schedule from anywhere. You will gain valuable experience, enhance your portfolio, and directly influence how future AI models interpret and communicate.

Key Skills/Competencies

  • AI Testing
  • LLM Evaluation
  • Test Design
  • NLP Annotation
  • QA
  • Python
  • JavaScript
  • JSON
  • Analytical Thinking
  • Attention to Detail

Tags:

AI Agent Testing Specialist
QA
Test Design
LLM Evaluation
NLP
Python
JavaScript
JSON
Structured Testing
Remote

How to Get Hired at Braintrust

  • Customize Your Resume: Highlight relevant AI testing and IT expertise.
  • Emphasize Structured Testing: Detail testing scenarios and QA experience.
  • Showcase Communication: Demonstrate strong written English skills.
  • Prepare for Technical Interviews: Review Python, JS, and testing principles.
