13 days ago

AI Agent Testing Specialist

Braintrust

Hybrid
Full Time
$80,000

Job Overview

Job Title: AI Agent Testing Specialist
Job Type: Full Time
Category: Commerce
Experience: 5 Years
Degree: Master
Offered Salary: $80,000
Location: Hybrid

Job Description

About the Role

The AI Agent Testing Specialist role at Braintrust calls for designing realistic and structured evaluation scenarios for LLM-based agents. Candidates will create test cases simulating human-performed tasks, define gold-standard behaviors, and collaborate with developers to refine test scenarios.

Responsibilities

  • Design structured test scenarios based on real-world tasks.
  • Define the golden path and the criteria for acceptable agent behavior.
  • Annotate task steps, expected outputs, and edge cases.
  • Collaborate with developers to test and clarify scenarios.
  • Review agent outputs and adapt tests accordingly.

How to Get Started

Apply to this post with your resume in English and indicate your level of English proficiency. Enjoy a flexible, remote, freelance schedule that fits your current commitments.

Requirements

  • Bachelor's or Master's degree in Computer Science, Software Engineering, Data Science, AI/ML, or a related field.
  • Background in QA, software testing, data analysis, or NLP annotation.
  • Good understanding of test design principles including reproducibility and edge cases.
  • Strong written communication skills in English.
  • Familiarity with structured formats like JSON/YAML.
  • Basic experience with Python and JavaScript.

Nice to Have

  • Experience in manual or automated test case writing.
  • Familiarity with LLM capabilities and common failure modes.
  • Understanding of scoring metrics such as precision and recall.

Benefits

This freelance, remote role allows you to work on your own schedule, be part of an advanced AI project, and gain valuable experience that will enhance your portfolio.

Key Skills and Competencies

  • Test Design
  • Scenario Development
  • LLM Evaluation
  • NLP Annotation
  • QA Methodologies
  • Python
  • JavaScript
  • Analytical Thinking
  • Attention to Detail
  • Remote Collaboration

How to Get Hired at Braintrust

  • Customize Your Resume: Tailor your skills to AI testing and QA.
  • Highlight Technical Skills: Emphasize Python, JavaScript, and JSON/YAML experience.
  • Prepare Scenario Examples: Showcase your test design expertise.
  • Research Braintrust: Understand their projects and remote culture.
