Coding - Adversarial Prompt Expert
Reinforce Labs

Job Description
We are seeking an Adversarial Prompt Security Specialist with strong technical instincts and coding proficiency to join Reinforce Labs' Trust & Safety team. In this role, you will use your knowledge of LLM behavior and scripting skills to probe, bypass, and stress-test safety systems. Your focus will be on discovering vulnerabilities—crafting prompt injection sequences, writing scripts to automate exploit attempts, manipulating API interactions, and identifying novel attack vectors that evade existing safeguards. This is a hands-on offensive testing role that rewards creativity, persistence, and an attacker’s mindset over formal engineering credentials.
Key Responsibilities
- Code-Assisted Adversarial Probing: Write and execute scripts (primarily Python) to systematically test LLM safety boundaries. This includes automating prompt injection chains, encoding and obfuscating payloads, manipulating conversation context through API calls, and iterating on attack strategies programmatically rather than relying solely on manual interaction.
- Jailbreak Discovery and Development: Design multi-step jailbreak sequences that exploit model behavior through technical means, such as token-level manipulation, system prompt extraction, role-play escalation, instruction hierarchy subversion, and context window exploitation. Identify bypass vectors that circumvent safety classifiers and content filters.
- Cross-Vector Exploitation: Test attack surfaces that span code generation, tool use, multi-turn conversation, and multi-modal inputs. Explore how code-mediated interactions—such as requesting the model to write, execute, or interpret code—can be leveraged to bypass safety controls that apply to natural language interactions.
- Vulnerability Documentation: Document discovered vulnerabilities with clear severity assessments, step-by-step reproduction instructions, and sample exploit code. Provide context on why a given bypass is dangerous and recommend potential mitigations for the alignment and engineering teams.
- Attack Landscape Monitoring: Stay current with emerging adversarial techniques from the AI security research community, open-source exploit repositories, academic publications, and real-world misuse patterns. Adapt and apply novel methods to internal testing workflows.
- Safety Policy Input: Provide technical feedback to content policy and safety classification teams based on observed model behaviors. Flag gaps between intended safety enforcement and actual model output, particularly in edge cases involving code generation, indirect prompt injection, and agentic tool-use scenarios.
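As an illustration of the code-assisted probing described above, here is a minimal Python sketch of a batch test harness. All names (`ProbeResult`, `run_probe_batch`, the refusal markers) are hypothetical, and the refusal check is a deliberately crude string heuristic; a real harness would call a model API and use a proper safety classifier.

```python
from dataclasses import dataclass
from typing import Callable, List

# Crude heuristic markers for apparent refusals (illustrative only).
REFUSAL_MARKERS = ("i can't", "i cannot", "unable to help")


@dataclass
class ProbeResult:
    """One prompt/response pair with a refusal flag."""
    prompt: str
    response: str
    refused: bool


def run_probe_batch(prompts: List[str],
                    send_fn: Callable[[str], str]) -> List[ProbeResult]:
    """Send each test prompt via send_fn and flag apparent refusals.

    send_fn abstracts the model call, so the harness can target any
    API client (or a stub in tests) without code changes.
    """
    results = []
    for prompt in prompts:
        response = send_fn(prompt)
        refused = any(m in response.lower() for m in REFUSAL_MARKERS)
        results.append(ProbeResult(prompt=prompt, response=response,
                                   refused=refused))
    return results
```

Injecting the model call as a function keeps the iteration loop reusable across providers and makes systematic, documented test runs straightforward.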
Candidate Profile
- Adversarial Mindset: You instinctively look for ways to break systems. You approach LLM safety from an attacker’s perspective and can creatively combine technical and social engineering techniques to find vulnerabilities others miss.
- Technically Resourceful: You are comfortable writing scripts to test ideas quickly, interacting with APIs, and using code as a tool for exploration—even if you don’t identify as a traditional software engineer. You solve problems by building things, not just describing them.
- Persistent and Methodical: You approach red-teaming as a structured practice. You systematically vary your attack strategies, document what works and what doesn’t, and iterate methodically rather than relying on luck.
- Clear Communicator: You can explain complex technical exploits to non-technical stakeholders—including policy, legal, and leadership teams—in a way that conveys both the mechanism and the real-world risk.
- Ethically Grounded: You understand the responsibility inherent in this work. You are motivated by strengthening AI safety and operate with integrity within established testing protocols.
Qualifications
- Proficiency in Python scripting, with the ability to write functional scripts for task automation, API interaction, and data manipulation. Formal software engineering training is not required.
- Demonstrated experience in adversarial prompt engineering, jailbreak development, or LLM red-teaming—whether in a professional, academic, independent research, or community context (e.g., bug bounties, CTFs, responsible disclosure).
- Working familiarity with LLM APIs (e.g., OpenAI, Anthropic, open-source model endpoints) and a practical understanding of how large language models process input, generate output, and enforce safety constraints.
- Knowledge of common LLM attack vectors, including direct and indirect prompt injection, payload encoding and obfuscation, context window manipulation, system prompt leakage, and role-play exploitation.
- Strong written communication skills, with the ability to produce clear vulnerability reports that include reproduction steps, severity context, and mitigation recommendations.
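The vulnerability reports mentioned above could be kept machine-readable and rendered consistently; this is one hypothetical sketch of such a structure (field names and the markdown layout are assumptions, not a Reinforce Labs format).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class VulnReport:
    """A structured finding: severity, reproduction steps, and mitigation."""
    title: str
    severity: str          # e.g. "low" / "medium" / "high"
    repro_steps: List[str]
    impact: str
    mitigation: str

    def to_markdown(self) -> str:
        """Render the report as markdown for the alignment/engineering teams."""
        steps = "\n".join(f"{i}. {s}" for i, s in enumerate(self.repro_steps, 1))
        return (
            f"## {self.title}\n"
            f"**Severity:** {self.severity}\n\n"
            f"### Reproduction\n{steps}\n\n"
            f"### Impact\n{self.impact}\n\n"
            f"### Mitigation\n{self.mitigation}\n"
        )
```

Keeping findings in a typed record makes severity triage and report aggregation scriptable, which fits the role's emphasis on methodical documentation.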
Preferred Qualifications
- Background in cybersecurity, penetration testing, or application security—formal or self-taught. Relevant certifications (e.g., OSCP, CEH) are valued but not required.
- Familiarity with AI safety evaluation frameworks such as the OWASP Top 10 for LLM Applications, NIST AI RMF, or MITRE ATLAS.
- Understanding of LLM alignment techniques (e.g., RLHF, constitutional AI) and their known failure modes and exploitable edge cases.
- Experience with multi-modal model testing (vision, code generation, tool use) and awareness of cross-modal attack surfaces.
- Proficiency in additional scripting or programming languages (e.g., JavaScript, Bash, Go) that expand testing capabilities.
Key Skills/Competencies
- LLM Security
- Adversarial Prompt Engineering
- Jailbreak Development
- Python Scripting
- API Interaction
- Vulnerability Assessment
- Red Teaming
- Cybersecurity
- Attack Vector Identification
- AI Safety
How to Get Hired at Reinforce Labs
- Research Reinforce Labs' culture: Study their mission, values, recent news, and employee testimonials on LinkedIn and Glassdoor to understand their focus on AI trust and safety.
- Tailor your resume: Highlight your expertise in adversarial prompting, LLM red-teaming, Python scripting, and vulnerability discovery, directly aligning with the Coding - Adversarial Prompt Expert role requirements.
- Showcase practical skills: Prepare to demonstrate hands-on experience in prompt injection, jailbreak development, API manipulation, and exploit automation, emphasizing your attacker's mindset.
- Prepare for technical interviews: Practice explaining complex technical exploits, discuss LLM attack vectors, and articulate your ethical approach to AI security testing at Reinforce Labs.
- Network strategically: Connect with current and former Reinforce Labs employees on LinkedIn to gain insights into the team dynamics and interview process for security roles.