Research Engineer, Reward Models
@ Anthropic

Seattle, WA
$315,000 to $340,000
On Site
Full Time
Posted 19 hours ago

Job Details

About Anthropic

Anthropic’s mission is to create reliable, interpretable, and steerable AI systems that are safe and beneficial for users and society. Our team consists of researchers, engineers, policy experts, and business leaders working together to build beneficial AI systems.

About The Role

As a Research Engineer, Reward Models, you will advance reward modeling techniques for aligning AI systems with human values. You will help push forward the science of reward modeling, focusing on teaching AI systems to understand and embody human values while enhancing their capabilities.

Responsibilities

  • Implement novel reward modeling architectures and techniques
  • Optimize training pipelines
  • Build and optimize data pipelines
  • Collaborate across teams to integrate advances into production systems
  • Document engineering progress and contribute to publications

Candidate Profile

You may be a good fit if you have a robust machine learning engineering background, proficiency in Python and deep learning frameworks, and experience with distributed computing. Familiarity with LLM architectures, reinforcement learning, and building data pipelines is essential. You should be comfortable with experimental research and capable of clearly communicating complex technical concepts. Experience with AI alignment, safety, or LLMs is a significant plus.

Compensation & Logistics

The expected annual base salary is between $315,000 and $340,000 USD. Our full-time package includes equity, benefits, and potential incentive compensation.

Education: A Bachelor's degree in a related field or equivalent experience is required.

Location: This is a location-based hybrid role requiring staff to be in one of our offices at least 25% of the time. Anthropic is headquartered in San Francisco.

Visa Sponsorship: We do sponsor visas, subject to specific requirements.

How We're Different

At Anthropic, we focus on big science and collaborative research. We emphasize impact, long-term safety, and clear communication. We welcome diverse perspectives and encourage candidates from underrepresented groups to apply.

Key Skills and Competencies

  • Python
  • Deep Learning
  • Reward Modeling
  • LLM
  • Reinforcement Learning
  • Distributed Computing
  • Data Pipelines
  • Technical Communication
  • Research Integration
  • AI Alignment

How to Get Hired at Anthropic

🎯 Tips for Getting Hired

  • Customize your resume: Highlight Python, AI alignment, and deep learning skills.
  • Research Anthropic: Understand its mission and big science focus.
  • Prepare projects: Showcase experience in reward modeling and LLMs.
  • Practice technical interviews: Expect Python and deep learning questions.

📝 Interview Preparation Advice

Technical Preparation

  • Review Python deep learning libraries.
  • Study reinforcement learning fundamentals.
  • Practice distributed computing challenges.
  • Analyze reward modeling case studies.

Behavioral Questions

  • Describe teamwork on complex projects.
  • Explain how you handle research setbacks.
  • Discuss balancing experimental work and delivery.
  • Share examples of clear technical communication.

Frequently Asked Questions