
Research Engineer, Search and Knowledge Post-Training
Anthropic · San Francisco Bay Area
- Hybrid
- Full-time
- $675,000 / year
Job highlights
- Advance AI search and knowledge capabilities.
- Define research hypotheses and design experiments.
- Build infrastructure for controlled AI testing.
- Evaluate LLM reasoning over evidence.
- Collaborate on cutting-edge AI research.
About Anthropic
Anthropic pursues the development of reliable, interpretable, and steerable AI systems to ensure AI is safe and beneficial for users and society. We are a rapidly growing team of researchers, engineers, policy experts, and business leaders united by this mission.
About The Role
We envision future AI systems with superhuman epistemic capabilities – the ability to analyze vast amounts of evidence and draw rigorous conclusions for both the AI and the user. Search is fundamental to this vision, enabling models to discern signal from noise, weigh conflicting evidence, and recognize their own knowledge gaps. A trustworthy search capability is crucial for Claude to act as a reliable collaborator in knowledge-intensive tasks.
We are seeking a Research Engineer to advance the science and engineering behind making Claude a trustworthy searcher. This is a research-focused position for a highly rigorous individual. You will define hypotheses regarding epistemically sound searcher capabilities, design experiments to test them, and transform search post-training from a craft into a measurable science. You will be responsible for ensuring cleanly isolated variables, calibrated metrics, and reproducible signals, complemented by the engineering expertise to build the necessary infrastructure.
This role operates at the intersection of reinforcement learning, retrieval, and evaluation, directly influencing Claude's behavior in evidence-based applications such as research, analysis, and agentic workflows.
What You'll Do
- Lead a research direction for a class of search post-training problems end-to-end: formulate hypotheses about latent capabilities, design experiments to isolate them, conduct training, and determine subsequent steps.
- Develop instrumentation to transform environment design into controlled experiments, enabling the study of how each environmental factor contributes to desired capabilities without overfitting to specific regimes.
- Design frontier-discriminating evaluations that differentiate genuine reasoning over evidence from plausible pattern matching, ensuring continued effectiveness as models advance.
- Drive optimization rigor across the entire stack, including efficient experiment design, ablations, training run economics, and the discipline to validate results.
- Collaborate closely with researchers in post-training, RL infrastructure, and product teams to translate real-world model behavior into concrete training signals and vice versa.
- Establish the team's experimental standards for measurement, methodology, and result validation.
Minimum Qualifications (Must-Have)
- Possess an exceptionally rigorous and quantitative mindset.
- Be an outstanding software engineer in Python, proficient across the stack from data pipelines to RL training and evaluation infrastructure.
- Have a track record of shipping real ML research, with a discerning taste for impactful experiments.
- Instinctively reach for ablations, controls, and confidence intervals to understand model behavior.
- Operate with a high degree of autonomy and comfort in ambiguous situations, identifying the most impactful next steps without explicit direction.
- Desire to set research direction, advocate for experimental rigor, and elevate the standards of your colleagues.
- Communicate research clearly in writing and in person, defending design choices and presenting evidence-based updates.
Preferred Qualifications (Nice-to-Have)
- Hands-on experience with Reinforcement Learning (RL) on large language models, including environment design, reward engineering, training stability, and scaling behavior.
- Background in search, retrieval, Retrieval-Augmented Generation (RAG), or agents that reason over external information sources.
- Experience building evaluations for open-ended or knowledge-intensive LLM behavior.
- Prior work in a research-intensive environment (e.g., frontier AI lab, quant research firm) demonstrating a default standard of rigor.
- Published research in areas such as LLMs, RL, retrieval, calibration, or related fields.
- Experience with distributed training systems and large-scale experimentation infrastructure.
Representative Projects
- Designing a controlled-noise search environment that allows independent adjustment of failure rates, conflicting sources, and adversarial content to characterize their impact on learned policies.
- Developing an evaluation suite to distinguish calibrated source judgment from confident-sounding guesswork, maintaining discrimination as models improve.
Annual Compensation Range
$500,000 - $850,000 USD
Logistics
- Minimum Education: Bachelor’s degree or equivalent combination of education, training, and/or experience.
- Required Field of Study: A field relevant to the role, demonstrated through coursework, training, or professional experience.
- Minimum Years of Experience: Correlates with internal job level requirements.
- Location-Based Hybrid Policy: All staff are expected to be in office at least 25% of the time; some roles may require more in-office presence.
- Visa Sponsorship: Available; efforts will be made to sponsor visas if an offer is extended.
Application Guidance
- We encourage applications even if you don't meet every qualification.
- We value diverse perspectives and strongly encourage applications from individuals in underrepresented groups.
- Beware of recruitment scams: Anthropic recruiters exclusively use @anthropic.com email addresses. Never provide payment or banking information. Verify openings at anthropic.com/careers.
How We're Different
Anthropic champions
Skills & topics
- Research Engineer
- AI
- Machine Learning
- Reinforcement Learning
- Search
- Knowledge Management
- LLM
- Python
- Software Engineering
- Evaluation
- Post-training
- Retrieval
- RAG
- Agentic Workflows
- Quantitative Mindset
- Experimental Design
- San Francisco
- Hybrid
- Full-time
How to get hired
- Tailor your resume: Highlight Python, ML research, RL, and evaluation experience. Emphasize quantitative mindset and autonomy.
- Showcase rigorous thinking: Detail experiments, ablations, and confidence interval usage in your application.
- Demonstrate software engineering skills: Provide examples of building data pipelines, RL training, or evaluation infrastructure.
- Prepare for technical interviews: Expect deep dives into ML concepts, Python coding, and experimental design.
- Articulate your research vision: Be ready to discuss your ideas for advancing AI search and knowledge post-training.
Frequently asked questions
- What is the salary range for the Research Engineer role at Anthropic?
  - The Research Engineer, Search and Knowledge Post-Training role at Anthropic offers an annual compensation range of $500,000 to $850,000 USD.
- Does Anthropic offer visa sponsorship for the Research Engineer position?
  - Yes, Anthropic does offer visa sponsorship for this role. If an offer is extended, they will make every reasonable effort to secure the necessary visa, working with an immigration lawyer.
- What is the hybrid work policy for this Research Engineer role at Anthropic?
  - Anthropic has a location-based hybrid policy, requiring all staff to be in one of their offices at least 25% of the time. Some roles may necessitate more in-office presence.
- What are the minimum educational requirements for the Research Engineer position?
  - A minimum of a Bachelor’s degree or an equivalent combination of education, training, and/or experience is required for the Research Engineer role.
- What specific technical skills are most important for the Research Engineer role at Anthropic?
  - Outstanding Python software engineering skills across the stack (data pipelines, RL training, evaluation infrastructure) are crucial. Experience with ML research, particularly in RL, search, retrieval, or RAG, is highly preferred.
- How does Anthropic approach AI research and development for its roles like Research Engineer?
  - Anthropic focuses on 'big science' research, emphasizing collaborative efforts on large-scale projects aimed at creating steerable, trustworthy AI. They view AI research as an empirical science with significant social and ethical implications.
- What kind of research projects might a Research Engineer at Anthropic work on?
  - Projects could involve designing controlled-noise search environments to study failure modes, building evaluation suites to differentiate genuine reasoning from pattern matching, and characterizing how environmental factors influence learned policies in LLMs.
- Should I still apply for the Research Engineer role if I don't meet every qualification?
  - Yes, Anthropic strongly encourages you to apply even if you don't meet every single qualification. They believe not all strong candidates will meet every listed requirement and want to avoid candidates prematurely excluding themselves.
- What should I do if I receive a suspicious email regarding a job at Anthropic?
  - Anthropic recruiters only use @anthropic.com email addresses. Be cautious of other domains. Legitimate recruiters will never ask for money, fees, or banking information. Verify job openings directly on anthropic.com/careers.
- What is the core focus of the Research Engineer role in Search and Knowledge Post-Training?
  - The core focus is to advance the science and engineering of making AI systems, like Claude, trustworthy searchers. This involves rigorous experimentation to understand and improve how models handle evidence, weigh information, and recognize knowledge gaps.