Research Engineer, Frontier Evaluations - Finance
OpenAI
Job Overview
Job Description
About The Team
The Frontier Evaluations team at OpenAI is dedicated to building north-star model evaluations that drive progress toward safe AGI/ASI. The team constructs ambitious evaluations to measure and steer our models, implementing self-improvement loops that guide our training, safety, and launch decisions. Notable open-sourced evaluations from the team include SWE-bench Verified, MLE-bench, PaperBench, and SWE-Lancer. The team has also developed and run frontier evaluations for major releases such as GPT-4o, o1, o3, GPT-4.5, ChatGPT Agent, and GPT-5. If you are eager to experience the rapid advancement of our models firsthand and contribute to steering them responsibly, this team offers a unique opportunity.
About You
We are seeking exceptional Research Engineers who can significantly advance the capabilities of our frontier models within the finance domain. The ideal candidate will help define and shape AI evaluations for financial reasoning and related capabilities, taking ownership of individual initiatives from conception to completion.
In This Role, You'll
- Identify critical model capabilities, skills, and behaviors essential for financial workflows, and design robust methods to quantify performance in these areas.
- Own and drive a research agenda: identify an important model capability, especially in financial reasoning, and develop evaluations that measure it effectively.
- Continuously refine evaluations of frontier AI models to precisely assess the extent of their advanced capabilities.
We Expect You To
- Possess strong engineering and statistical analysis skills, backed by at least 2-3 years of full-time technical experience.
- Demonstrate a passion for evaluations applied to real-world applications and knowledge work.
- Be highly detail-oriented and thorough in your approach.
- Be a collaborative team player, willing to undertake various tasks to advance team objectives.
- Be passionate about and knowledgeable in AGI/ASI measurement.
- Be capable of operating effectively in a dynamic and extremely fast-paced research environment, as well as scoping and delivering projects end-to-end.
It Would Be Great If You Also Have
- An ability to work effectively cross-functionally across different teams.
- Excellent communication skills to articulate complex ideas and findings.
Key Skills/Competencies
- AI Model Evaluation
- Financial Reasoning
- Research Agenda Development
- Statistical Analysis
- Software Engineering
- AGI/ASI Measurement
- End-to-End Project Ownership
- Dynamic Research Environment
- Cross-functional Collaboration
- Detail-Oriented
How to Get Hired at OpenAI
- Research OpenAI's mission: Study its commitment to safe AGI/ASI, its values, and its published research.
- Showcase relevant experience: Highlight projects demonstrating strong engineering, statistical analysis, and AI model evaluation in your resume.
- Tailor your application: Customize your resume and cover letter to emphasize experience in financial reasoning evaluations and frontier AI capabilities.
- Prepare for technical interviews: Practice problem-solving related to AI model evaluation, data analysis, and software engineering challenges.
- Demonstrate passion for AGI/ASI: Articulate your enthusiasm for measuring and steering advanced AI systems towards beneficial outcomes.