Senior AI Engineer
Ruby Labs
Job Description
About Us
Ruby Labs is a leading tech company dedicated to creating and operating innovative consumer products across health, education, and entertainment. Our dynamic teams are shaping the future of consumer-led products, and we're always eager to welcome passionate individuals. Discover more about our journey at rubylabs.com/about-us/.
About The Role
At Ruby Labs, we are looking for a Senior AI Engineer to play a pivotal role in shaping our AI infrastructure and driving production-ready Large Language Model (LLM) experiences. You will operate within a modern tech stack, making data-driven decisions regarding model performance, reliability, and cost efficiency.
You will take ownership of advanced prompt systems, structured outputs, and complex LLM workflows utilizing tools like LangChain or LlamaIndex. Observability, debugging, and evaluation are central to this position, requiring expertise in Langfuse and AI gateways such as OpenRouter to continuously enhance model quality and operational efficiency. This role involves full ownership of key AI features, from initial experimentation through to live production deployment.
Key Responsibilities
- Advanced Prompt Engineering: Design complex, dynamic prompt templates with conditional logic and efficiently reuse information and context within prompts to maximize generation quality and reasoning.
- Structured Outputs & Schemas: Implement structured response formats (JSON mode, function calling, Zod/JSON schemas) to ensure AI outputs are predictable and ready for seamless integration into application logic.
- Prompt Engineering & Evaluations: Build robust evaluation pipelines and use Langfuse to collect feedback and score the quality of responses in real time.
- Tracing & Debugging: Perform deep debugging of complex LLM chains using Langfuse traces to identify bottlenecks and optimize for cost, latency, and context window usage.
- AI A/B Testing: Run systematic experiments across different models via OpenRouter (e.g., comparing Claude 3.5 Sonnet vs. GPT-4o) and analyze results based on quantitative metrics.
- Data-Driven Decisions: Make deployment decisions for new prompts or models strictly based on quantitative benchmarks and trace data, rather than intuition.
- Output Scoring & Analysis: Develop scoring systems to analyze the “Problem → Solution” chain and identify root causes of hallucinations or logic errors using Langfuse analytics.
- Model Performance & Fine-Tuning: Regularly re-evaluate model performance as new architectures emerge and perform fine-tuning when necessary to meet specific domain requirements.
Qualifications
- Node.js & Next.js: Deep knowledge of the stack to build reliable services and handle complex LLM-generated data.
- Dynamic Prompting Skills: Proven experience in building prompts where content is highly dependent on input variables and context injection.
- OpenRouter Experience: Experience working with unified APIs, managing rate limits, and selecting the most cost-effective models for specific tasks.
- Langfuse (or similar): Understanding of LLM observability principles — setting up tracing, creating test datasets, and integrating scoring systems.
- Evaluation Methodology: Experience with frameworks like RAGAS or building custom “LLM-as-a-judge” systems.
- Analytical Mindset: Ability to transform raw generation logs into actionable business metrics and technical insights.
- Iterative Mindset: Focus on continuous product improvement through constant feedback loops.
Nice to have
- Fine-Tuning: Practical experience in fine-tuning models for specific domain tasks or JSON compliance.
- RAG Architecture: Understanding how to build and optimize Retrieval-Augmented Generation systems, including indexing, retrieval, and re-ranking.
- Python: Basic knowledge for working with data science scripts or AI evaluation libraries.
Benefits
Join our vibrant team and discover excellent perks:
- Remote Work Environment: Enjoy the freedom to work from anywhere, anytime.
- Unlimited PTO: Recharge and prioritize well-being without counting days.
- Paid National Holidays: Celebrate and relax with paid time off.
- Company-provided MacBook: Experience seamless productivity with top-notch Apple MacBooks.
- Flexible Independent Contractor Agreement: Benefit from autonomy, tax advantages, and entrepreneurial opportunities.
Be part of our fast-growing team and seize this excellent opportunity for personal and professional growth!
Key skills/competencies
- Prompt Engineering
- LLM Workflows
- Node.js
- Next.js
- TypeScript
- LangChain
- LlamaIndex
- Langfuse
- OpenRouter
- AI A/B Testing
How to Get Hired at Ruby Labs
- Research Ruby Labs's culture: Study their mission, values, recent news, and employee testimonials on LinkedIn and Glassdoor.
- Tailor your resume for AI: Customize your resume to highlight experience in LLM development, Node.js, Next.js, and prompt engineering, aligning with the Senior AI Engineer role.
- Showcase practical AI projects: Include a portfolio or links to projects demonstrating your skills in prompt engineering, structured outputs, or LLM evaluation.
- Prepare for technical depth: Brush up on modern AI stack components like Langfuse, OpenRouter, and LLM evaluation methodologies for Ruby Labs's interview.
- Demonstrate an iterative mindset: During interviews, emphasize your approach to continuous improvement, data-driven decision-making, and owning AI features end-to-end.