Principal Site Reliability Engineer @ Upstart
Job Details
About Upstart
Upstart is the leading AI lending marketplace partnering with banks and credit unions to expand access to affordable credit. By leveraging Upstart's AI marketplace, banks and credit unions can achieve higher approval rates and lower loss rates while delivering an exceptional digital-first lending experience.
The Team
Upstart’s Site Reliability Engineering (SRE) team is responsible for the reliability, resiliency, and observability of production systems. This team builds automation, tooling, and frameworks to ensure a healthy, scalable infrastructure while supporting seamless experiences for both engineers and customers.
Role Overview
As the Principal Site Reliability Engineer, you will be a thought leader and SRE evangelist. You will drive the adoption of SRE best practices across the organization, mentor engineers, and influence decisions across multiple teams including Product Engineering, DevEx, Development Productivity (Quality), DevOps, Data Engineering, and Machine Learning.
How You’ll Make an Impact
- Advocate for SRE principles and lead their adoption across teams.
- Shape long-term reliability, resiliency, and observability strategies with leadership.
- Champion distributed tracing, real user monitoring (RUM), and performance metrics.
- Build self-healing systems and drive improvements in incident response processes.
- Manage cross-functional initiatives from concept through execution.
Requirements
Minimum Requirements:
- 10+ years of combined experience in Software Engineering and SRE.
- Strong communication and mentoring skills.
- Proficiency in Python, Go, and JavaScript/TypeScript.
- Experience with Infrastructure as Code tools (Terraform, CDK, CloudFormation).
- Hands-on experience with observability and incident management.
Preferred Qualifications:
- Experience with service mesh.
- Full stack development skills.
- Experience with development productivity.
- Experience in high-scale SaaS environments.
- Background in building or extending observability platforms.
Position Details
This role is available remotely or in San Mateo, Columbus, or Austin. The team operates across all U.S. time zones, with occasional on-site collaboration sessions (3 days per quarter, with all travel expenses covered).
Compensation & Benefits
- Competitive base pay with bonus and equity.
- Comprehensive medical, dental, and vision coverage.
- 401(k) with company match and immediate vesting.
- Employee Stock Purchase Plan (ESPP) and additional benefits.
Equal Opportunity
Upstart is a proud Equal Opportunity Employer dedicated to diversity and inclusion. Applicants requiring accommodation should email candidate_accommodations@upstart.com.
Key skills and competencies
- SRE
- Reliability
- Resiliency
- Observability
- Distributed Tracing
- Incident Management
- Automation
- Infrastructure as Code
- Mentoring
- Program Management
How to Get Hired at Upstart
🎯 Tips for Getting Hired
- Research Upstart's culture: Understand its digital-first, AI lending focus.
- Tailor your resume: Highlight SRE and automation experience.
- Prepare technical examples: Emphasize distributed tracing work and projects that span both coding and infrastructure.
- Brush up on collaboration: Show proven cross-team communication skills.