Want to get hired at Atlan?

This job post expired on November 8, 2025

Site Reliability Engineer II

Atlan

Hybrid

Original Job Summary

About the Role

At Atlan, our Site Reliability Engineer II is a key member of the Platform & Reliability Engineering team. You will strengthen alert management and incident response capabilities to ensure fast, reliable, and uninterrupted customer experiences.

Your Mission at Atlan

As a Site Reliability Engineer II, you will:

Own and operate end-to-end system reliability
Manage incidents within defined SLAs (60 mins for Critical, 180 mins for High)
Enhance observability by refining monitoring systems and alerts
Automate operations and incident workflows to eliminate manual tasks
Collaborate with Platform, Observability, and Product Engineering teams
Contribute to documentation and playbooks for process improvement

What Makes You a Great Fit

You possess proven experience in managing alerts, incidents, and root cause analysis in production environments. You have hands-on experience with cloud platforms (AWS, GCP, or Azure) and Kubernetes, along with expertise in monitoring tools like Prometheus, Grafana, ELK/EFK, or Datadog. Strong scripting skills (Python, Bash, or Shell) and excellent communication abilities are essential.

Why You'll Love Working at Atlan

Joining Atlan means real impact from day one with a modern tech stack including Kubernetes, Terraform, Prometheus, and Datadog. You will work with world-class engineers in a learning culture, enjoy autonomy, and have a clear growth path from SRE II to principal levels.

About Atlan

At Atlan, we transform data chaos into clarity for Fortune 500 leaders and hyper-growth startups alike. Backed by top investors and recognized by Gartner and Forrester, we are a fully remote company trusted by global leaders like Cisco, Nasdaq, and HubSpot.

Key skills/competencies

reliability, incident response, automation, monitoring, Kubernetes, cloud, scripting, observability, troubleshooting, documentation

How to Get Hired at Atlan

🎯 Tips for Getting Hired

Research Atlan's culture: Understand their mission, values, and tech stack.
Customize your resume: Highlight cloud, Kubernetes, and automation skills.
Prepare for technical interviews: Practice incident management and scripting challenges.
Showcase collaboration: Emphasize teamwork and communication experiences.

📝 Interview Preparation Advice

Technical Preparation

Review cloud and Kubernetes fundamentals.

Practice writing automation scripts in Python.

Study the configuration and troubleshooting of monitoring tools.

Prepare incident management case studies.
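
One way to practice the scripting and incident-management items above is a short exercise built on the SLA windows quoted in the job summary (60 minutes for Critical, 180 minutes for High). The sketch below is only an illustration: the names SLA_MINUTES, sla_deadline, and is_breached are hypothetical, and it assumes the window runs from the time an incident is opened to the time it is resolved, which the posting does not spell out.

```python
from datetime import datetime, timedelta

# SLA windows taken from the job summary. The dictionary layout and the
# "opened to resolved" interpretation are assumptions made for this exercise.
SLA_MINUTES = {"Critical": 60, "High": 180}


def sla_deadline(severity: str, opened_at: datetime) -> datetime:
    """Return the latest time an incident of this severity may be resolved."""
    return opened_at + timedelta(minutes=SLA_MINUTES[severity])


def is_breached(severity: str, opened_at: datetime, resolved_at: datetime) -> bool:
    """Return True if the incident was resolved after its SLA deadline."""
    return resolved_at > sla_deadline(severity, opened_at)


if __name__ == "__main__":
    opened = datetime(2025, 1, 1, 9, 0)
    # Critical incident resolved after 45 minutes: within the 60-minute window.
    print(is_breached("Critical", opened, opened + timedelta(minutes=45)))
    # High incident resolved after 200 minutes: past the 180-minute window.
    print(is_breached("High", opened, opened + timedelta(minutes=200)))
```

A natural extension for practice is to read incidents from a CSV export and report the breach rate per severity.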

Behavioral Questions

Describe a challenging incident and resolution.

Explain collaboration during high-pressure situations.

Discuss how you handle repetitive operational tasks.

Share teamwork experiences in distributed settings.
