Principal Site Reliability Engineer
@ Playson

Remote
$150,000
Full Time
Posted 12 hours ago

Job Details

About Playson

Founded in 2012, Playson is a leading iGaming supplier recognized worldwide. Our microservice-based platform processes billions of financial transactions per day across a cross-regional setup designed for near-zero latency.

Role Overview

As the Principal Site Reliability Engineer, you will join our dynamic Platform Tribe to ensure our cloud infrastructure is robust and efficient.

What You Will Be Doing

  • Manage day-to-day alerts, system checks, and issue escalations.
  • Provide 24x7 on-call support for critical SaaS events.
  • Document issues and remediation steps.
  • Create and maintain monitors within the EKS/K8s ecosystem.
  • Deploy to EKS/K8s clusters using Terraform and Helm/Flux.
  • Enhance infrastructure health with checks and scripts.
  • Maintain and develop deployment code.
  • Integrate new technologies into our Cloud Infrastructure.
  • Collaborate with other teams for top-notch support.
  • Conduct root cause analysis (RCA) and implement corrective actions.
  • Assign alert-related actions to appropriate teams.
  • Handle environment-specific support requests.

Required Skills and Experience

  • Proficiency in Kubernetes deployment, scaling, and troubleshooting.
  • Experience with GitOps continuous delivery tools like FluxCD/ArgoCD.
  • Strong background in issue processing, RCA and postmortems.
  • Familiarity with AWS, Terraform, Docker, and CI/CD pipelines.
  • Hands-on experience with monitoring tools (DataDog, Prometheus, Grafana) and logging solutions (ELK Stack or AWS CloudWatch).
  • Strong networking and scripting skills (Python, Node.js, Go).
  • Experience with Git and version control systems.
  • Familiarity with incident response tools like PagerDuty, Opsgenie or VictorOps.
  • Ownership, proactivity, persistence, and passion for maintaining a high-traffic online platform.

What We Offer

  • Quarterly bonuses based on transparent evaluations.
  • Flexible work schedule and remote work option.
  • Comprehensive medical insurance.
  • Financial support for life events.
  • Unlimited paid vacation and sick leave.
  • Reimbursement for professional development.

Key Skills/Competencies

  • Kubernetes
  • EKS
  • Terraform
  • Helm
  • Cloud
  • AWS
  • Monitoring
  • CI/CD
  • Scripting
  • DevOps

How to Get Hired at Playson

🎯 Tips for Getting Hired

  • Research Playson's culture: Understand their global iGaming vision and values.
  • Customize your resume: Highlight Kubernetes, AWS, and Terraform expertise.
  • Prepare for technical interviews: Practice cloud and infrastructure scenarios.
  • Showcase problem solving: Demonstrate RCA and incident management skills.

📝 Interview Preparation Advice

Technical Preparation

  • Review Kubernetes deployment documentation.
  • Practice writing Terraform configurations and Helm charts.
  • Set up a test AWS environment.
  • Familiarize yourself with cloud monitoring tools.

Behavioral Questions

  • Describe a major system outage you resolved.
  • Explain your approach to incident postmortems.
  • Share examples of teamwork during system failures.
  • Detail how you prioritize in high-pressure situations.
