3 days ago

Senior Service Reliability Engineer

Sony Interactive Entertainment

On Site
Full Time
$220,000
Adelaide, South Australia, Australia

Job Overview

Job Title: Senior Service Reliability Engineer
Job Type: Full Time
Category: Commerce
Experience: 5 Years
Degree: Master
Offered Salary: $220,000
Location: Adelaide, South Australia, Australia

Job Description

Why PlayStation?

PlayStation isn’t just the Best Place to Play; it’s also the Best Place to Work. Today, we’re recognized as a global leader in entertainment, producing the PlayStation family of products and services including PlayStation®5, PlayStation®4, PlayStation®VR, PlayStation®Plus, acclaimed PlayStation software titles from PlayStation Studios, and more.

PlayStation also strives to create an inclusive environment that empowers employees and embraces diversity. We welcome and encourage everyone who has a passion and curiosity for innovation, technology, and play to explore our open positions and join our growing global team.

The PlayStation brand falls under Sony Interactive Entertainment, a wholly-owned subsidiary of Sony Group Corporation.

Senior Service Reliability Engineer at Sony Interactive Entertainment

As a part of Sony Interactive Entertainment, the Future Technology Group (FTG) is leading the cloud gaming revolution, putting console-quality video games on any device, from TVs to consoles to mobile devices and beyond.

Our Site Reliability Engineering team plays a significant role in delivering on the promise of a great cloud gaming experience to our customers. We do this by influencing design and operational decisions towards the overall stability of the gaming service. Our SREs focus on three main things: overall ownership of production, production code quality, and deployments. The successful candidate will be self-directed and able to participate in the way we make decisions at different levels.

We expect our SREs to have opinions on the state of our service and provide critical feedback during different phases of the operational lifecycle. We are engaged throughout the software development lifecycle, ensuring operational readiness and stability.

Requirements

  • Minimum of 7 years of working experience in a Software Development and/or Linux Systems Administration role.
  • Strong interpersonal, written, and verbal communication skills.
  • Available to be scheduled in an on-call rotation.

Skills & Knowledge

Proficient as a Linux Production Systems Engineer, with experience managing large-scale Web Services infrastructure. Development experience in one or more of the following programming languages: Python (preferred), Bash, Go, Java, C++, or Rust. In addition, experience with at least 3 of the following topics:

  • Distributed data storage at scale (Hadoop, Ceph)
  • NoSQL at scale (MongoDB, Redis, Cassandra)
  • Data aggregation technologies (Elasticsearch, Kafka)
  • Scaling and running traditional RDBMS (PostgreSQL, MySQL) with High Availability
  • Monitoring & alerting (Prometheus, Grafana), and Incident Management toolsets
  • Kubernetes and/or AWS (deployment and management)
  • Software distribution (Package management and distribution at scale)
  • Configuration management (Ansible, SaltStack, Puppet, Chef)
  • Software performance analysis and load testing (QA or SDET experience: a plus)

Responsibilities

  • Take a leadership role in ongoing improvements in Reliability and Scalability
  • Work closely with SRE Management to define KPIs and processes and to drive continuous improvement
  • Influence the architecture and implementation of solutions within the division
  • Mentor more junior SRE staff and enable them for success
  • Act as a voice to represent SRE in the wider organisation
  • Represent the operational scalability of solutions in the wider division
  • Lead small-scale projects from inception to implementation
  • Design platform-wide solutions and provide technical leadership during their implementation
  • Demonstrate a high level of organizational skill and initiative in the role

Key skills/competency

  • Linux Systems Administration
  • Python Programming
  • Cloud Infrastructure (AWS/Kubernetes)
  • Distributed Systems
  • Database Management (NoSQL/RDBMS)
  • Monitoring & Alerting (Prometheus/Grafana)
  • Configuration Management (Ansible/SaltStack)
  • Reliability Engineering
  • Scalability
  • Incident Management

Tags:

Service Reliability Engineer
SRE
Reliability
Scalability
Operations
Architecture
Mentorship
Leadership
Deployment
Incident Management
Performance
Process Improvement
Linux
Python
AWS
Kubernetes
Hadoop
MongoDB
PostgreSQL
Prometheus
Ansible
Kafka

How to Get Hired at Sony Interactive Entertainment

  • Research Sony Interactive Entertainment's culture: Study their mission, values, recent news, and employee testimonials on LinkedIn and Glassdoor.
  • Tailor your resume for SRE excellence: Highlight extensive experience in Linux, distributed systems, cloud platforms, and specific programming languages like Python.
  • Prepare for in-depth technical discussions: Focus on system design, scalability challenges, monitoring strategies, and your expertise with Kubernetes, AWS, and databases.
  • Showcase problem-solving and leadership: Be ready to share examples of how you've improved reliability, led projects, and mentored junior engineers effectively.
  • Demonstrate passion for gaming and technology: Connect your technical skills to a genuine interest in PlayStation's mission and the future of cloud gaming.
