Site Reliability Engineer Distributed Systems
@ Workday

Chennai, Tamil Nadu, India
$150,000
On Site
Other
Posted 2 days ago

Job Details

About Workday

Your work days are brighter here. As a Fortune 500 company and a leading AI platform for managing people, money, and agents, Workday is shaping the future of work with integrity, empathy, and shared enthusiasm. Join a team where hard work pays off, and meaningful work with supportive colleagues is the norm.

About the Team

The Data Platform and Observability team spans Pleasanton, CA; Boston, MA; Atlanta, GA; Dublin, Ireland; and Chennai, India. It develops large-scale distributed data systems that support Workday products, including core HCM, Fins, AI/ML, and internal data products, delivering real-time insights and processing hundreds of terabytes of data.

About the Role

The Messaging, Streaming and Caching team is a full-service Distributed Systems Engineering team dedicated to designing and providing async messaging, streaming, and NoSQL platforms. As a Site Reliability Engineer Distributed Systems, your responsibilities will include:

  • Designing, building, and enhancing distributed services such as Kafka, Redis, and RabbitMQ.
  • Developing and maintaining core distributed software for streaming, messaging, and caching.
  • Creating observability modules, alerts, and automation for dashboard lifecycle management.
  • Deploying, operating, and tuning infrastructure components in production environments.
  • Evaluating and implementing open-source and cloud-native tools across Kubernetes, OpenStack, and bare-metal deployments.
  • Participating in on-call rotations and managing distributed services in AWS, GCP, and private cloud environments.

Required Qualifications

Applicants should have 4-8 years of software engineering experience (Java/Scala, Golang), 3+ years in distributed systems, and 3+ years in designing and operating large-scale deployments, with at least 1 year leading NoSQL-related product development.

Preferred Qualifications

Preferred candidates bring expertise in distributed systems software, performance analysis, and optimization, along with hands-on experience with Kafka, RabbitMQ, Redis, and Cassandra. Experience with CI/CD tools, Agile methodologies, configuration management using Chef, Kubernetes deployments via Helm and ArgoCD, and Linux system internals is also desired.

Work Arrangement & Culture

Workday offers Flex Work, combining in-person and remote work. Employees are expected to spend at least 50% of their time in the office or in the field each quarter, ensuring both flexibility and community connection. Inclusion, belonging, and equity (VIBE™) are at the core of Workday's values.

Key Skills/Competencies

  • Distributed Systems
  • Messaging
  • Streaming
  • Caching
  • DevOps
  • Scalability
  • Observability
  • Cloud Computing
  • Automation
  • Performance Optimization

How to Get Hired at Workday

🎯 Tips for Getting Hired

  • Customize your resume: Highlight relevant distributed systems and DevOps experience.
  • Research Workday culture: Learn about its values and mission.
  • Prepare technical examples: Emphasize hands-on experience with Kafka, Redis, and cloud platforms.
  • Practice behavioral questions: Showcase teamwork and problem-solving skills.
  • Network with current employees: Use LinkedIn to gather insights.

📝 Interview Preparation Advice

Technical Preparation

  • Review Kafka and Redis configurations.
  • Practice cloud platform deployment using AWS and GCP.
  • Revisit Kubernetes and Helm deployment techniques.
  • Refresh CI/CD pipeline setup and automation.

Behavioral Questions

  • Explain your teamwork experience in distributed projects.
  • Describe a challenging on-call incident you resolved.
  • Discuss how you managed project timeline stress.
  • Share an experience adapting to change quickly.

Frequently Asked Questions