6 hours ago

SRE - Devops

Grid Dynamics

On Site
Full Time
$180,000
Bengaluru, Karnataka, India

Job Overview

Job Title: SRE - Devops
Job Type: Full Time
Offered Salary: $180,000
Location: Bengaluru, Karnataka, India

Job Description

SRE - Devops at Grid Dynamics

We are looking for a skilled Site Reliability Engineer (SRE) who can ensure high availability, reliability, and performance of production systems. The ideal candidate will combine software engineering, DevOps, infrastructure automation, and monitoring skills to improve system uptime, reduce operational toil, and enhance overall platform efficiency.

Responsibilities

  • Ensure reliability, scalability, and performance of production systems and services.
  • Implement automation to reduce manual operations (toil).
  • Build and maintain CI/CD pipelines, deployment automation, and release processes.
  • Develop tools, scripts, and dashboards to improve operational visibility.
  • Manage and optimize Kubernetes clusters, Docker containers, and cloud infrastructure.
  • Implement infrastructure as code (IaC) using Terraform, Ansible, or CloudFormation.
  • Monitor systems using Prometheus, Grafana, ELK, Datadog, CloudWatch, etc.
  • Conduct incident management and root cause analysis (RCA), and ensure preventive actions are implemented.
  • Design and maintain reliable, fault-tolerant, and self-healing architectures.
  • Work on performance tuning, capacity planning, and cost optimization.
  • Establish and track SLIs, SLOs, and SLAs for services.
  • Collaborate with development, security, and DevOps teams to enhance platform stability.

Requirements

  • Strong experience with Linux systems, networking, and troubleshooting.
  • Hands-on expertise with AWS/Azure/GCP cloud services.
  • Proficiency in Kubernetes, Docker, and container orchestration.
  • Solid understanding of CI/CD tools (Jenkins, GitLab CI, GitHub Actions).
  • Experience with monitoring & observability tools (Prometheus, Grafana, ELK, Datadog).
  • Strong scripting skills in Python, Go, or shell (Bash).
  • Knowledge of infrastructure as code (Terraform, CloudFormation, Ansible).
  • Understanding of SRE concepts: SLOs, SLIs, error budgets, reliability engineering, fault tolerance.
  • Experience with incident response, postmortem processes, and automation.

We offer

  • Opportunity to work on bleeding-edge projects
  • Work with a highly motivated and dedicated team
  • Competitive salary
  • Flexible schedule
  • Benefits package - medical insurance, sports
  • Corporate social events
  • Professional development opportunities
  • Well-equipped office

About Us

Grid Dynamics (NASDAQ: GDYN) is a leading provider of technology consulting, platform and product engineering, AI, and advanced analytics services. Fusing technical vision with business acumen, we solve the most pressing technical challenges and enable positive business outcomes for enterprise companies undergoing business transformation. A key differentiator for Grid Dynamics is our 8 years of experience and leadership in enterprise AI, supported by profound expertise and ongoing investment in data, analytics, cloud & DevOps, application modernization and customer experience. Founded in 2006, Grid Dynamics is headquartered in Silicon Valley with offices across the Americas, Europe, and India.

Key skills/competency

  • Site Reliability Engineering (SRE)
  • DevOps
  • Kubernetes
  • Cloud Infrastructure (AWS/Azure/GCP)
  • CI/CD
  • Automation
  • Linux Systems
  • Monitoring & Observability
  • Terraform
  • Python

Tags:

Site Reliability Engineer
Reliability
Scalability
Automation
CI/CD
Monitoring
Incident Management
IaC
Performance
Collaboration
Troubleshooting
Kubernetes
Docker
AWS
Azure
GCP
Prometheus
Grafana
Terraform
Python
Jenkins

How to Get Hired at Grid Dynamics

  • Research Grid Dynamics' mission: Study their AI leadership, cloud, and DevOps expertise to align your application.
  • Tailor your SRE - Devops resume: Highlight extensive experience with Kubernetes, cloud services (AWS/Azure/GCP), and automation tools.
  • Showcase problem-solving skills: Prepare specific examples of incident management, root cause analysis, and preventive actions.
  • Demonstrate SRE principles: Be ready to discuss SLOs, SLIs, error budgets, and your approach to reliability engineering.
  • Network with Grid Dynamics professionals: Connect on LinkedIn to gain insights and express your interest in their innovative projects.
