9 days ago

Site Reliability Engineer

DevRev

On Site
Full Time
$165,000
Chennai, Tamil Nadu, India

Job Overview

Job Title: Site Reliability Engineer
Job Type: Full Time
Category: Commerce
Experience: 5 Years
Degree: Master
Offered Salary: $165,000
Location: Chennai, Tamil Nadu, India

Job Description

About DevRev

At DevRev, we’re building the future of work with Computer – your AI teammate. Computer is not just another tool. It’s built on the belief that the future of work should be about genuine human connection and collaboration – not piling on more apps.

Computer is the best kind of teammate: it amplifies your strengths, takes repetition and frustration out of your day, and gives you more time and energy to do your best work.

How?

Easy: it’s the only platform capable of…

Complete Data Unification

Most AI products focus on either structured data (like CRM records and support tickets), or unstructured data (like documents and emails). Computer AirSync connects everything, unifying all your data sources (like Google Workspace, Jira, Notion) into one AI-ready source of truth: Computer Memory.

Powerful Search, Reasoning, and Action

Once connected to all your tools and apps, Computer is embedded in your full business context. It can find and summarize, sure. Even more impressive: it offers employees insights, strategic and proactive suggestions, plus powerful agentic actions.

Extensions for Your Teams and Customers

Computer doesn’t make you choose between new software and old. Its AI-native platform lets you extend existing tools with sophisticated apps and agents. So your teams – and your customers – can take action, seamlessly. These agents work alongside you: updating workflows, coordinating across teams, and syncing back to your systems.

This isn’t just software. Computer brings people back together, breaking down silos and ushering in the future of teamwork, through human-AI collaboration. Stop managing software. Stop wasting time. Start solving bigger problems, building better products, and making your customers happier.

We call this Team Intelligence. It’s why DevRev exists.

Trusted by global companies across multiple industries, DevRev is backed by Khosla Ventures and Mayfield, with $150M+ raised. We are 650+ people, across eight global offices.

About The Role

We are seeking an experienced Site Reliability Engineer / Platform Engineer to join our team and help build and maintain a resilient, scalable infrastructure supporting our applications across multiple cloud providers. In this role, you will design and implement infrastructure solutions, automate operational processes, and work closely with development teams to ensure reliable, efficient systems that scale with our business.

What You'll Do

  • Design, build, and maintain infrastructure across AWS, GCP, and Azure using Infrastructure as Code (IaC) principles.
  • Implement and optimize CI/CD pipelines using tools like Argo and CircleCI to enable rapid, reliable deployments.
  • Manage and scale Kubernetes clusters in production environments, ensuring high availability and optimal resource utilization.
  • Administer and optimize cloud databases including MongoDB, Redis, RDS, and other data stores for performance and reliability.
  • Develop monitoring, alerting, and observability solutions to identify and resolve issues before they impact users.
  • Automate routine operational tasks to reduce manual toil and improve system reliability.
  • Conduct incident response and post-mortem analysis to drive continuous improvement.
  • Collaborate with development teams to design systems with reliability, scalability, and operational excellence in mind.
  • Document infrastructure architecture, runbooks, and operational procedures.
  • Evaluate and implement new tools and technologies to improve platform capabilities.

What You'll Bring

  • 3+ years of experience in Site Reliability Engineering, DevOps, or Platform Engineering.
  • Strong hands-on experience with at least two major cloud providers (AWS, GCP, Azure).
  • Proficiency with Kubernetes for container orchestration and management.
  • Demonstrated expertise with IaC tools (Terraform, CloudFormation, Pulumi, or similar).
  • Experience with CI/CD platforms, particularly Argo and/or CircleCI.
  • Solid understanding of database technologies including MongoDB, Redis, and relational databases (RDS).
  • Proficiency in at least one programming or scripting language (Python, Go, Bash, TypeScript, etc.).
  • Experience with monitoring and observability tools (e.g., Prometheus, Grafana, ELK, CloudWatch).
  • Experience implementing and managing OpenTelemetry (OTEL) for distributed tracing, metrics, and logging.
  • Strong understanding of networking, security, and infrastructure best practices.

Nice to Have

  • Experience managing multi-cloud or hybrid cloud environments.
  • Familiarity with service mesh technologies (Istio, Linkerd).
  • Knowledge of security hardening and compliance in cloud environments.
  • Experience with cost optimization in cloud infrastructure.
  • Contributions to open-source infrastructure or DevOps projects.
  • Certifications from major cloud providers.

Culture

The foundation of DevRev is its culture -- our commitment to those who are hungry, humble, honest, and who act with heart. Our vision is to help build the earth’s most customer-centric companies. Our mission is to leverage design, data engineering, and machine intelligence to empower engineers to embrace their customers.

That is DevRev!

Key skills/competency

  • Cloud Infrastructure (AWS, GCP, Azure)
  • Kubernetes
  • Infrastructure as Code (IaC)
  • CI/CD (Argo, CircleCI)
  • Database Management (MongoDB, Redis, RDS)
  • Monitoring & Observability (Prometheus, Grafana, OpenTelemetry)
  • Automation
  • System Reliability
  • Networking & Security
  • Incident Response

Tags:

Site Reliability Engineer
infrastructure design
system reliability
site reliability
operations automation
incident response
post-mortem analysis
collaboration
technology evaluation
security best practices
AWS
GCP
Azure
Kubernetes
Terraform
Argo
CircleCI
MongoDB
Redis
RDS
Prometheus
Grafana
OpenTelemetry
Python
Go
Bash
TypeScript

How to Get Hired at DevRev

  • Research DevRev's culture: Study their mission, values (hungry, humble, honest, heart), recent news, and employee testimonials on LinkedIn and Glassdoor.
  • Tailor your resume: Customize your resume to highlight experience in multi-cloud (AWS, GCP, Azure), Kubernetes, IaC (Terraform), and CI/CD (Argo, CircleCI) relevant to DevRev's Site Reliability Engineer needs.
  • Showcase problem-solving: Prepare examples demonstrating your ability to design scalable infrastructure, automate tasks, conduct incident response, and collaborate effectively.
  • Understand their tech stack: Familiarize yourself with DevRev's core technologies like MongoDB, Redis, RDS, Prometheus, Grafana, ELK, and OpenTelemetry.
  • Highlight AI teammate alignment: Emphasize how your Site Reliability Engineer skills support building reliable systems for an AI-centric platform like DevRev's "Computer."
