8 days ago

Cloud Ops Engineer

Motive

On Site
Full Time
$135,000
Lahore, Punjab, Pakistan

Job Overview

Job Title: Cloud Ops Engineer
Job Type: Full Time
Category: Commerce
Experience: 5 Years
Degree: Master
Offered Salary: $135,000
Location: Lahore, Punjab, Pakistan

Job Description

About Motive

Motive empowers the people who run physical operations with tools to make their work safer, more productive, and more profitable. For the first time ever, safety, operations and finance teams can manage their drivers, vehicles, equipment, and fleet related spend in a single system. Combined with industry leading AI, the Motive platform gives you complete visibility and control, and significantly reduces manual workloads by automating and simplifying tasks.

Motive serves nearly 100,000 customers – from Fortune 500 enterprises to small businesses – across a wide range of industries, including transportation and logistics, construction, energy, field service, manufacturing, agriculture, food and beverage, retail, and the public sector.

Visit gomotive.com to learn more.

About The Role

As a Cloud Ops Engineer in the Platform Engineering organization, you will be a core member of the team responsible for the operational health, reliability, and observability of the entire Motive platform. Your mission is to ensure our globally distributed, highly available systems recover from issues quickly and to help prevent future ones. You will spearhead incident response management, build world-class monitoring systems, and drive automation across all operational workflows. This is a critical role that improves the daily lives of engineers and ensures a consistent, high-quality experience for all our customers across our complex tech stacks, from our core SaaS product to our mobile and AI-powered embedded systems. If you are passionate about high-leverage work, automation, continuously improving critical systems and processes, and being involved across all areas of engineering, and you can be a calm, steady presence in the middle of incidents, this is the perfect role for you.

What You'll Do

  • Own and refine the incident management lifecycle: act as incident commander, run communication and triage, and lead post-incident analysis and follow-ups to drive continuous service improvement.
  • Manage the central on-call solution and its integrations with monitoring and other platforms, used by over 100 teams, leveraging automation and self-serve tools such as Terraform.
  • Analyze operational statistics (MTTR, incident frequency, service-level data) to identify trends and prioritize reliability initiatives and teams’ focus.
  • Improve change management processes and automation to reduce both risk and friction.
  • Collaborate with engineering teams across the organization to standardize operational practices and develop automated workflows.
  • Leverage AI for incident analysis, alert and issue resolution, and automation.

What We're Looking For

We are looking for an individual with prior experience in cloud operations, site reliability engineering, or a similar field, who has a passion for improving system reliability and operational processes in a large-scale, distributed environment.

  • Experience managing and participating in a 24/7 on-call rotation and incident response process.
  • Experience with on-call systems such as Rootly, PagerDuty, or Opsgenie.
  • Experience with monitoring and observability tools (e.g., Datadog, New Relic, Grafana).
  • Ability to communicate clearly and to manage incidents, action items, and communications with stakeholders ranging from engineers to directors, as well as public-facing messaging.
  • Experience with IT Service Management tools (Jira/JSM) for ticket and change management.
  • 3+ years of experience in an incident response role.

Bonus Skills To Have

  • Experience with Infrastructure as Code (IaC) tools such as Terraform.
  • Scripting and automation skills in at least one modern language (Python, Go, Bash); AI coding assistance is welcome.
  • Prior experience in an Ops or SRE team supporting a diverse cloud product.

Creating a diverse and inclusive workplace is one of Motive's core values. We are an equal opportunity employer and welcome people of different backgrounds, experiences, abilities and perspectives.

Please review our Candidate Privacy Notice here.

UK Candidate Privacy Notice here.

The applicant must be authorized to receive and access those commodities and technologies controlled under U.S. Export Administration Regulations. It is Motive's policy to require that employees be authorized to receive access to Motive products and technology.

Key skills/competency

  • Cloud Operations
  • Site Reliability Engineering
  • Incident Management
  • Monitoring
  • Observability
  • Automation
  • Terraform
  • Python
  • Go
  • Distributed Systems
  • SaaS Operations

Tags:

Cloud Operations Engineer
Incident management
Site reliability
Observability
Automation
Distributed systems
On-call
Operational metrics
Change management
Collaboration
AI leverage
Datadog
New Relic
Grafana
Rootly
PagerDuty
Opsgenie
Jira
JSM
Terraform
Python
Go
Bash

How to Get Hired at Motive

  • Research Motive's culture: Study their mission, values, recent news, and employee testimonials on LinkedIn and Glassdoor.
  • Tailor your resume: Highlight the cloud operations, SRE, and incident response experience most relevant to this role at Motive.
  • Showcase technical expertise: Emphasize proficiency in monitoring, automation, IaC, and scripting.
  • Prepare for incident scenarios: Practice explaining your approach to managing critical system outages.
  • Demonstrate communication skills: Be ready to discuss stakeholder management during incidents.
