13 days ago

DevOps Infrastructure Engineer

BTSE

Hybrid
Full Time
$150,000

Job Overview

Job Title: DevOps Infrastructure Engineer
Job Type: Full Time
Offered Salary: $150,000
Location: Hybrid

Job Description

About BTSE

BTSE Group is a global leader in fintech and blockchain technology, anchored by three core business pillars: Exchange, Payments, and Infrastructure Development. Serving over 100 corporate clients worldwide, we provide white-label exchange and payment solutions. Our offerings encompass everything from exchange infrastructure hosting and development to custody, wallets, payments, blockchain integration, trading, and more. We are looking for talented professionals in marketing, operations, customer support, and other departments. The roles offered may be on-site, remote, or hybrid, in collaboration with our local partner.

About The Opportunity

You keep the platform running reliably. For the first client operating in crypto markets, this means 24/7 uptime with zero maintenance windows. You build multi-tenant Kubernetes infrastructure with per-tenant namespace isolation, manage GPU scheduling for AI model serving, set up CI/CD for rapid iteration, and own monitoring and on-call. You also automate tenant provisioning so that scaling from one client to ten is an operational exercise, not an engineering project.

Responsibilities

  • Set up a multi-tenant Kubernetes cluster: shared services namespace, per-tenant namespaces for isolated workloads, GPU node pools for model inference.
  • Build a CI/CD pipeline: source control → container build → automated deployment with zero-downtime rolling updates.
  • Configure GPU management: scheduling, resource quotas per tenant, device plugins.
  • Set up comprehensive monitoring: per-tenant metrics, SLA tracking, data pipeline health, GPU utilisation, API latency percentiles, WebSocket connection stability.
  • Implement backup and disaster recovery: cross-region replication, automated database backups.
  • Build tenant provisioning automation: scripted creation of new tenant namespaces, storage, network policies, and service accounts.
  • Harden security: network policies between namespaces, vulnerability scanning, audit logging.
  • Provide 24/7 on-call coverage during the initial pilot (rotating with the Tech Lead).

Requirements

  • 4+ years of DevOps/SRE experience, including Kubernetes cluster operations and multi-tenant patterns.
  • GPU workloads on Kubernetes (GPU Operator, device plugins, resource scheduling).
  • CI/CD pipelines: GitHub Actions, ArgoCD or FluxCD.
  • Terraform IaC.
  • On-call experience and incident management.

Nice to have

  • Kubernetes namespace isolation and network policies for multi-tenancy.
  • 24/7 systems experience (crypto, gaming, or global SaaS).
  • Monitoring WebSocket-heavy architectures and streaming data pipelines.
  • GPU cluster management for ML inference.

We may use artificial intelligence (AI) tools to support parts of the hiring process, such as reviewing applications, analyzing resumes, or assessing responses. These tools assist our recruitment team but do not replace human judgment. Final hiring decisions are ultimately made by humans. If you would like more information about how your data is processed, please contact us.

Key skills/competency

  • DevOps
  • Kubernetes
  • SRE
  • CI/CD
  • GPU Scheduling
  • Infrastructure as Code
  • Monitoring
  • On-call
  • Incident Management
  • Tenant Provisioning

Tags:

DevOps
Infrastructure Engineer
SRE
Kubernetes
CI/CD
GPU
Terraform
Cloud Computing
System Administration
Fintech

How to Get Hired at BTSE

  • Tailor your resume: Highlight your 4+ years of DevOps/SRE experience, focusing on Kubernetes, multi-tenant patterns, and GPU workloads.
  • Showcase CI/CD proficiency: Emphasize your experience with GitHub Actions, ArgoCD, or FluxCD, and Terraform IaC.
  • Demonstrate on-call readiness: Provide examples of your incident management and 24/7 systems experience.
  • Apply strategically: Clearly articulate how your skills align with building and maintaining reliable, scalable infrastructure for crypto markets.
