
Job Description
About Vultr
Vultr is on a mission to make high-performance cloud infrastructure easy to use, affordable, and locally accessible for enterprises and AI innovators around the world. With 32 global cloud data center locations, Vultr is trusted by hundreds of thousands of active customers across 185 countries for its flexible, scalable, global Cloud Compute, Cloud GPU, Bare Metal, and Cloud Storage solutions. In December 2024, Vultr announced an equity financing at a $3.5 billion valuation. Founded by David Aninowsky and self-funded for over a decade, Vultr has grown to become the world's largest privately held cloud infrastructure company.
Vultr Cares
- 100% company-paid insurance premiums for employee medical, dental and vision plans.
- 401(k) plan that matches 100% up to 4%, with immediate vesting
- Professional Development Reimbursement of $2,500 each year
- 11 Holidays + Paid Time Off Accrual + Rollover Plan
- Commitment matters to Vultr! Increased PTO at 3-year and 10-year anniversaries, a 1-month paid sabbatical every 5 years, and an anniversary bonus each year
- $500 stipend for remote office setup in first year + $400 each following year
- Internet reimbursement up to $75 per month
- Gym membership reimbursement up to $50 per month
- Company paid Wellable subscription
Join Vultr as a Strategic Technical Account Manager, GPU
The GPU-focused Technical Account Manager (TAM) leads the post-sales technical success of customers deploying large-scale AI, training, inference, and high-performance GPU workloads on the company’s platform. This includes customers using NVIDIA GPU clusters, AMD GPU clusters, GPU VMs, and rack-scale bare-metal environments.
You will act as a trusted advisor across LLM training, fine-tuning, RAG workloads, distributed training frameworks, storage throughput requirements, multi-GPU scaling, and performance tuning. This role requires deep technical fluency and exceptional customer management skills to help AI/ML teams achieve predictable, cost-efficient, high-performance outcomes.
Key Responsibilities
AI/GPU Onboarding & Workload Architecture
- Lead onboarding for customers deploying GPU clusters (bare metal, VMs, or hybrid).
- Advise on cluster design: multi-GPU topology, NVLink/NVSwitch considerations, RDMA fabrics (InfiniBand and RoCE Ethernet), networking throughput, and storage IOPS requirements.
- Guide customers in selecting GPU types and configurations based on workload (training, fine-tuning, inference, embeddings, RAG pipelines).
- Support distributed frameworks: PyTorch, TensorFlow, DeepSpeed, Megatron, JAX, Ray, Mosaic, Hugging Face, etc.
- Apply advanced hands-on Kubernetes and SLURM skills across customer orchestration and job-scheduling environments.
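As an illustration of the SLURM-side knowledge this involves, the sketch below maps the environment variables SLURM sets for each task (`SLURM_PROCID`, `SLURM_NTASKS`, `SLURM_LOCALID`) to the rank, world size, and local rank that a distributed PyTorch launcher consumes; the helper itself is hypothetical, not Vultr tooling.

```python
import os

def slurm_dist_env(env=os.environ):
    """Map SLURM-provided variables to the values torch.distributed
    expects (rank, world size, local rank). Illustrative helper only."""
    rank = int(env["SLURM_PROCID"])         # global rank of this task
    world_size = int(env["SLURM_NTASKS"])   # total tasks across all nodes
    local_rank = int(env["SLURM_LOCALID"])  # rank within this node -> GPU index
    return {"rank": rank, "world_size": world_size, "local_rank": local_rank}

# Example: task 5 of an 8-task job spread over 2 nodes with 4 GPUs each
example = {"SLURM_PROCID": "5", "SLURM_NTASKS": "8", "SLURM_LOCALID": "1"}
print(slurm_dist_env(example))
# → {'rank': 5, 'world_size': 8, 'local_rank': 1}
```

In practice these values would be passed to `torch.distributed.init_process_group` and used to pin each process to its local GPU.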
Performance Optimization & Scaling
- Identify bottlenecks (network, storage, memory bandwidth).
- Provide tuning recommendations for batch size, mixed precision, parallelization strategies, and checkpointing.
- Help customers evaluate cost vs. performance tradeoffs (GPU mix, CPU pairing, instance types, cluster sizing).
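The cost-vs-performance tradeoff in the last bullet can be made concrete with a back-of-the-envelope model; all rates, runtimes, and efficiencies below are hypothetical placeholders, not Vultr pricing.

```python
def cost_per_run(gpus, hourly_rate, hours_single_gpu, scaling_efficiency):
    """Estimate the cost of one training run on a multi-GPU cluster.
    Runtime shrinks with GPU count, discounted by scaling efficiency
    (1.0 = perfect linear scaling). All inputs are hypothetical."""
    runtime_hours = hours_single_gpu / (gpus * scaling_efficiency)
    return gpus * hourly_rate * runtime_hours

# Hypothetical comparison: 8 GPUs at 90% efficiency vs 32 GPUs at 70%
small = cost_per_run(gpus=8, hourly_rate=2.5, hours_single_gpu=400, scaling_efficiency=0.90)
large = cost_per_run(gpus=32, hourly_rate=2.5, hours_single_gpu=400, scaling_efficiency=0.70)
print(round(small, 2), round(large, 2))
# → 1111.11 1428.57
```

Note that under this simple model the GPU count cancels out: total cost depends only on the hourly rate and scaling efficiency, so adding GPUs buys wall-clock time, while falling efficiency is what raises the bill.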
Technical Relationship Ownership
- Own the long-term technical strategy across assigned GPU/AI accounts, including hyperscalers, labs, and high-growth AI startups.
- Host recurring technical review meetings, roadmap reviews, and optimization sessions.
- Define scaling plans, future GPU reservation needs, and capacity forecasting.
- Partner with Support, SRE, Networking, NOC, and Product Management & Engineering to resolve high-urgency incidents.
- Manage outage communications, corrective action plans, and postmortem reviews with customers.
- Advocate for GPU reliability improvements and influence roadmap priorities.
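The capacity forecasting mentioned above can start from something as simple as compound growth on the current footprint; the growth rate here is a placeholder a TAM would replace with the customer's actual roadmap.

```python
import math

def gpu_forecast(current_gpus, quarterly_growth, quarters):
    """Project future GPU reservation needs under steady compound growth.
    Rounds each quarter up, since reservations are whole GPUs."""
    needs = []
    gpus = current_gpus
    for _ in range(quarters):
        gpus *= (1 + quarterly_growth)
        needs.append(math.ceil(gpus))
    return needs

# Hypothetical: 64 GPUs today, 25% quarterly growth, 4 quarters out
print(gpu_forecast(64, 0.25, 4))
# → [80, 100, 125, 157]
```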
Account Growth & Expansion
- Identify opportunities for expanded clusters, high-speed storage, or networking upgrades.
- Support Sales with technical validation and architecture diagrams needed for expansion.
Customer Advocacy & Product Feedback
- Provide structured feedback on existing and future GPU offerings, networking fabrics, storage platforms, and upcoming AI/ML platform features.
- Partner with Product on early access programs (new GPUs, pipelines, orchestration, etc.).
Qualifications
- 2–5+ years as an AI/ML Engineer, AI/ML Ops, Technical Account Manager, HPC Engineer, Sales/Solutions Engineer or relevant technical role.
- Strong knowledge of GPU hardware architectures (NVIDIA/AMD), CUDA/ROCm, distributed training, and ML frameworks.
- Experience with Linux tuning and networking (InfiniBand, RoCE Ethernet).
- Experience with high-performance storage systems (DDN, NetApp, Vast, Weka, etc.).
- Ability to communicate complex concepts clearly to both executives and engineering teams.
- Prior experience supporting hyperscale, AI labs, or large cluster deployments is a plus.
- Cloud Native Computing Foundation Certified Kubernetes Administrator (CKA) certification is a plus.
Compensation
$115,000 - $140,000
This salary can vary based on location, years of experience, background and skill set.
Key Skills & Competencies
- AI/ML
- GPU Workloads
- Cloud Infrastructure
- Technical Account Management
- Performance Optimization
- Distributed Training
- Kubernetes
- Linux Tuning
- Networking
- Customer Relationship Management
How to Get Hired at Vultr
- Tailor your resume: Highlight AI/ML, GPU, cloud, and customer management experience.
- Showcase technical skills: Emphasize knowledge of GPU hardware, distributed training, and frameworks.
- Demonstrate customer focus: Provide examples of advising clients and managing technical relationships.
- Prepare for technical questions: Review Vultr's GPU offerings and common AI/ML challenges.
- Research Vultr's mission: Understand their impact on AI innovation and cloud accessibility.