AI Software Engineer
Zoom

Job Description
About the AI Software Engineer Role at Zoom
The AI Infra team at Zoom focuses on building world-class inference infrastructure for all of Zoom’s AI services. This includes delivering high efficiency, scalability, and cost optimization for a diverse range of AI applications such as large language models (LLMs), vision-language models (VLMs), automatic speech recognition (ASR), and machine translation (MT). The team emphasizes seamless collaboration between small and large models, ensuring cost-effective, privacy-preserving, and high-quality AI services for customers.
The AI Infra Team
As an AI Software Engineer on Zoom’s AI Infra team, you will be instrumental in designing, optimizing, and scaling the runtimes and services that power Zoom’s AI models. Your contributions will directly enhance efficiency, reduce latency, and lower costs across Zoom’s entire AI stack, ultimately delivering reliable, high-performance AI experiences to millions of users globally.
Key Responsibilities
- Develop and optimize AI runtimes for LLMs, ASR, and MT systems, focusing on performance and cost efficiency.
- Apply GPU-level optimization techniques including CUDA, kernel fusion, and memory throughput improvements.
- Implement inference optimizations such as torch.compile, graph optimization, KV caching, and continuous batching.
- Build scalable, highly available infrastructure services to support enterprise-grade AI workloads.
- Optimize models for edge devices (laptops, PCs, mobile devices) and large-scale cloud deployments.
- Continuously improve latency, throughput, and efficiency across serving pipelines.
- Rapidly integrate and optimize new industry models to maintain a leading edge in AI infrastructure.
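To give a feel for the inference optimizations the responsibilities mention, here is a toy single-head attention decoder illustrating a KV cache. This is a minimal illustrative sketch, not Zoom's implementation: the names (`KVCache`, `step`) are invented for the example. The idea is that without a cache, each new token recomputes keys and values for the entire prefix; with a cache, each decode step appends one key/value pair and reuses the rest.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

class KVCache:
    """Toy KV cache for a single attention head (illustrative only)."""

    def __init__(self):
        self.keys = []    # one key vector per decoded position
        self.values = []  # one value vector per decoded position

    def step(self, q, k, v):
        """Append this position's k/v, then attend q over all cached pairs."""
        self.keys.append(k)
        self.values.append(v)
        # Scaled dot-product scores against every cached key.
        scores = [dot(q, ki) / math.sqrt(len(q)) for ki in self.keys]
        # Numerically stable softmax over the scores.
        m = max(scores)
        weights = [math.exp(s - m) for s in scores]
        z = sum(weights)
        dim = len(v)
        # Weighted sum of cached values.
        return [sum(w * vi[d] for w, vi in zip(weights, self.values)) / z
                for d in range(dim)]

cache = KVCache()
out1 = cache.step(q=[1.0, 0.0], k=[1.0, 0.0], v=[2.0, 0.0])
out2 = cache.step(q=[1.0, 0.0], k=[0.0, 1.0], v=[0.0, 2.0])
print(len(cache.keys))  # 2 cached positions after two decode steps
```

The payoff in a real serving stack is that per-token work stays roughly constant instead of growing with sequence length, which is central to the latency and throughput goals listed above.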
What We’re Looking For
- A proven track record of building scalable, reliable AI infrastructure under real-world production constraints.
- Strong expertise in GPU programming and optimization, including CUDA and kernel-level development.
- Deep experience with transformer-based models and inference frameworks (e.g., vLLM, TensorRT-LLM, SGLang, ONNX Runtime).
- Proficiency in programming languages like Python and C++ (Java is a plus).
- Hands-on experience with PyTorch (including torch.compile and graph-level optimization) and/or TensorFlow.
- Solid understanding of low-level hardware concepts, such as GPU memory hierarchy, caching, and vectorization.
- Familiarity with major cloud platforms (AWS, GCP, Azure) and AI deployment tools (Docker, Kubernetes, MLflow).
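Several of the items above (vLLM, continuous batching) revolve around keeping the accelerator saturated while requests of different lengths arrive and finish at different times. The toy scheduler below is a purely illustrative sketch of the continuous-batching idea; real engines such as vLLM also manage KV-cache memory, preemption, and prefill scheduling. Here, finished sequences free their batch slot immediately and waiting requests join mid-flight, instead of the whole batch draining before new work is admitted.

```python
from collections import deque

class Request:
    """A decode request needing a fixed number of output tokens (toy model)."""

    def __init__(self, rid, tokens_needed):
        self.rid = rid
        self.remaining = tokens_needed

def run(requests, max_batch):
    """Continuous batching: refill free slots every step, retire finished requests."""
    waiting = deque(requests)
    active, finished_order, steps = [], [], 0
    while waiting or active:
        # Admit new requests into free batch slots before each forward pass.
        while waiting and len(active) < max_batch:
            active.append(waiting.popleft())
        steps += 1  # one forward pass decodes one token per active request
        for r in active:
            r.remaining -= 1
        done = [r for r in active if r.remaining == 0]
        active = [r for r in active if r.remaining > 0]
        finished_order.extend(r.rid for r in done)
    return finished_order, steps

# "a" finishes early and "c" slots into the freed space while "b" keeps decoding.
order, steps = run([Request("a", 2), Request("b", 5), Request("c", 1)], max_batch=2)
print(order, steps)
```

With static batching, the same workload would hold "c" back until the first batch fully drained; continuous batching finishes the whole set in fewer forward passes, which is exactly the throughput-per-GPU argument behind the frameworks named above.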
Compensation and Benefits
The salary range for this position is between $143,000.00 and $312,800.00. Zoom offers a Total Direct Compensation philosophy that includes base salary, bonus, and equity value, with starting pay commensurate with qualifications and experience. The company also maintains a location-based compensation structure.
Zoom's benefits program is designed to support employees' physical, mental, emotional, and financial health, promote work-life balance, and enable community contributions, reflecting an award-winning workplace culture.
Working Environment
Zoom operates with a structured hybrid approach, integrating office-based work with remote environments. The specific work style for each role (Hybrid, Remote, or In-Person) is specified in individual job descriptions.
About Zoom
Zoom enables global connectivity and productivity through its leading collaboration platform, offering products like Zoom Contact Center, Zoom Phone, Zoom Events, and Zoom Rooms. The company thrives on problem-solving, rapid innovation, and designing user-centric solutions. Employees are encouraged to grow their skills and advance their careers within a collaborative, growth-focused environment.
Commitment to Diversity & Inclusion
Zoom is dedicated to fair hiring practices, evaluating candidates based on skills, experience, and potential. The company welcomes individuals from diverse backgrounds and provides accommodations for candidates with disabilities during the hiring process, ensuring an equitable experience for all applicants.
Key skills/competency
- AI Infrastructure
- GPU Optimization
- CUDA
- Machine Learning Inference
- LLM Optimization
- Scalability
- Distributed Systems
- Python/C++
- PyTorch/TensorFlow
- Cloud Platforms (AWS, GCP, Azure)
How to Get Hired at Zoom
- Research Zoom's AI Vision: Study Zoom's commitment to AI services, infrastructure, and their impact on collaboration products.
- Tailor your Resume for AI Infra: Highlight experience in GPU optimization, transformer models, PyTorch/TensorFlow, and cloud AI deployments.
- Showcase Production ML Expertise: Emphasize projects demonstrating scalable, reliable AI infrastructure under real-world constraints.
- Prepare for Technical Depth: Be ready to discuss CUDA, kernel optimization, inference frameworks (vLLM, TensorRT-LLM), and low-level hardware.
- Demonstrate Problem-Solving Skills: Articulate how you’ve tackled latency, throughput, and cost efficiency challenges in AI systems.