ML/AI Software Engineer - Triton GPU Kernel Optimization at AMD | Apply at AMD | Jobs near Belgrade

About the role

About AMD and the Role

At AMD, our mission is to build great products that accelerate next-generation computing experiences, from AI and data centers to PCs, gaming, and embedded systems. Grounded in a culture of innovation and collaboration, we believe real progress comes from bold ideas, human ingenuity, and a shared passion to create something extraordinary. When you join AMD, you’ll discover the real differentiator is our culture. We push the limits of innovation to solve the world’s most important challenges, striving for execution excellence while being direct, humble, collaborative, and inclusive of diverse perspectives. Join us as we shape the future of AI and beyond. Together, we advance your career.

We are looking for an experienced ML/AI software engineer with deep expertise in GPU kernel development and in building high‑performance primitives for training and inference. You’ll design, implement, and optimize custom Triton kernels for core ML workloads, integrate them into our frameworks and services, and drive end‑to‑end performance on the AMD ROCm platform across the AMD Radeon and Ryzen product families.

Key Responsibilities

Design, implement, and maintain Triton GPU kernels for state‑of‑the‑art ML workloads, with a focus on fusion, tiling, vectorization, and memory‑hierarchy optimization for AMD RDNA GPU architectures.
Analyze performance using profiling tools; identify bottlenecks in memory hierarchy, tensor core/matrix core utilization, warp/wavefront scheduling, and synchronization.
Optimize kernels across problem shapes, batch sizes, and sequence lengths; implement autotuning strategies and performance heuristics.
Integrate custom kernels with PyTorch (torch.compile, torch._inductor), JAX, or other frameworks; develop Python/C++ bindings and align with runtime/graph compilers.
Track and contribute to upstream Triton and related compiler ecosystems; propose enhancements aligned with our workloads.

Preferred Experience

Triton GPU kernel development and optimization experience
CUDA or HIP kernel development experience (porting, performance tuning, and feature enablement)
Experience optimizing kernels on AMD GPUs and familiarity with the ROCm software stack (HIP runtime, rocBLAS, MIOpen, rocWMMA, RCCL, etc.) is a plus
Familiarity with compiler internals and IRs (LLVM, MLIR) and codegen for GPUs
Strong background in linear algebra, convolution algorithms, attention mechanisms, or other core ML primitives
Experience integrating kernels with PyTorch, JAX, or TensorFlow, including custom ops/extensions
Knowledge of quantization (INT8/FP8), mixed precision, custom dtypes, and numerics (stability, error analysis)
Experience with LLM, diffusion, and MoE workloads: FlashAttention‑style kernels, paged attention, grouped‑query attention, rotary embeddings, fused MLPs
Contributions to open-source projects in Triton, ROCm, or related GPU/ML ecosystems are a plus

Academic Credentials

Bachelor’s, Master’s, or PhD in Computer Science, Electrical Engineering, or a related field.

Key skills/competencies

ML/AI Software Engineer
Triton GPU Kernel Optimization
C++ AI Development
GPU Programming
Performance Analysis
Compiler Internals
PyTorch Integration
AMD ROCm Platform
Linear Algebra
Large Language Models (LLM)

How to get hired

Tailor your resume: Highlight Triton, CUDA/HIP, C++, and ML kernel optimization experience. Quantify achievements with performance gains.
Showcase projects: Detail any contributions to open-source Triton, ROCm, or related ML/GPU ecosystems.
Prepare for technical interviews: Expect deep dives into GPU architecture, kernel optimization techniques, and ML algorithms.
Understand AMD's culture: Research AMD's focus on innovation, collaboration, and customer solutions.
Apply strategically: Clearly state how your skills align with the ML/AI Software Engineer role.

Frequently asked questions

What specific ML workloads are prioritized for Triton kernel optimization at AMD?

AMD prioritizes optimization for state-of-the-art ML workloads, including LLMs, diffusion models, and MoE workloads. This encompasses areas like FlashAttention-style kernels, paged attention, grouped-query attention, rotary embeddings, and fused MLPs, all aiming to leverage AMD's RDNA GPU architectures effectively.

How does AMD support contributions to open-source projects like Triton and ROCm for this ML/AI Software Engineer role?

AMD encourages and values contributions to open-source projects. For this ML/AI Software Engineer role, contributing to upstream Triton and related compiler ecosystems is expected, with opportunities to propose enhancements and actively participate in community development.

What is the expected level of experience with AMD's ROCm software stack for this position?

While experience optimizing kernels on AMD GPUs and familiarity with the ROCm software stack (HIP runtime, rocBLAS, MIOpen, rocWMMA, RCCL, etc.) is considered a plus, a strong foundation in GPU programming (Triton, CUDA, HIP) and ML workloads is essential. The role offers opportunities to deepen ROCm expertise.

Are there opportunities for career growth in AI and GPU computing at AMD for this role?

Yes, AMD emphasizes career advancement. Joining AMD means shaping the future of AI and beyond, with opportunities to advance your career by working on cutting-edge technologies and solving important challenges in AI and high-performance computing.

What academic background is typically required for the ML/AI Software Engineer role?

The preferred academic credentials for this ML/AI Software Engineer position include a Bachelor’s, Master’s, or PhD in Computer Science, Electrical Engineering, or a closely related technical field. Strong practical experience in C++ AI development and GPU programming is also highly valued.

How does AMD use AI in its hiring process for roles like ML/AI Software Engineer?

AMD may utilize Artificial Intelligence to assist in screening, assessing, or selecting candidates for positions like the ML/AI Software Engineer. Candidates can refer to AMD’s 'Responsible AI Policy' for more information on their AI usage in recruitment.
