
This job post expired on November 11, 2025


DL Performance Software Engineer - LLM Inference

NVIDIA

Toronto, ON (On Site)

Original Job Summary

Overview

At NVIDIA, we believe artificial intelligence (AI) will fundamentally transform how people live and work. Our mission is to advance AI research and development, creating groundbreaking technologies that enable anyone to harness the power of AI. Join our elite LLM inference team and help build innovative software that makes LLM inference more efficient, scalable, and accessible.

What You’ll Be Doing

You will architect and implement leading LLM inference stacks. Responsibilities include:

Writing safe, scalable, modular, and high-quality C++/Python code.
Performing benchmarking, profiling, and system-level programming for GPU applications.
Providing code reviews, design documents, and tutorials to facilitate team collaboration.
Conducting unit tests and performance tests for various stages of the inference pipeline.

What We Need To See

To be successful in this role, you should have:

A Bachelor's degree in Computer Science, Computer Engineering, or a relevant field, or equivalent experience.
Strong coding skills in Python and C/C++.
At least 2 years of industry or research experience in software engineering.
A passion for machine learning and performance engineering.
Proven project experience where performance is a key focus.

Ways To Stand Out

Candidates will excel if they also have:

Solid fundamentals in machine learning, deep learning, operating systems, computer architecture, and parallel programming.
Research experience in systems or machine learning.
Project experience with modern DL software such as PyTorch, CUDA, vLLM, SGLang, and TensorRT-LLM.
Experience with performance modeling, profiling, debugging, and architectural optimization for CPU and GPU systems.

Additional Information

NVIDIA is recognized as one of the technology world’s most desirable employers, renowned for its forward-thinking and hardworking team. Your base salary will be determined by your location, experience, and internal benchmarks. You will also be eligible for equity and benefits. Applications for this job will be accepted at least until September 2, 2025.

Key Skills/Competency

LLM inference, C++, Python, GPU, benchmarking, profiling, performance, machine learning, deep learning, distributed systems.

How to Get Hired at NVIDIA

🎯 Tips for Getting Hired

Customize your resume: Highlight performance engineering and ML projects.
Demonstrate technical skills: Showcase coding proficiency in C++ and Python.
Prepare for technical interviews: Practice benchmarking and profiling challenges.
Research NVIDIA: Understand their AI mission and successes.

📝 Interview Preparation Advice

Technical Preparation

Revise C++ and Python coding practices.

Study GPU programming and kernel optimization.

Practice system and performance profiling techniques.

Review distributed system design fundamentals.
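
One way to practice the profiling item above is to write a small timing harness by hand. The sketch below is illustrative only: the names `benchmark`, `summarize`, and `dot` are hypothetical and do not come from the job post or from any NVIDIA interview. It covers CPU-side Python; GPU kernels usually launch asynchronously, so timing them would also need a synchronization step before each timer read.

```python
import statistics
import time


def benchmark(fn, *args, warmup=3, repeats=10):
    """Time fn(*args) and return the per-call latencies in seconds."""
    # Warm-up runs let caches and lazy initialization settle before timing.
    for _ in range(warmup):
        fn(*args)
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        timings.append(time.perf_counter() - start)
    return timings


def summarize(timings):
    """Report the median and minimum, which occasional slow runs skew less than the mean."""
    return {
        "median_s": statistics.median(timings),
        "min_s": min(timings),
        "runs": len(timings),
    }


def dot(a, b):
    # Hypothetical workload standing in for the code under test.
    return sum(x * y for x, y in zip(a, b))


if __name__ == "__main__":
    data = list(range(100_000))
    print(summarize(benchmark(dot, data, data)))
```

Separating warm-up runs from timed runs and reporting a median rather than a single measurement are the habits worth being able to explain in an interview.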

Behavioral Questions

Describe a challenging project experience.

Explain your teamwork approach and communication style.

Discuss problem-solving under deadline pressure.

Share examples of receiving and applying feedback.
