DL Performance Software Engineer - LLM Inference @ NVIDIA
Job Details
Overview
At NVIDIA, we believe artificial intelligence (AI) will fundamentally transform how people live and work. Our mission is to advance AI research and development, creating groundbreaking technologies that enable anyone to harness the power of AI. Join our elite LLM inference team and help build innovative software that makes LLM inference more efficient, scalable, and accessible.
What You’ll Be Doing
You will architect and implement leading LLM inference stacks. Responsibilities include:
- Writing safe, scalable, modular, and high-quality C++/Python code.
- Performing benchmarking, profiling, and system-level programming for GPU applications.
- Providing code reviews, design documents, and tutorials to facilitate team collaboration.
- Writing and running unit tests and performance tests for the various stages of the inference pipeline.
What We Need To See
To be successful in this role, you should have:
- A Bachelor's degree in Computer Science, Computer Engineering, or a relevant field, or equivalent experience.
- Strong coding skills in Python and C/C++.
- At least 2 years of industry or research experience in software engineering.
- A passion for machine learning and performance engineering.
- Proven project experience where performance is a key focus.
Ways To Stand Out
Candidates will excel if they also have:
- Solid fundamentals in machine learning, deep learning, operating systems, computer architecture, and parallel programming.
- Research experience in systems or machine learning.
- Project experience with modern DL software such as PyTorch, CUDA, vLLM, SGLang, and TensorRT-LLM.
- Experience with performance modeling, profiling, debugging, and architectural optimization for CPU and GPU systems.
Additional Information
NVIDIA is recognized as one of the technology world’s most desirable employers, renowned for its forward-thinking and hardworking team. Your base salary will be determined by your location, experience, and internal benchmarks. You will also be eligible for equity and benefits. Applications for this job will be accepted at least until September 2, 2025.
Key Skills/Competency
LLM inference, C++, Python, GPU, benchmarking, profiling, performance, machine learning, deep learning, distributed systems.
How to Get Hired at NVIDIA
- Customize your resume: Highlight performance engineering and ML projects.
- Demonstrate technical skills: Showcase coding proficiency in C++ and Python.
- Prepare for technical interviews: Practice benchmarking and profiling challenges.
- Research NVIDIA: Understand its AI mission and successes.