12 days ago

AI Test Architect

NVIDIA

Hybrid
Full Time
$180,000

Job Overview

Job Title: AI Test Architect
Job Type: Full Time
Category: Commerce
Experience: 5 Years
Degree: Master
Offered Salary: $180,000
Location: Hybrid

Job Description

About the Role

The AI Test Architect will join NVIDIA's E2E Verification group to profile innovative, large-scale distributed training on NVIDIA AI End-to-End solutions in supercomputing clusters. The role involves providing insights on at-scale system design and tuning mechanisms for large compute runs.

What You’ll Be Doing

You will be responsible for profiling, benchmarking, and analyzing deep learning models to identify optimization opportunities, with a special focus on networking. You will collaborate with data scientists, researchers, developers, and automation teams to design and implement scalable training pipelines. Staying up to date on deep learning algorithms, NVIDIA GPU technologies, and high-performance networking solutions is key. The role also involves optimizing deep learning models for performance, memory usage, and power efficiency, and addressing networking bottlenecks. Additionally, you will work closely with hardware engineers to integrate efficient networking solutions, exploring technologies such as RDMA and InfiniBand.

  • Profile and benchmark deep learning models
  • Collaborate with cross-functional teams
  • Stay updated with latest deep learning and NVIDIA GPU technologies
  • Optimize performance, memory and power usage
  • Guide development of high-performance networking solutions

What We Need To See

A B.Sc. in Computer Science, Software Engineering, or equivalent experience. Candidates should have 8+ years of experience with CUDA programming on deep learning frameworks like TensorFlow and PyTorch, with practical experience in high-performance networking. Strong analytical skills, excellent communication, and a deep understanding of profiling and optimizing deep learning workflows are essential.

Ways To Stand Out

Demonstrated experience in profiling and optimizing large-scale deep learning training, particularly with high-performance networking. Familiarity with distributed deep learning frameworks, NVIDIA's networking technologies (e.g., Mellanox InfiniBand) and optimization of network parameters such as bandwidth and latency will help you excel in this role.

Key skills/competencies

  • Deep Learning
  • CUDA
  • Distributed Systems
  • Networking
  • Benchmarking
  • Profiling
  • High-performance Computing
  • InfiniBand
  • RDMA
  • Optimization

Tags:

AI Test Architect
Deep Learning
CUDA
Networking
Distributed Systems
Benchmarking
Profiling
HPC
InfiniBand
RDMA
Supercomputing
Optimization
Scalable Training

How to Get Hired at NVIDIA

  • Research NVIDIA's culture: Review their innovations and technology focus.
  • Customize your resume: Highlight deep learning and networking experience.
  • Emphasize technical projects: Include CUDA, profiling, and optimization examples.
  • Prepare for interviews: Practice system design and technical problem solving.
