Deep Learning Scientist, LLM Training Datasets
@ NVIDIA

Hybrid
$200,000
Full Time
Posted 23 days ago

Job Details

About the Role

NVIDIA is seeking a dedicated Deep Learning Scientist, LLM Training Datasets, to design and engineer high-quality datasets for large language model training. This highly technical role requires deep expertise in machine learning, data science, and data engineering.

What You'll Be Doing

  • Develop datasets for LLM pre-training, fine-tuning, and reinforcement learning.
  • Design and implement data strategies, including collection, cleaning, labeling, and augmentation.
  • Generate high-quality synthetic data for domain-specific and safety-critical use cases.
  • Define annotation guidelines and curate labeled datasets for model alignment, including RLHF.
  • Conduct experiments to optimize LLM performance with SFT and RL techniques.
  • Collaborate with ML researchers, data scientists, and infrastructure teams to integrate efficient ML workflows.

What We Need To See

  • Master’s or PhD in Computer Science, Electrical Engineering, or a related field, or equivalent experience.
  • 3+ years of industry experience in dataset development and training generative AI models.
  • Proficiency in Python programming and familiarity with ML frameworks such as PyTorch and TensorFlow Data.
  • Experience with synthetic data generation, distributed training paradigms, and performance evaluation of LLMs.
  • Strong collaboration and communication skills, along with excellent problem-solving capabilities.

Ways To Stand Out

  • Contributions to open-source data tools or research publications.
  • Experience with cloud platforms and modern data storage systems.
  • A passion for AI, shown through continuous evaluation of new tools and methodologies.

Additional Information

With highly competitive salaries, equity, and a comprehensive benefits package, NVIDIA is renowned as one of the technology industry’s most desirable employers. Applications are accepted until September 30, 2025. NVIDIA champions diversity and equal opportunity in all its hiring practices.

Key Skills/Competencies

Deep Learning, LLM, Datasets, Machine Learning, Data Engineering, Synthetic Data, Reinforcement Learning, PyTorch, TensorFlow, Distributed Training

How to Get Hired at NVIDIA

🎯 Tips for Getting Hired

  • Customize your resume: Highlight relevant deep learning and data engineering experience.
  • Showcase projects: Detail LLM dataset and synthetic data projects.
  • Network actively: Connect with NVIDIA employees on LinkedIn.
  • Prepare for technical interviews: Practice ML algorithms and problem-solving.

📝 Interview Preparation Advice

Technical Preparation

  • Review PyTorch and TensorFlow libraries.
  • Practice distributed training and data preprocessing.
  • Study synthetic data generation techniques.
  • Experiment with reinforcement learning setups.

Behavioral Questions

  • Describe past collaboration experiences.
  • Explain problem-solving techniques used.
  • Detail how you handle dataset challenges.
  • Share examples of effective team communication.

Frequently Asked Questions