3 days ago

GenAI Optimization Intern

Modular

On Site
Intern
$135,200
Los Altos, CA

Job Overview

Job Title: GenAI Optimization Intern
Job Type: Intern
Category: Commerce
Experience: 5 Years
Degree: Master
Offered Salary: $135,200
Location: Los Altos, CA

Job Description

About Modular

At Modular, we’re on a mission to revolutionize AI infrastructure by systematically rebuilding the AI software stack from the ground up. Our team, made up of industry leaders and experts, is building cutting-edge, modular infrastructure that simplifies AI development and deployment. By rethinking the complexities of AI systems, we’re empowering everyone to unlock AI’s full potential and tackle some of the world’s most pressing challenges.

If you’re passionate about shaping the future of AI and creating tools that make a real difference in people’s lives, we want you on our team. You can read about our culture and careers to understand how we work and what we value.

What You Will Work On

As an intern on the MAX Serve team, you'll tackle real performance problems in state-of-the-art GenAI models like DeepSeek, Llama, and Qwen. You'll execute on a critical project related to performance optimization across the Python/Mojo boundary, working from GPU kernels up through compiler, framework, and multiprocessing optimizations. Your work will involve profiling bottlenecks in multi-billion parameter models, implementing bleeding-edge research (FP8 quantization, speculative decoding, advanced attention), and shipping optimizations to our open-source MAX platform that developers worldwide depend on.

What You Will Learn

  • Hands-on experience optimizing GenAI workloads at the intersection of compilers, runtimes, and distributed systems.
  • You'll gain deep expertise in performance optimization techniques, profiling tools, and systems programming while working with Mojo, our next-generation programming language.
  • Mentorship from experienced engineers on the MAX Serve team who work on problems spanning GPU kernels to serving APIs.
  • You'll present and demo your contributions to the engineering organization and make lasting contributions to open-source technology that powers production deployments achieving 200+ tokens/second on multi-GPU systems.

What You Bring To The Table

  • Currently pursuing a Bachelor's, Master's, or PhD in Computer Science, Engineering, or a related field, with graduation expected by Spring 2027 at the latest.
  • Strong Python programming skills and ML/Deep Learning coursework or projects (PyTorch, TensorFlow, etc.).
  • Passion for performance optimization and systems programming with curiosity for solving complex problems.
  • Strong verbal and written communication skills, and the ability to collaborate with mentors and peers.
  • Helpful experience includes: systems programming (C++, Rust), CUDA/ROCm, Python multiprocessing/asyncio, compilers, profiling tools, or parallel computing concepts.

What Modular Brings To The Table

  • Amazing Team. We are a progressive and agile team with some of the industry’s best engineering and product leaders.
  • Competitive Compensation. We offer very strong compensation packages, including stock options. We want people to be focused on their best work and believe in tailoring compensation plans to meet the needs of our workforce.
  • Team Building Events. We organize regular team onsites and local meetups in Los Altos, CA.

Working at Modular will enable you to grow quickly as you work alongside incredibly motivated and talented people who have high standards, a growth mindset, and a purpose to truly change the world.

Key skills/competency

  • GenAI Optimization
  • Performance Engineering
  • Python Programming
  • Machine Learning
  • Deep Learning Frameworks
  • Compilers
  • GPU Programming
  • Systems Programming
  • Profiling Tools
  • Open-Source Contribution

How to Get Hired at Modular

  • Research Modular's vision: Study their mission to revolutionize AI infrastructure, company values, and culture page on modular.com to align your application.
  • Tailor your resume for GenAI: Customize your resume to highlight Python, ML/Deep Learning projects, and any experience with performance optimization or compilers for the GenAI Optimization Intern role.
  • Showcase systems programming skills: Emphasize experience with C++, Rust, CUDA/ROCm, or parallel computing, as these are highly valued by Modular.
  • Prepare for technical interviews: Practice problems involving performance optimization, GenAI models, and compiler concepts, and be ready to explain how Mojo relates to them.
  • Demonstrate passion for AI: Articulate your enthusiasm for shaping the future of AI, open-source contributions, and tackling complex problems during your Modular interview.
