ML Compiler Engineer, AWS Neuron
Amazon Web Services (AWS)
Job Overview

Job Description
The Annapurna Labs team at Amazon Web Services (AWS) builds AWS Neuron, the software development kit used to accelerate deep learning and GenAI workloads on Amazon’s custom machine learning accelerators, Inferentia and Trainium.
The Neuron team is hiring systems and compiler engineers to solve our customers' toughest problems. Specifically, the performance team in Toronto is focused on analysis and optimization of system-level performance of machine learning models on AWS ML accelerators. The team conducts in-depth profiling and works across multiple layers of the technology stack - from frameworks and compilers to runtime and collectives - to meet and exceed customer requirements while maintaining a competitive edge in the market. As part of the Neuron Compiler organization, the team not only identifies and implements performance optimizations but also works to crystallize these improvements into the compiler, automating optimizations for broader customer benefit.
This is an opportunity to work on products at the intersection of machine learning, high-performance computing, and distributed architectures. You will architect and implement business-critical features, publish research, and mentor a brilliant team of experienced engineers. We operate in spaces that are very large, yet our teams remain small and agile. There is no blueprint. We're inventing. We're experimenting. It is a unique learning culture. The team works closely with customers on their model enablement, providing direct support and optimization expertise to ensure their machine learning workloads achieve optimal performance on AWS ML accelerators.
The Product: AWS Neuron SDK
The AWS Machine Learning accelerators (Inferentia/Trainium) offer unparalleled ML inference and training performance. This is enabled through a state-of-the-art software stack – the AWS Neuron Software Development Kit (SDK). This SDK comprises an ML compiler, runtime, and application framework, seamlessly integrating into popular ML frameworks like PyTorch. AWS Neuron, running on Inferentia and Trainium, is trusted and used by leading customers such as Snap, Autodesk, and Amazon Alexa.
The Team: Annapurna Labs & Neuron Compiler
Annapurna Labs was a startup company acquired by AWS in 2015. Our organization covers multiple disciplines including silicon engineering, hardware design and verification, software, and operations. AWS Nitro, ENA, EFA, Graviton and F1 EC2 Instances, AWS Neuron, Inferentia and Trainium ML Accelerators, and scalable NVMe storage are some of the products we have delivered.
Within this ecosystem, the Neuron Compiler team develops a deep learning compiler stack that takes state-of-the-art LLM, Vision, and multi-modal models created in frameworks such as TensorFlow, PyTorch, and JAX, and makes them run performantly on our accelerators. The team is composed of bright minds focused on creating a toolchain that will provide a quantum leap in performance.
Key Job Responsibilities
- Analyze and optimize system-level performance of machine learning models across the entire technology stack, from frameworks to runtime.
- Conduct detailed performance analysis and profiling of ML workloads, identifying and resolving bottlenecks in large-scale ML systems.
- Work directly with customers to enable and optimize their ML models on AWS accelerators, understanding their specific requirements and use cases.
- Design and implement compiler optimizations, transforming manual performance improvements into automated compiler passes.
- Collaborate across teams to develop innovative optimization techniques that enhance AWS Neuron SDK's performance capabilities.
- Work in a startup-like development environment, focusing on high-impact projects.
About The AWS Team
Diverse Experiences: AWS values diverse experiences and encourages all qualified candidates to apply, including those whose careers have not followed a traditional path or include alternative experiences.
Why AWS: Amazon Web Services (AWS) is the world’s most comprehensive and broadly adopted cloud platform. We pioneered cloud computing and continue to innovate, and we are trusted by customers ranging from startups to Global 500 companies.
Inclusive Team Culture: We embrace differences and are committed to a culture of inclusion, reinforced by our 16 Leadership Principles. We support various employee-led affinity groups and learning experiences.
Work/Life Balance: Our team values work-life balance and offers flexibility in working hours to help you find fulfillment in both your personal and professional life.
Mentorship & Career Growth: We support new members with a broad mix of experience levels, fostering knowledge sharing and mentorship. We prioritize career growth by assigning projects that aid professional development.
Basic Qualifications
- 3+ years of non-internship professional software development experience.
- 2+ years of non-internship design or architecture experience (design patterns, reliability, and scaling) of new and existing systems.
- Experience programming with at least one software programming language.
Preferred Qualifications
- 3+ years of full software development life cycle experience, including coding standards, code reviews, source control management, build processes, testing, and operations.
- Bachelor's degree in computer science or equivalent.
- Experience in compiler design for CPU/GPU/Vector engines/ML-accelerators.
- Experience with system-level performance analysis and optimization.
- Experience with LLVM and/or MLIR.
- Experience with the following technologies: PyTorch, OpenXLA, StableHLO, JAX, TVM, deep learning models, and algorithms.
Key Skills/Competencies
- ML Compiler Design
- Performance Optimization
- Deep Learning Accelerators
- Distributed Systems
- System Level Profiling
- PyTorch / TensorFlow / JAX
- LLVM / MLIR
- Algorithm Optimization
- Customer Engagement
- Software Architecture
How to Get Hired at Amazon Web Services (AWS)
- Research AWS culture: Study their mission, values, recent news, and employee testimonials on LinkedIn and Glassdoor.
- Tailor your resume: Highlight experience in ML compilers, performance optimization, and distributed systems specific to AWS's needs.
- Prepare for technical deep dives: Strengthen knowledge of compiler design, deep learning frameworks, and system-level performance analysis.
- Demonstrate customer obsession: Be ready to share examples of how you've solved complex problems for customers or end-users.
- Practice Amazon Leadership Principles: Prepare STAR method answers for questions related to innovation, ownership, and learning.