AI Agent Engineer
AMD
Job Overview
Job Description
About AMD
At AMD, our mission is to build great products that accelerate next-generation computing experiences—from AI and data centers, to PCs, gaming and embedded systems. Grounded in a culture of innovation and collaboration, we believe real progress comes from bold ideas, human ingenuity and a shared passion to create something extraordinary. When you join AMD, you’ll discover the real differentiator is our culture. We push the limits of innovation to solve the world’s most important challenges—striving for execution excellence, while being direct, humble, collaborative, and inclusive of diverse perspectives. Join us as we shape the future of AI and beyond. Together, we advance your career.
The Role: AI Agent Engineer
As an AI Agent Engineer at AMD, you will be instrumental in developing and optimizing advanced AI solutions. This role focuses on building robust multi-agent workflows, enhancing training reliability, and improving distributed training performance to maximize GPU utilization.
Responsibilities
- Build and integrate multi-agent workflows.
- Expose platform capabilities via standard APIs and agent protocols (e.g., MCP/A2A).
- Improve training reliability through automation, failover, and health checks to ensure continuous job execution during cluster faults.
- Optimize distributed training performance, including parallelism, communication, storage, and operator tuning, to enhance GPU utilization.
- Own observability and debugging processes, utilizing logs, metrics, traces, profiling, and visualization techniques.
Requirements
- Strong background in software and systems engineering, with experience in large-scale/distributed training.
- Familiarity with a multi-agent framework (e.g., CrewAI, LangGraph) or a training stack (e.g., Megatron/DeepSpeed).
- Proven performance profiling and bottleneck isolation skills.
- Awareness of protocol and security aspects for agent connectivity and governance.
Nice to Have
- SFT/RL fine-tuning experience.
- End-to-end agent + LLM training pipeline experience.
- TorchFT (or similar) fault-tolerant training experience.
Academic Credentials
Bachelor’s or Master’s degree in Computer Science, Computer Engineering, Electrical Engineering, or equivalent.
Key skills/competencies
- AI Agent Development
- Multi-agent Systems
- Distributed Training
- GPU Optimization
- Software Engineering
- Systems Engineering
- Performance Profiling
- API Integration
- Observability
- LLM Pipelines
How to Get Hired at AMD
- Research AMD's culture: Study their mission, values, recent news, and employee testimonials on LinkedIn and Glassdoor.
- Customize your resume: Highlight experience in AI agents, distributed training, and GPU optimization for this AI Agent Engineer role.
- Showcase technical skills: Prepare to discuss your expertise in multi-agent frameworks, performance profiling, and systems engineering during interviews.
- Demonstrate problem-solving: Be ready to share specific examples of how you've debugged complex systems and optimized performance in previous roles.
- Network effectively: Connect with current AMD employees on LinkedIn to gain insights and potentially secure a referral.