21 hours ago

AI Agent Engineer

AMD

On Site
Full Time
$180,000
Beijing, Beijing, China

Job Overview

Job Title: AI Agent Engineer
Job Type: Full Time
Offered Salary: $180,000
Location: Beijing, Beijing, China

Job Description

About AMD

At AMD, our mission is to build great products that accelerate next-generation computing experiences—from AI and data centers, to PCs, gaming and embedded systems. Grounded in a culture of innovation and collaboration, we believe real progress comes from bold ideas, human ingenuity and a shared passion to create something extraordinary. When you join AMD, you’ll discover the real differentiator is our culture. We push the limits of innovation to solve the world’s most important challenges—striving for execution excellence, while being direct, humble, collaborative, and inclusive of diverse perspectives. Join us as we shape the future of AI and beyond. Together, we advance your career.

The Role: AI Agent Engineer

As an AI Agent Engineer at AMD, you will be instrumental in developing and optimizing advanced AI solutions. This role focuses on building robust multi-agent workflows, enhancing training reliability, and improving distributed training performance to maximize GPU utilization.

Responsibilities

  • Build and integrate multi-agent workflows.
  • Expose platform capabilities via standard APIs and agent protocols (e.g., MCP/A2A).
  • Improve training reliability through automation, failover, and health checks to ensure continuous job execution during cluster faults.
  • Optimize distributed training performance, including parallelism, communication, storage, and operator tuning, to enhance GPU utilization.
  • Own observability and debugging processes, utilizing logs, metrics, traces, profiling, and visualization techniques.

Requirements

  • Strong background in software and systems engineering, with experience in large-scale/distributed training.
  • Familiarity with a multi-agent framework (e.g., CrewAI, LangGraph) or a training stack (e.g., Megatron/DeepSpeed).
  • Proven performance profiling and bottleneck isolation skills.
  • Awareness of protocol and security aspects for agent connectivity and governance.

Nice to Have

  • SFT/RL fine-tuning experience.
  • End-to-end agent + LLM training pipeline experience.
  • TorchFT (or similar) fault-tolerant training experience.

Academic Credentials

Bachelor’s or Master’s degree in Computer Science, Computer Engineering, Electrical Engineering, or equivalent.

Key Skills/Competencies

  • AI Agent Development
  • Multi-agent Systems
  • Distributed Training
  • GPU Optimization
  • Software Engineering
  • Systems Engineering
  • Performance Profiling
  • API Integration
  • Observability
  • LLM Pipelines

Tags:

AI Agent Engineer
AI Development
Multi-Agent Systems
Distributed Training
GPU Optimization
Software Engineering
Systems Engineering
Performance Profiling
Observability
LLM
CrewAI
LangGraph
Megatron
DeepSpeed
TorchFT
APIs
Python
Machine Learning
Data Centers
Cloud Computing

How to Get Hired at AMD

  • Research AMD's culture: Study their mission, values, recent news, and employee testimonials on LinkedIn and Glassdoor.
  • Customize your resume: Highlight experience in AI agents, distributed training, and GPU optimization for this AI Agent Engineer role.
  • Showcase technical skills: Prepare to discuss your expertise in multi-agent frameworks, performance profiling, and systems engineering during interviews.
  • Demonstrate problem-solving: Be ready to share specific examples of how you've debugged complex systems and optimized performance in previous roles.
  • Network effectively: Connect with current AMD employees on LinkedIn to gain insights and potentially secure a referral.
