
Staff Systems Engineer
Graphcore · Austin, TX
- On site
- Full-time
- $150,000 / year
Job highlights
- Support advanced AI hardware platforms in labs.
- Troubleshoot complex server and rack systems.
- Collaborate with engineering and data center teams.
- Focus on hardware validation and reliability.
- Mentor junior engineers and technicians.
About the role
About Us
Graphcore is one of the world’s leading innovators in Artificial Intelligence compute. It is developing hardware, software and systems infrastructure that will unlock the next generation of AI breakthroughs and power the widespread adoption of AI solutions across every industry.
As part of the SoftBank Group, Graphcore is a member of an elite family of companies responsible for some of the world’s most transformative technologies. Together, they share a bold vision: to enable Artificial Super Intelligence and ensure its benefits are accessible to everyone.
Graphcore’s teams are drawn from diverse backgrounds and bring a broad range of skills and perspectives. A melting pot of AI research specialists, silicon designers, software engineers and systems architects, Graphcore enjoys a culture of continuous learning and constant innovation.
Job Summary
We are seeking a Staff Systems Engineer to provide advanced operational, diagnostic, and engineering support for Graphcore’s Arm-based hardware platforms across lab and data center environments.
This role focuses on supporting hardware bring-up, validation, and troubleshooting of complex AI compute platforms, including server blades, racks, and rack-scale infrastructure. The successful candidate will collaborate closely with engineering, platform, and data center teams to ensure the reliability and performance of next-generation AI systems.
The Team
The Systems Engineering and Hardware Engineering teams are responsible for enabling the bring-up, validation, and operational reliability of Graphcore’s AI infrastructure platforms.
The team works closely with server engineering, firmware teams, platform architects, and data center operations to support the development, testing, and deployment of next-generation AI compute systems.
This collaborative environment enables rapid problem-solving and continuous improvement of Graphcore’s hardware platforms from early development through production deployment.
Responsibilities and Duties
- Lead advanced break-fix troubleshooting for server blades, motherboards, power systems, and rack-scale infrastructure.
- Support engineering bring-up activities, including component validation and firmware interaction testing.
- Diagnose system-level failures involving thermal behavior, power anomalies, network configuration, and BIOS/BMC issues.
- Collaborate with server engineering teams to perform root cause analysis and propose corrective actions or design improvements.
- Support deployment and rollout of next-generation hardware platforms through structured validation and qualification cycles.
- Interface with facilities and infrastructure teams to understand environmental factors impacting system reliability.
- Develop and maintain standard operating procedures (SOPs), troubleshooting guides, and validation documentation.
- Provide guidance and mentorship to junior technicians and engineers on troubleshooting methodologies and hardware diagnostics.
- Participate in on-call rotations or off-hours support during critical engineering milestones or hardware bring-up phases.
Candidate Profile
Essential
- Bachelor’s degree in Electrical Engineering, Computer Engineering, Computer Science, or related discipline.
- Strong experience with server hardware architectures and board-level debugging.
- Experience analyzing system logs, hardware telemetry, and power/thermal metrics to isolate hardware failures.
- Hands-on experience with HPC systems, AI compute platforms, or rack-scale infrastructure.
- Strong collaboration skills and ability to work effectively in fast-paced engineering environments.
- Excellent written and verbal communication skills.
Desirable
- Experience supporting prototype or pre-production hardware bring-up.
- Familiarity with data center facilities, including liquid cooling and power distribution systems.
- Experience using Python, Bash, or automation tools for hardware validation or troubleshooting.
- Exposure to structured failure analysis and reliability engineering methodologies.
USA Benefits
In addition to a competitive salary, Graphcore offers flexible working and a comprehensive benefits package designed to support your health, wellbeing and financial future. Our benefits include medical, dental and vision coverage, Flexible Spending Accounts (FSAs), Health Savings Accounts (HSAs), disability and life insurance, a 401(k) retirement plan, commuter benefits, wellness services and an Employee Assistance Programme (EAP).
We welcome people of different backgrounds and experiences; we're committed to building an inclusive work environment that makes Graphcore a great home for everyone. We offer an equal opportunity process and understand that there are visible and invisible differences in all of us. We can provide a flexible approach to interview and encourage you to chat to us if you require any reasonable adjustments.
Key skills/competency
- Staff Systems Engineer
- Hardware Bring-up
- Troubleshooting
- Server Hardware Architecture
- Board-Level Debugging
- HPC Systems
- AI Compute Platforms
- Rack-Scale Infrastructure
- Python
- Bash
Skills & topics
- Staff Systems Engineer
- Hardware Engineering
- AI Compute
- Server Hardware
- Troubleshooting
- HPC Systems
- Data Center
- Board-Level Debugging
- Firmware
- Python
How to get hired
- Research Graphcore's culture: Study their mission, values, recent news, and employee testimonials on LinkedIn and Glassdoor.
- Tailor your resume: Highlight your experience with server hardware, debugging, and AI compute platforms. Use keywords from the job description.
- Prepare for technical questions: Be ready to discuss your troubleshooting methodologies, hardware diagnostic experience, and familiarity with HPC or AI systems.
- Showcase collaboration skills: Emphasize your ability to work effectively in fast-paced engineering environments and communicate clearly.
- Highlight relevant experience: If applicable, mention experience with prototype bring-up, data center facilities, or automation tools.
Frequently asked questions
- What is the primary focus of the Staff Systems Engineer role at Graphcore?
- The Staff Systems Engineer role at Graphcore is primarily focused on providing advanced operational, diagnostic, and engineering support for Graphcore’s Arm-based hardware platforms in lab and data center environments. This includes hardware bring-up, validation, and troubleshooting of complex AI compute platforms.
- What are the essential qualifications for a Staff Systems Engineer at Graphcore?
- Essential qualifications include a Bachelor’s degree in Electrical Engineering, Computer Engineering, Computer Science, or related field, strong experience with server hardware architectures and board-level debugging, experience analyzing system logs and telemetry, and hands-on experience with HPC systems, AI compute platforms, or rack-scale infrastructure. Excellent communication and collaboration skills are also crucial.
- What kind of troubleshooting challenges can I expect as a Staff Systems Engineer at Graphcore?
- You can expect to lead advanced break-fix troubleshooting for server blades, motherboards, power systems, and rack-scale infrastructure. This involves diagnosing system-level failures related to thermal behavior, power anomalies, network configuration, and BIOS/BMC issues.
- Does Graphcore offer remote work options for this Staff Systems Engineer position?
- The listing is marked on site, and the role centers on lab and data center environments. Graphcore's benefits do mention flexible working, so confirm the specific arrangement and any remote options during the interview process.
- What is the career progression for a Staff Systems Engineer at Graphcore?
- The posting does not describe a formal progression path. It does highlight a culture of continuous learning, responsibility for mentoring junior engineers and technicians, and work on next-generation AI systems within the Systems Engineering and Hardware Engineering teams.
- What benefits does Graphcore offer to employees in the USA?
- Graphcore offers a comprehensive benefits package including medical, dental, and vision coverage, FSAs, HSAs, disability and life insurance, a 401(k) retirement plan, commuter benefits, wellness services, and an Employee Assistance Programme (EAP).
- How does Graphcore support diversity and inclusion in its hiring process for the Staff Systems Engineer role?
- Graphcore is committed to building an inclusive work environment and welcomes people of different backgrounds and experiences. They offer an equal opportunity process and can provide a flexible approach to interviews, encouraging candidates to discuss any reasonable adjustments needed.
