Software Engineer, Infrastructure
@ Anthropic

Seattle, WA
$300,000
On Site
Full Time
Posted 16 hours ago

Job Details

About Anthropic

Anthropic’s mission is to create reliable, interpretable, and steerable AI systems that are safe and beneficial for society. The team comprises researchers, engineers, policy experts, and business leaders working together to build beneficial AI systems.

About The Role

Anthropic is seeking talented and experienced Infrastructure Engineers to join our team and support the development, scaling, and maintenance of cutting-edge AI systems. The role offers an opportunity to work on groundbreaking AI technologies and contribute to frontier models that power safe and reliable AI systems.

Team placement occurs after the interview process, matching candidates with the right team based on interests and experience. Teams include:

  • Data Infrastructure: Design and optimize data pipelines using Spark, Airflow, and dbt on GCP and AWS.
  • Core Infrastructure: Build and manage large Kubernetes clusters with GPU/TPU/Trainium workloads.
  • Observability: Implement monitoring solutions using Prometheus, Splunk, and Grafana.
  • Developer Productivity: Enhance secure, efficient development environments for Anthropic engineers.
  • Developer Acceleration: Integrate Claude to optimize development setups.
  • Cloud Inference: Scale and optimize Claude on AWS and GCP for enterprise users.
  • AI Reliability: Develop new approaches to system reliability for AI products.

Responsibilities

The role involves leading the build-out of industry-leading AI clusters, consulting with stakeholders to understand infrastructure needs, setting technical strategies, mentoring talent, and designing processes for robust system operations.

You May Be a Good Fit If You

  • Have 8+ years of industry experience and 3+ years leading large, complex projects or teams.
  • Are obsessed with distributed systems, reliability, scalability, and security.
  • Have strong proficiency in at least one programming language like Python, Rust, Go, or Java.
  • Possess excellent communication skills to build consensus with various stakeholders.
  • Have deep knowledge of modern cloud technologies including Kubernetes, AWS, and GCP.

Strong Candidates May Have

  • Experience with security and privacy best practices.
  • Expertise with machine learning infrastructure like GPUs, TPUs, or Trainium.
  • Low-level systems experience, including Linux kernel tuning and eBPF.
  • A quick grasp of systems design tradeoffs and evolving software systems.

Logistics

Applicants must have at least a Bachelor's degree or equivalent experience. The location-based hybrid policy expects staff to be in the office at least 25% of the time, with adjustments based on role and requirements. Visa sponsorship is available, contingent on role fit.

How We're Different

Anthropic prioritizes impact and collaboration over small-scale tasks. The company focuses on large-scale research efforts and encourages diverse perspectives in AI research to support steerable, trustworthy AI development. Frequent research discussions and working as a single cohesive team define the culture at Anthropic.

Key skills/competencies

  • Infrastructure
  • Distributed Systems
  • Cloud Computing
  • Kubernetes
  • Data Pipelines
  • Observability
  • Security
  • Scalability
  • System Reliability
  • Developer Productivity

How to Get Hired at Anthropic

🎯 Tips for Getting Hired

  • Tailor your resume: Highlight infrastructure and cloud expertise.
  • Research Anthropic: Understand their mission and AI projects.
  • Customize your cover letter: Emphasize distributed systems experience.
  • Prepare technical narratives: Discuss past large-scale projects.
  • Showcase leadership: Provide examples of team mentoring.

📝 Interview Preparation Advice

Technical Preparation

  • Review cloud service provider documentation.
  • Practice system design for high scalability.
  • Refresh programming in Python, Rust, or Go.
  • Study Kubernetes and cloud orchestration tools.

Behavioral Questions

  • Describe a project with complex infrastructure challenges.
  • Explain your leadership approach in team projects.
  • Share a time you resolved a critical system failure.
  • Discuss how you balance innovation with reliability in projects.

Frequently Asked Questions