
Principal AI Architect/Engineer
PepsiCo · Plano, TX
- On site
- Full-time
- $150,000 / year
Job highlights
- Design and build AI observability capabilities.
- Implement OpenTelemetry instrumentation and pipelines.
- Ensure AI system safety, security, and governance.
- Drive quality engineering for agentic solutions.
- Collaborate on AI platform evolution.
About the role
PepsiCo is seeking a Principal AI Architect/Engineer to design, build, and operate observability capabilities within the enterprise AI observability platform. This execution-focused role translates architectural blueprints into production-grade instrumentation, telemetry pipelines, dashboards, quality gates, and safety signals for agentic AI systems. You will be a technical practitioner with an emerging architect mindset, contributing to the evolution of our AI observability platform.
Responsibilities:
- Observability Platform Engineering & OTEL Integration (25%): Implement OpenTelemetry (OTEL) instrumentation, build and maintain telemetry pipelines, integrate OTEL with enterprise agentic platforms, develop dashboards and alerting rules, and participate in on-call rotations.
- Safety, Security & Red Teaming Support (15%): Instrument safety-critical signal capture, support red team exercises by building observability hooks, implement secure trace handling for AI decision events, and assist in maintaining the Security Observability Playbook.
- Responsible AI (RAI) & Governance Signal Instrumentation (10%): Implement RAI signal collectors, maintain RAI telemetry pipelines, ensure data quality for governance signals, and support gap analyses against governance framework requirements.
- Quality Engineering for Agentic Solutions — Post Go-Live & Continuous QE (15%): Build and maintain quality gate components in CI/CD pipelines, instrument and monitor Skill Evaluations, implement continuous quality monitoring for post-go-live solutions, and conduct structured testing of new agent capabilities.
- Memory, Skills, MCP & Harness Engineering Observability (10%): Instrument agent memory operations, add trace instrumentation to MCP server interactions, capture harness execution telemetry, and monitor skill eval harness execution pipelines.
- Data Science & Python Engineering (10%): Write production-grade Python for observability tooling, apply statistical methods to telemetry data, contribute to Python SDK development, and participate in code reviews.
- Agent Fleet, Physical AI & Multi-Modal Observability (5%): Implement telemetry for agent fleet coordination, contribute to observability instrumentation for physical AI pipelines, and add OTEL instrumentation to multi-modal model pipelines.
- Agentic Marketplace, Registry & A2A / UCP / AP2 Observability (5%): Instrument the Agentic Marketplace and Agent Registry with usage telemetry, implement protocol-level observability for communication flows, and contribute to Marketplace Observability Dashboard development.
- Collaboration, Integration & Continuous Learning (5%): Collaborate with AI platform engineers, SRE, and product teams, participate in agile ceremonies, stay current with emerging observability frameworks, and contribute to internal documentation.
Compensation and Benefits:
The expected compensation range for this position is $110,700 to $185,250. A performance-based bonus is offered (target payout 12% of annual salary). The comprehensive benefits package includes Medical, Dental, Vision, Disability, Retirement Plan, and more.
Qualifications:
- Bachelor's or Master's degree in Computer Science, Software Engineering, AI/ML, Data Science, or a related technical field.
- 6–8 years of experience in software engineering, platform engineering, or data engineering, including 2–3 years in observability, monitoring, or distributed systems.
- Strong Python engineering skills with experience in async patterns and modern tooling.
- Solid working knowledge of observability pillars (metrics, logs, traces) and OpenTelemetry (OTEL).
- Working knowledge of microservices, event streaming, REST/gRPC APIs, and containerized deployment.
- Hands-on experience with major cloud platforms (Azure, AWS, or GCP).
- Experience building CI/CD pipelines and familiarity with GitOps and IaC concepts.
- Ability to query, analyze, and visualize time-series and log data using tools like Grafana, Datadog, Splunk, or Prometheus.
- Hands-on experience with agentic AI frameworks (LangChain, LangGraph, AutoGen, etc.).
- Contributions to open-source observability projects or the OTEL community are a plus.
- Familiarity with reinforcement learning or model fine-tuning workflows.
- Experience with security tooling relevant to AI.
- Exposure to Responsible AI frameworks and libraries.
- Experience in a fast-paced AI platform, MLOps, or LLMOps role.
Key skills/competency:
- AI Observability
- OpenTelemetry (OTEL)
- Python Engineering
- Distributed Systems
- Cloud Platforms (AWS, Azure, GCP)
- CI/CD & DevOps
- Agentic AI Frameworks
- Responsible AI (RAI)
- Data Science
- Quality Engineering
Skills & topics
- AI Architect
- AI Engineer
- Observability
- OpenTelemetry
- Python
- Distributed Systems
- Cloud
- DevOps
- Agentic AI
- Responsible AI
- Data Science
- Quality Engineering
- PepsiCo
- Principal Engineer
- Software Engineer
How to get hired
- Tailor your resume: Highlight Python proficiency, observability experience, and AI framework knowledge specific to the Principal AI Architect Engineer role.
- Showcase OpenTelemetry expertise: Detail your experience with OTEL instrumentation, custom exporters, and semantic conventions in your application.
- Demonstrate cloud and distributed systems skills: Emphasize your hands-on experience with AWS, Azure, or GCP, microservices, and containerization.
- Prepare for technical interviews: Be ready to discuss your approach to building telemetry pipelines, ensuring AI safety, and applying data science methods to observability data.
- Research PepsiCo's AI initiatives: Understand their focus on agentic AI, responsible AI, and the role of observability in their platform.
Frequently asked questions
- What are the key technical skills required for the Principal AI Architect Engineer role at PepsiCo?
- The Principal AI Architect Engineer role at PepsiCo requires strong Python engineering skills, solid working knowledge of observability pillars (metrics, logs, traces) with OpenTelemetry (OTEL) expertise, experience with distributed systems, cloud platforms (AWS, Azure, GCP), CI/CD & DevOps practices, and hands-on experience with agentic AI frameworks like LangChain or AutoGen.
- What is the expected experience level for this Principal AI Architect Engineer position?
- PepsiCo is looking for candidates with 6-8 years of experience in software, platform, or data engineering, including at least 2-3 years focused on observability, monitoring, or distributed systems. A Bachelor's or Master's degree in a relevant technical field is also required.
- Can you elaborate on the responsibilities related to Responsible AI (RAI) and governance at PepsiCo?
- For this role, you will implement RAI signal collectors within agent workflows to capture fairness, bias, explainability, and content safety metrics. Maintaining RAI telemetry pipelines, ensuring data quality for compliance dashboards, and supporting audits by including required governance metadata in AI decision traces are key aspects.
- What kind of AI frameworks does PepsiCo utilize for this Principal AI Architect Engineer role?
- PepsiCo utilizes various agentic AI frameworks. Hands-on experience with frameworks such as LangChain, LangGraph, AutoGen, Semantic Kernel, CrewAI, or equivalent is highly desired for this position.
- What is the compensation range for the Principal AI Architect Engineer position at PepsiCo?
- The expected compensation range for this Principal AI Architect Engineer position at PepsiCo is between $110,700 and $185,250 annually. Actual salary is determined by factors such as location, experience, skills, and education. A performance-based bonus target of 12% is also offered.
- Does PepsiCo offer remote work for the Principal AI Architect Engineer role?
- The listing is marked as on-site in Plano, TX. Whether any hybrid or remote flexibility is available would need to be confirmed during the application process.
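- What is the role of OpenTelemetry (OTEL) in this position at PepsiCo?
- OTEL is central to the role. The largest share of responsibilities (25%) covers implementing OTEL instrumentation, building and maintaining telemetry pipelines, and integrating OTEL with enterprise agentic platforms. The role also adds OTEL instrumentation to multi-modal model pipelines.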