Job Description
About Bain & Company
We are proud to be consistently recognized as one of the world’s best places to work. We are currently the top-ranked consulting firm on Glassdoor’s Best Places to Work list and have earned the #1 overall spot a record seven times. Extraordinary teams are at the heart of our business strategy, but these don’t happen by chance. They require intentional focus on bringing together a broad set of backgrounds, cultures, experiences, perspectives, and skills in a supportive and inclusive work environment. We hire people with exceptional talent and create an environment in which every individual can thrive professionally and personally.
Who You’ll Work With
As the premier consulting partner for the private equity industry, Bain's Private Equity Group (PEG) boasts a global practice that is over three times larger than any competitor. Our network of over 1,000 professionals supports private equity and institutional investor clients through every stage of the investment life cycle, from deal generation and due diligence to portfolio value creation and exit planning. Bain & Company is developing a suite of cutting-edge data and software solutions designed to revolutionize how the private equity industry uses data for investment insights and decision-making. The PEG Innovation team's mission is to create analytical solutions for Bain clients, teams, and the broader institutional investor space using proprietary software and data products. This includes the development, commercialization, and daily management of Bain's proprietary datasets, data, and software businesses.
Where You’ll Fit Within the Team
Senior Platform Engineers design and build the shared services that every product and data squad depends on: Auth, RBAC, Session Management, Audit, Notifications, File handling, Search, and more. You own your services end-to-end: design, build, test, deploy, monitor, and operate them. You set the standard for how platform services are built and contribute to the engineering standards and conventions that govern the broader estate. You collaborate closely with Security and Infrastructure to ensure services are secure-by-default, observable from day one, and operable under production on-call expectations.
What You'll Do
Core Platform Service Development, Deployment, and Operations (80%)
- Design, build, test, deploy, and operate core platform services to production quality standards (Auth, RBAC, Session, Audit, Notifications, File, Search).
- Own service APIs and contracts end-to-end: versioning, backwards compatibility, and consumer impact management across product squads.
- Write and maintain Postgres schemas and Alembic migrations using the expand/contract pattern; never ship a breaking schema change without a backwards-compatible transition.
- Implement and enforce authentication and authorisation patterns: JWT, refresh token rotation, RBAC, SAML/OIDC, and service-to-service auth where required.
- Design and operate Redis-backed patterns: caching, session storage, rate limiting, pub/sub, and distributed lock coordination where needed.
- Build and operate event-driven capabilities using Kafka domain events (CloudEvents envelope, schema registry integration) where platform services publish and consume.
- Instrument services with structured logs, distributed tracing, and Prometheus metrics from day one using OpenTelemetry and FastAPI instrumentation.
- Write and maintain Helm charts for owned services; contribute to Kubernetes manifests in the platform-infra repository (health checks, resource limits, HPA readiness).
- Participate in on-call rotation for platform incidents; drive incident response to resolution and maintain runbooks for owned services.
Other (20%)
- Set and enforce engineering standards for platform service development: testing, observability, security, reliability, and operational hygiene.
- Conduct thorough code reviews; enforce standards on PRs and raise the bar for production practices across the platform estate.
- Mentor mid-level and junior platform engineers through pairing, design guidance, and ongoing review feedback.
- Use AI coding assistants to accelerate service scaffolding, API/router generation, migration drafts, and test creation; review all generated code against production and security standards before committing.
- Use LLMs to generate first-draft documentation (runbooks, service docs, API notes) and operational checklists; validate and refine outputs before publishing.
- Collaborate with the Security Engineer on Vault integration (Vault Agent Injector), dynamic secrets usage, policy scoping, mTLS policy, and software supply chain security requirements.
About You
- Bachelor’s degree in Computer Science, Engineering, Information Systems, or a related field (or equivalent practical experience).
- 6+ years of experience building and operating backend services, APIs, or platform components in production environments, including on-call responsibility.
- Demonstrated experience owning production backend services end-to-end (design, build, test, deploy, monitor, and operate), including on-call operational responsibility.
- Production experience building and operating REST and event-driven microservices at scale in a Kubernetes environment.
- Experience designing and operating data stores in Postgres, including schema migration practices, query optimisation, and performance tuning.
- Experience implementing authentication and authorisation systems (JWT, refresh token rotation, RBAC, SAML/OIDC) in production environments.
- Demonstrated ability to mentor other engineers and raise engineering standards through code review and shared conventions.
Backend/Platform Engineering
- Strong Python proficiency: FastAPI, Pydantic v2, SQLAlchemy 2.0 async, Alembic, pytest, Ruff, mypy (strict).
- Production microservices: REST APIs, event-driven patterns, idempotency, retries, backwards-compatible versioning, and consumer contract discipline.
- PostgreSQL: query plan analysis, indexing strategies, partitioning approaches, and schema evolution patterns for high-availability systems.
- Redis: caching strategies, session storage, pub/sub, and rate limiting patterns; understands operational trade-offs and failure modes.
- Apache Kafka: producing/consuming domain events, CloudEvents envelope conventions, schema registry integration, and consumer group semantics.
- Docker: multi-stage builds, non-root containers, image scanning (e.g., Trivy), and secure base-image practices.
- Kubernetes: Helm charts, pod lifecycle, probes/health checks, resource requests/limits, and HPA concepts; comfortable operating services on-cluster.
- Observability: OpenTelemetry instrumentation, structured logging (structlog), distributed tracing, and Prometheus metrics for FastAPI services.
- Secrets and security: familiarity with HashiCorp Vault (Vault Agent Injector, dynamic secrets, policy scoping) and secure service-to-service patterns.
Generative AI and Agentic Systems
- Uses AI coding assistants (Cursor, GitHub Copilot, or equivalent) to accelerate feature development and reduce repetitive boilerplate; reviews all generated code against production and security standards before committing.
- Uses agents to generate first-draft Pydantic schemas, SQLAlchemy models, and FastAPI router skeletons; refines outputs to match domain conventions and security requirements.
- Uses LLM assistance to draft unit and integration test cases; validates coverage gaps and supplements with manually authored tests.
- Understands how platform services (Auth, RBAC, Audit) interact with the Agent Gateway and what security constraints (permissions, auditability, data minimisation) that interaction requires.
General
- Treats every service as a production system from the first commit: tests, observability, documentation, and runbooks are not optional.
- Communicates blockers early and escalates appropriately; does not quietly struggle for days before raising a risk.
- Uses AI tooling to move faster, but applies critical judgement and rigorous review to all generated code and documentation before it enters the codebase.
- Keeps runbooks and service documentation current as services evolve; treats operability as part of delivery.
Key skills/competencies
- Senior Platform Engineer
- Backend Development
- Microservices
- Kubernetes
- Python
- PostgreSQL
- Redis
- Kafka
- Observability
- Security
How to Get Hired at Bain & Company
- Tailor your resume: Highlight experience with Python, FastAPI, Kubernetes, and microservices, aligning with Bain's platform engineering needs.
- Showcase production ownership: Emphasize your end-to-end experience in designing, building, deploying, and operating production backend services.
- Demonstrate mentorship: Provide examples of how you've mentored junior engineers and raised engineering standards through code reviews.
- Prepare for technical questions: Be ready to discuss your experience with PostgreSQL, Redis, Kafka, and security concepts like Vault.
- Understand the culture: Research Bain's commitment to teamwork, innovation, and being a great place to work.