
Site Reliability Engineer Principal (Software Engineering)
FIS · Atlanta, GA
- On site
- Full-time
- $150,000 / year
Job highlights
- Drive innovation in financial services technology.
- Design and maintain critical monitoring solutions.
- Implement automation for scalability and deployments.
- Lead incident response and optimize performance.
- Collaborate across teams for reliability goals.
About the role
This position is under our CTO org to support SRE functions for innovation and growth for the Banking Solutions, Payments and Capital Markets business.
What You Will Be Doing
The Site Reliability Engineer will play a critical role in driving innovation and growth for the Banking Solutions, Payments and Capital Markets business. In this role, the candidate will have the opportunity to make a lasting impact on the company's transformation journey, drive customer-centric innovation and automation, and position the organization as a leader in the competitive banking, payments and investment landscape. Specifically, the Site Reliability Engineer will be responsible for the following:
- Design and maintain monitoring solutions for infrastructure, application performance, and user experience
- Implement automation tools to streamline tasks, scale infrastructure, and ensure seamless deployments
- Ensure application reliability, availability, and performance, minimizing downtime and optimizing response times
- Lead incident response, including identification, triage, resolution, and post-incident analysis
- Conduct capacity planning, performance tuning, and resource optimization
- Collaborate with security teams to implement best practices and ensure compliance
- Manage deployment pipelines and configuration management for consistent and reliable app deployments
- Develop and test disaster recovery plans and backup strategies
- Collaborate with development, QA, DevOps, and product teams to align on reliability goals and incident response processes
- Participate in on-call rotations and provide 24/7 support for critical incidents
What You Bring
- Proficiency in development technologies, architectures, and platforms (web, API)
- Experience with cloud platforms (AWS, Azure, Google Cloud) and IaC tools (Terraform)
- Knowledge of monitoring tools (Prometheus, Grafana, DataDog) and logging frameworks (Splunk, ELK Stack)
- Experience in incident management and post-mortem reviews
- Strong troubleshooting skills for complex technical issues
- Proficiency in scripting languages (Python, Bash) and automation tools (Terraform, Ansible)
- Experience with CI/CD pipelines (Harness, Jenkins, GitLab CI/CD, Azure DevOps)
- Ownership approach to engineering and product outcomes
- Excellent interpersonal communication, negotiation, and influencing skills
What We Offer You
At FIS, we hire the best. In return, you receive exceptional benefits including:
- Opportunities to innovate in fintech
- Tools for personal and professional growth
- Inclusive and diverse work environment
- Resources to invest in your community
- Competitive salary and benefits
Key skills/competency
- Site Reliability Engineering
- Cloud Platforms (AWS, Azure, Google Cloud)
- Infrastructure as Code (IaC)
- Monitoring Tools (Prometheus, Grafana, DataDog)
- Logging Frameworks (Splunk, ELK Stack)
- Incident Management
- CI/CD Pipelines
- Scripting Languages (Python, Bash)
- Automation Tools (Terraform, Ansible)
- Troubleshooting
Skills & topics
- Site Reliability Engineer
- Principal Engineer
- SRE
- DevOps
- Cloud Engineering
- AWS
- Azure
- Google Cloud
- Terraform
- Python
- Bash
- Monitoring
- Prometheus
- Grafana
- DataDog
- Splunk
- ELK Stack
- CI/CD
- Harness
- Jenkins
- Azure DevOps
- Incident Management
- Capacity Planning
- Performance Tuning
- Automation
- Fintech
- Banking Solutions
- Payments
- Capital Markets
How to get hired
- Tailor your resume: Highlight SRE experience, cloud platforms, IaC, monitoring, and scripting skills relevant to the job description.
- Showcase automation and CI/CD skills: Emphasize your experience with tools like Terraform, Ansible, Harness, Jenkins, or Azure DevOps.
- Quantify your achievements: Use metrics to demonstrate impact in areas like system reliability, downtime reduction, or performance optimization.
- Prepare for technical interviews: Be ready to discuss complex troubleshooting scenarios, incident response, and system design.
- Research FIS culture: Understand their focus on innovation, collaboration, and customer-centricity to align your answers.
Technical preparation
- Master cloud platform concepts (AWS, Azure, GCP).
- Practice IaC with Terraform and Ansible.
- Build and manage CI/CD pipelines.
- Solve complex troubleshooting scenarios.
Behavioral questions
- Describe a major incident you led.
- How do you ensure system reliability?
- How do you collaborate with development teams?
- Share an example of automating a complex task.
Frequently asked questions
- What is the work arrangement for the Principal Site Reliability Engineer role at FIS?
- This Principal Site Reliability Engineer position is a hybrid role. Employees are expected to work onsite at one of the FIS office locations in Atlanta (GA), Jacksonville (FL), or Milwaukee (WI) for 3 days a week, with the remaining days offering flexibility for remote work.
- What are the primary responsibilities of a Principal Site Reliability Engineer at FIS?
- The Principal Site Reliability Engineer will focus on driving innovation and growth within the Banking Solutions, Payments, and Capital Markets business. Key responsibilities include designing monitoring solutions, implementing automation, ensuring application reliability, leading incident response, capacity planning, and collaborating with development and DevOps teams.
- What technical skills are most important for this Principal Site Reliability Engineer role at FIS?
- Key technical skills include proficiency in development technologies (web, API), experience with cloud platforms (AWS, Azure, Google Cloud), Infrastructure as Code (IaC) tools like Terraform, knowledge of monitoring tools (Prometheus, Grafana, DataDog), logging frameworks (Splunk, ELK), scripting languages (Python, Bash), and CI/CD pipelines.
- Does FIS offer opportunities for professional growth for a Principal Site Reliability Engineer?
- Yes, FIS emphasizes providing tools for personal and professional growth. The company encourages innovation in fintech and offers resources to support employee development, making it a good environment for a Principal Site Reliability Engineer looking to advance their career.
- What kind of team can I expect to work with as a Principal Site Reliability Engineer at FIS?
- You can expect to be part of a team that is described as open, collaborative, entrepreneurial, passionate, and fun. This role is within the CTO organization, supporting critical business areas and working closely with development, QA, DevOps, and product teams.
- How does FIS approach incident response for its Principal Site Reliability Engineer roles?
- The Principal Site Reliability Engineer is expected to lead incident response, which includes the identification, triage, resolution, and post-incident analysis of critical issues. This role also involves participating in on-call rotations to provide 24/7 support for critical incidents.
