Senior Site Reliability Engineer
@ ECS Tech Inc

Fairfax, Virginia, United States
$150,000
On Site
Full-time
Posted 28 days ago

Job Details

ECS Tech Inc is seeking a talented Senior Site Reliability Engineer to work remotely on the next-generation Continuous Diagnostics and Mitigation (CDM) Cyber data solution. This role is part of the Cybersecurity and Infrastructure Security Agency’s (CISA) initiative to enhance federal network security.

Role & Responsibilities

  • Define, implement and grow the SRE practice.
  • Ensure reliability, availability and performance of critical production environments.
  • Design and maintain logging, monitoring and alerting systems using Elastic and other tools.
  • Conduct root cause analyses and manage incident response.
  • Collaborate with cross-functional teams to integrate SRE practices into the development lifecycle.

Qualifications

  • US citizenship with ability to obtain Public Trust Suitability.
  • 6+ years of SRE experience with hands-on observability, logging, monitoring, and alerting.
  • 3+ years of experience with cloud platforms (AWS GovCloud preferred) and coding (Python, Bash, etc.).
  • Strong knowledge of microservices, containerization, and orchestration tools (Docker, Kubernetes).
  • Experience working in a SAFe (Scaled Agile Framework) environment.

Key Skills/Competencies

  • SRE
  • Reliability
  • Monitoring
  • Elastic
  • AWS
  • Kubernetes
  • Python
  • DevOps
  • Cybersecurity
  • SAFe

How to Get Hired at ECS Tech Inc

🎯 Tips for Getting Hired

  • Research ECS Tech Inc's culture: Study its mission, values, and recent news.
  • Tailor your resume: Highlight SRE and cloud experience.
  • Showcase technical skills: Emphasize observability and automation.
  • Prepare for situational questions: Demonstrate problem-solving in past roles.

📝 Interview Preparation Advice

Technical Preparation

  • Review logging and monitoring tool documentation.
  • Practice writing Python and Bash scripts.
  • Study AWS GovCloud and cloud platform basics.
  • Familiarize yourself with container orchestration using Kubernetes.

Behavioral Questions

  • Describe a challenging incident response experience.
  • Explain how you work with a team in a fast-paced environment.
  • Discuss how you have handled continuous improvement initiatives.
  • Share examples of cross-functional collaboration.

Frequently Asked Questions