Infrastructure Management Lead
ECS
Job Overview
Job Description
Job Summary
ECS is seeking an Infrastructure Management Lead to work remotely. In this critical role, you will collaborate with business and technical teams to ensure applications are fully available to all stakeholders and USPS customers. You will maintain the integrity and performance of various environments, including Development (DEV), System Integration Testing (SIT), and Customer Acceptance Testing (CAT), and you will be responsible for updating technical architecture diagrams, NCRBs, and system network specifications.
Key Responsibilities
- Work across business and technical teams to ensure application availability for all stakeholders and USPS customers.
- Update existing and draft new Technical Architecture Diagrams, NCRBs, and System Network details and specifications as needed.
- Analyze and evaluate complex data file processing, translating customer reporting requirements into comprehensive reports.
- Collaborate with the Corporate Information Security Office (CISO) and Network and Infrastructure Technology (NIT) teams to ensure timely system patching and testing across all environments.
- Track system upgrade requirements and due dates, coordinating system upgrades effectively.
- Ensure the availability of DEV, SIT, and CAT environments, and manage proper Production Data Usage approvals.
- Lead as a Critical Incident Manager to restore production applications during impacting events, responding quickly to system alerts to minimize downtime.
- Facilitate communication between technical teams and the Enterprise System Monitoring Team (ESM) to ensure timely updates and accurate monitoring of critical systems.
- Collaborate with Enterprise teams to set up and configure Splunk Monitors for effective system monitoring and alerting.
- Maintain and update monitoring configurations to reflect changes in system architecture and operational requirements.
- Analyze monitoring data to identify trends and potential issues, providing insights to improve system performance and reliability.
Required Skills
- Strong working knowledge of SQL, including the ability to understand and write complex SQL using select statements, views, joins, and indexing to create report data.
- Demonstrated experience using UNIX commands to manage production Linux servers, including the creation and maintenance of shell scripts for application maintenance activities.
- Proficient in using SQL queries on Oracle Databases, Cloudera, and DataStax for database lookups and data management.
- Proficient in working with Apache Kafka and Confluent to implement and manage real-time data streaming solutions, ensuring efficient data flow and integration across systems.
- Experience working with both on-premises and cloud-based environments.
- Strong knowledge of, or the ability to master, Incident Management processes, and the ability to lead as a Critical Incident Manager.
- Excellent written and verbal communication skills, with a proven ability to work effectively with stakeholders, executives, and technical staff.
- Effective liaison skills between business and technical teams, with strong facilitation abilities to lead Production Support Application Monitoring in real time.
- Familiarity with terminology, usage, and operating characteristics of hardware, software, and operating system components.
- Ability to present and communicate at all levels within the organization.
Desired Skills
- Experience with Agile methodologies, VersionOne, ServiceNow, Application Lifecycle Management (ALM), and logging tools such as Splunk Enterprise.
- Familiarity with monitoring tools such as Neustar and AppDynamics; experience with cloud monitoring tools is a plus.
Restrictions
Candidates cannot be members of the USPS eAccess Corporate Developer Registration (CDR) to develop new code, due to segregation-of-duties requirements.
Key Skills/Competencies
- Application Availability
- Incident Management
- System Monitoring
- Technical Architecture
- Data Streaming
- Database Management
- System Patching
- Stakeholder Coordination
- Environment Management
- Linux Server Administration
How to Get Hired at ECS
- Research ECS's culture: Study their mission, values, recent news, and employee testimonials on LinkedIn and Glassdoor.
- Tailor your resume: Customize your application for ECS to highlight experience in infrastructure management, incident response, and specific technologies like SQL, UNIX, and Kafka.
- Showcase problem-solving skills: Prepare to discuss complex data file processing and critical incident management scenarios in interviews with ECS.
- Demonstrate communication prowess: Emphasize your ability to liaise effectively between technical and business teams, which is crucial for an Infrastructure Management Lead at ECS.
- Highlight federal government experience: If applicable, showcase experience working with federal agencies or government contracts, aligning with ECS's core business.