Senior Site Reliability Engineer @ Remotivate
Job Details
About the Company
Our client is one of the leading SMS providers for marketing teams in the US. Their advanced dashboard and queueing mechanisms help clients scale campaigns to the next level. With a global team in scale-up mode, they are looking for strong problem solvers who thrive on building reliable systems.
About the Role
We are looking for a Senior Site Reliability Engineer with strong infrastructure experience to ensure platform stability and optimize backend systems written in Python. You will play a key role in keeping their SMS marketing platform fast, reliable, and scalable. This highly technical position sits at the intersection of backend engineering and infrastructure.
You will work hands-on with Python/Flask applications, Linux servers, and network analysis tools to ensure millions of SMS messages are delivered without delay or downtime.
Key Responsibilities
- Maintain and optimize Linux-based servers running Python/Flask apps.
- Ensure high uptime by monitoring system health and addressing bottlenecks.
- Troubleshoot complex issues using packet capture and diagnostic tools.
- Optimize backend workflows for efficient message queuing and delivery.
- Implement monitoring and alerting systems for enhanced visibility.
- Automate infrastructure tasks through scripting and tool development.
- Take ownership of infrastructure decisions to boost platform performance.
Requirements
- 5+ years of experience in SRE, infrastructure, or backend systems engineering.
- Experience running and maintaining Python/Flask applications in production.
- Advanced Python development skills and familiarity with relevant libraries/frameworks.
- In-depth knowledge of Linux (Debian/Ubuntu) server administration.
- Proficiency in network analysis using tools such as Wireshark, mitmproxy, and tcpdump.
- Understanding of distributed systems, scaling strategies, and performance tuning.
- Experience with monitoring/logging systems (Prometheus, Grafana, ELK, Datadog) and CI/CD workflows.
- Comfort with automation tools and scripting for infrastructure management.
- Excellent troubleshooting and analytical skills, and a strong sense of ownership.
Growth Opportunities & Perks
- Endless growth in a scale-up environment.
- Potential to move into R&D or leadership roles.
- Flexible working schedule with fully remote setup.
- Performance bonuses as the company grows.
- Collaborate with highly skilled developers in a challenging industry.
Key Skills/Competencies
Senior Site Reliability Engineer, Python, Linux, Flask, Monitoring, Logging, Networking, Automation, CI/CD, Troubleshooting
How to Get Hired at Remotivate
🎯 Tips for Getting Hired
- Customize your resume: Highlight infrastructure and Python expertise.
- Showcase troubleshooting skills: Emphasize your network diagnostic experience.
- Prepare for technical interviews: Review Linux and CI/CD workflows.
- Research Remotivate: Understand their scale-up journey and SMS services.