Site Reliability Engineer ID53670
AgileEngine is an Inc. 5000 company that creates award-winning software for Fortune 500 brands and trailblazing startups across 17+ industries. We rank among the leaders in areas like application development and AI/ML, and our people-first culture has earned us multiple Best Place to Work awards.
Why join us
If you're looking for a place to grow, make an impact, and work with people who care, we'd love to meet you! :)
About the role
We are looking for an SRE Operations Engineer to maintain reliability across a cloud-based SaaS platform. You'll handle live incidents, improve observability, and reduce toil through automation using Kubernetes, Terraform, Grafana, and AWS. The role is hands-on and execution-focused, with real ownership across CI/CD pipelines, GitOps workflows, and on-call rotations.
What you will do
Monitor and support production and staging environments to ensure availability, performance, and stability;
Respond to incidents, perform triage and root cause analysis, and contribute to remediation efforts;
Participate in on-call rotations with defined SLAs;
Handle operational requests from internal teams;
Maintain and improve monitoring, alerting, dashboards, logs, and metrics;
Support CI/CD pipelines, production releases, and GitOps workflows;
Contribute to automation initiatives to reduce operational overhead;
Maintain and improve Kubernetes-based infrastructure and containerized workloads;
Support Infrastructure as Code practices and environment improvements.
Must haves
2+ years of experience in Site Reliability Engineering, DevOps, or Production Operations;
Experience with AWS supporting production environments;
Experience supporting production SaaS applications;
Strong understanding of CI/CD systems (GitHub Actions, Jenkins, CircleCI);
Experience with GitOps and Git fundamentals;
Experience using GitHub, Jira, and Confluence;
Experience with Kubernetes (EKS, kOps or similar);
Experience with Docker and containerization;
Experience with observability tools (Grafana, Prometheus, Loki, PagerDuty);
Proficiency in scripting (Bash, Python, or Go);
Experience with Infrastructure as Code (Terraform, Helm);
Ability to work within structured operational processes and SLAs;
Strong written and verbal English communication skills;
Self-driven with a growth mindset.
Nice to haves
AWS certifications such as Solutions Architect, DevOps Engineer, or SysOps Administrator;
Experience with multi-tenant SaaS environments;
Experience working in globally distributed teams;
Familiarity with ChatOps practices;
Experience improving monitoring quality and reducing alert fatigue.
The benefits of joining us
Professional growth
Accelerate your professional journey with mentorship, TechTalks, and personalized growth roadmaps
Competitive compensation
We match your ever-growing skills, talent, and contributions with competitive USD-based compensation and budgets for education, fitness, and team activities
A selection of exciting projects
Join projects with modern solutions development and top-tier clients that include Fortune 500 enterprises and leading product brands
Flextime
Tailor your schedule for an optimal work-life balance, with the option of working from home or going to the office, whatever makes you the happiest and most productive.
Meet our recruitment process
Asynchronous stage – An automated, self-paced track that helps us move faster and give you quicker feedback:
Short online form to confirm basic requirements
30–60 minute skills assessment via Codility – a platform founded in Poland that helps us provide quicker feedback and streamline this stage of the process.
5-minute introduction video
Synchronous stage – Live interviews
Technical interview with our engineering team (scheduled at your convenience)
Final interview with your future teammates
If it's a match, you'll get an offer!

AgileEngine
We build products for companies in 17+ industries, solving real problems and improving millions of lives worldwide. Here, your voice matters, your work creates impact, and your growth is part of our journey together.