Application Reliability Engineer
Unleash the power of reliability and shape the future of cloud and DevOps innovation!
Krakow-based opportunity with a hybrid work model.
As a Senior Site Reliability Engineer – Cloud & DevOps, you will be working for our client, a global leader in IT solutions, dedicated to ensuring the highest levels of system availability and performance. You will support and optimize production services, implement SRE best practices, and lead critical incident resolutions to drive seamless digital operations and continuous improvement.
Your main responsibilities:
Act as a technical lead for supporting highly available (24x7) production services within a global DevOps team.
Implement and promote SRE best practices to enhance service availability, performance, and security.
Resolve incidents, conduct root cause analysis, and facilitate post-incident reviews to prevent recurrence.
Design and review software architecture, defining application SLIs and SLOs to optimize operational health.
Build and maintain observability frameworks using tools such as Prometheus and Grafana, automating alerts and monitoring.
Plan and execute application and infrastructure migrations, disaster recovery exercises, and product upgrades.
Review support queries, develop automation solutions, and enhance self-service capabilities to improve user experience.
Provide on-call support during scheduled rotations, ensuring rapid response to critical issues.
Participate in scheduled maintenance activities, including weekend tasks, to maintain system reliability with minimal user disruption.
You're ideal for this role if you have:
At least 7 years of professional experience in Production Application Support or Site Reliability Engineering.
Proven expertise with automation, build, and monitoring tools such as Ansible, Jenkins, Prometheus, and Grafana.
Exceptional analytical and troubleshooting skills in high-pressure environments.
Strong full-stack engineering skills with Java, Python, JavaScript, NodeJS, React, and SQL.
In-depth understanding of the Software Development Life Cycle (SDLC) and its principles.
Excellent communication skills to collaborate effectively across global, cross-functional teams.
Prior experience supporting large Atlassian Jira and Confluence Data Center instances (a plus).
Ability to learn quickly and adapt to new technologies.
Language required for the role:
Fluent English.
We offer you:
ITDS Business Consultants is involved in a variety of innovative, professional IT projects for international companies in the financial industry in Europe. We offer an environment for professional, ambitious, and driven people. The offer includes:
Stable and long-term cooperation with very good conditions
Opportunities to enhance your skills and develop your expertise in the financial industry
Work on the most strategic projects available in the market
A career roadmap that you define, with fast professional growth through delivering strategic projects for different ITDS clients over several years
Participation in social events and training, and work in an international environment
Access to an attractive Medical Package
Access to the Multisport Program
#MAKEYourCareerBETTER
Interested? Apply now and include your CV (preferably in English) along with a statement confirming your consent to the processing and storage of your personal data.