Senior DevOps Engineer - Monitoring
plac Nowy Targ 28, Wrocław
Spyrosoft
Tech stack:
Icinga2, Prometheus, Grafana
Opsgenie
Terraform, Ansible
Kubernetes
NIS2, KRITIS compliance
Application Performance Management (APM) tools
Incident Management
SLA Management
Project description:
Our customer is a leading German producer of customized solutions for self-supply with solar-powered electricity. These include photovoltaic and energy storage systems as well as cloud technology solutions that help individuals achieve energy independence.
We are seeking a highly skilled and experienced Senior DevOps Engineer – Monitoring to join our team. The ideal candidate will have in-depth expertise in system and application monitoring, infrastructure automation, and compliance. This role requires strong hands-on skills with tools such as Icinga2, Prometheus, Grafana, Terraform, and Ansible, as well as proven experience in Kubernetes administration, APM, and incident management. Please note that this position involves occasional on-call duty to handle customer-critical incidents.
Main responsibilities:
Configure and manage monitoring solutions using Icinga2, Prometheus, and Grafana.
Set up and optimize alerting systems and incident management platforms with Opsgenie and other relevant tools.
Automate infrastructure and configuration management with Terraform and Ansible.
Manage and optimize Kubernetes clusters for performance and reliability.
Implement and monitor compliance with NIS2 and KRITIS security requirements.
Utilize APM tools to identify and resolve performance bottlenecks in critical applications.
Lead the Incident Management process, ensuring adherence to on-call duty policies.
Define, monitor, and continuously improve Service Level Agreements (SLAs).
Requirements:
5+ years of relevant DevOps experience.
Minimum 3 years of hands-on experience with Icinga2, Prometheus, and Grafana.
At least 2 years of experience with Opsgenie alerting and incident management configuration.
Proven track record in Terraform and Ansible for automation.
3+ years of Kubernetes administration and optimization.
Strong knowledge of NIS2 and KRITIS compliance implementation.
Experience using APM tools for performance monitoring and troubleshooting.
Incident management expertise, including on-call duty operations.
SLA definition, tracking, and improvement experience.
Recruitment process:
Recruitment screening and experience survey (approx. 1 hour)
Profile-specific offline assignment (approx. 2–3 evenings of concentrated work, depending on experience)
Client technical interview (approx. 1 hour)
Client team interview (approx. 1 hour)
Offer meeting (approx. 30 minutes)
Spyrosoft is an authentic, cutting-edge software engineering company, established in 2016. We have been included in the Financial Times ranking of the 1000 fastest-growing companies for three consecutive years: 2021, 2022 and 2023.