Platform/Site Reliability Engineer (SRE)

DevOps

Poznań +4 locations

DCV Technologies

Full-time

B2B

Senior

Remote

Job description

We are looking for a DevOps Engineer on behalf of our client.

Remote from Poland
B2B

📌 The Platform Reliability Engineer is responsible for ensuring the reliability, performance, and availability of our critical platforms: Kong (API Management), Solace (Messaging), Mulesoft (iPaaS), and Informatica (ETL).

This role applies Site Reliability Engineering (SRE) principles — including automation, monitoring, and continuous improvement — to proactively identify and resolve potential issues, optimize platform performance, and collaborate with cross-functional teams to deliver exceptional service reliability.

This role requires a deep understanding of distributed systems, cloud technologies, and a passion for building resilient and scalable platforms.

The consultant will work closely with various platform teams in the Integration space and report directly to the Enterprise Integration Manager.

Platform Reliability & Performance (SRE Focus)

Ensure the reliability and availability of the Kong, Solace, Mulesoft, and Informatica platforms, applying SRE principles of automation, monitoring, and continuous improvement.
Proactively identify and resolve potential issues before they impact production environments, using data-driven insights and predictive analysis.
Develop and implement comprehensive monitoring and alerting systems to ensure platform health and performance.
Collaborate with the Support team and conduct thorough post-incident reviews with the goal of continuous improvement of platform reliability.
Conduct root cause analysis (RCA) for incidents and implement preventative measures, focusing on automation and systemic solutions.
Collaborate with development, operations, and security teams to ensure smooth platform operations, promoting a culture of shared responsibility for reliability.
Take ownership of platform SLAs and SLOs, ensuring they are met or exceeded, and proactively identify opportunities for improvement.
Evaluate and implement new tools and technologies to improve platform reliability and efficiency, staying up to date with the latest SRE trends and technologies.
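
As an illustration of what owning SLAs and SLOs can involve in practice, here is a minimal sketch of an error-budget calculation. The function name, the 99.9% target, and the request counts are assumptions made for this example; they are not figures from the client.

```python
def error_budget_remaining(slo_target: float, total_requests: int,
                           failed_requests: int) -> float:
    """Return the unspent fraction of the error budget (negative if overspent).

    slo_target is the availability objective as a fraction, e.g. 0.999.
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    if total_requests <= 0:
        # No traffic in the window, so none of the budget was consumed.
        return 1.0
    # Number of failures the SLO tolerates over this window.
    allowed_failures = (1.0 - slo_target) * total_requests
    return 1.0 - failed_requests / allowed_failures


# Hypothetical window: 1,000,000 requests, 250 failures, 99.9% SLO.
remaining = error_budget_remaining(0.999, 1_000_000, 250)
print(f"Error budget remaining: {remaining:.0%}")
```

A negative result means the window has already exceeded its budget, which is usually the signal to prioritize reliability work over new changes.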

Chaos Engineering & Resilience

Design, implement, and execute chaos engineering experiments to proactively identify weaknesses and vulnerabilities in integration platforms.
Develop and maintain a chaos engineering framework to systematically test platform resilience under various failure scenarios.
Analyze chaos experiment results and collaborate with engineering teams to implement improvements to enhance platform resilience.
Participate in designing and implementing fault-tolerant and self-healing systems.

Disaster Recovery & Business Continuity

Collaborate with DevOps engineers to develop, maintain, and test disaster recovery plans for the integration platforms.
Participate in disaster recovery exercises to validate plan effectiveness and identify areas for improvement.
Ensure disaster recovery plans align with business continuity requirements.
Implement and maintain backup and recovery procedures for critical platform components.

Upstream/Downstream Dependency Management

Analyze the integration platforms' dependencies on other systems (e.g., API Gateway, backend services) and assess how their reliability affects the overall service.
Implement monitoring and alerting for issues in upstream and downstream systems that could affect integration platforms.
Collaborate with other teams to improve the reliability and performance of dependent systems.
Design and implement strategies for handling failures in dependent systems, such as circuit breakers, retries, and fallbacks.
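
The failure-handling strategies named above can be sketched roughly as follows. This is a simplified circuit breaker with a fallback, not a description of the client's implementation; the class name, thresholds, and the injectable `clock` parameter are assumptions made for illustration, and retries are left out to keep the example short.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitBreaker:
    """Opens after max_failures consecutive errors; allows a trial call
    once reset_after seconds have passed (half-open state)."""

    def __init__(self, max_failures: int = 3, reset_after: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_failures = max_failures
        self.reset_after = reset_after
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None

    def is_open(self) -> bool:
        if self._opened_at is None:
            return False
        # After the cool-down, report "not open" so one trial call goes through.
        return self._clock() - self._opened_at < self.reset_after

    def call(self, func: Callable[[], T], fallback: Callable[[], T]) -> T:
        if self.is_open():
            # Fail fast: do not touch the unhealthy dependency.
            return fallback()
        try:
            result = func()
        except Exception:
            self._failures += 1
            if self._failures >= self.max_failures:
                self._opened_at = self._clock()
            return fallback()
        # Success closes the breaker and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

A caller would wrap each request to a dependent system, for example `breaker.call(fetch_from_backend, serve_cached_response)`, where both function names are hypothetical.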

Collaboration & Communication

Work closely with the Support team to address platform-related issues and improve support processes, providing them with tools and knowledge to resolve issues efficiently.
Collaborate with Platform Engineers to optimize platform architecture and infrastructure, ensuring alignment with SRE best practices.
Partner with the Product Owner to define and communicate platform reliability metrics and performance to stakeholders through clear dashboards and reports.

Performance Optimization

Monitor platform performance and identify areas for optimization using performance profiling and load testing techniques.
Conduct performance testing and tuning to ensure optimal resource utilization and eliminate bottlenecks.
Collaborate with development teams to optimize application performance and provide guidance on best practices.
Implement caching strategies and other techniques to improve responsiveness and reduce latency.

Documentation and Knowledge Sharing

Create and maintain comprehensive documentation for daily activities, platform architecture, configuration, and operational procedures.
Ensure documentation is up to date and accessible.
Share knowledge and best practices with the team, fostering a culture of learning and collaboration.

Qualifications

Bachelor’s degree in Computer Science, Engineering, or a related field.
5+ years of experience in a similar role focused on platform reliability and operations, ideally within an SRE environment.
Strong understanding of Kong API Gateway, Solace PubSub+, Mulesoft Anypoint Platform, and Informatica PowerCenter.
Experience with cloud platforms such as AWS, Azure, or GCP.
Proficiency in scripting languages such as Python, Bash, or Go.
Experience with infrastructure-as-code tools such as Terraform or Ansible.
Experience with monitoring and alerting tools such as Datadog.
Strong understanding of networking concepts and protocols.
Excellent problem-solving and troubleshooting skills.
Excellent communication and collaboration skills, with the ability to communicate technical concepts clearly.
Strong understanding of SRE principles and practices.
Experience with containerization (Docker, Kubernetes).
Experience with CI/CD pipelines and automation tools.
Relevant certifications (e.g., AWS Certified DevOps Engineer, Azure DevOps Engineer Expert, Google Cloud Professional Cloud Architect).
Experience with Agile development methodologies.

📩 If you’re interested and meet the qualifications, please send your CV to Alina Pchelnikova at alina.pchelnikova@dcvtechnologies.co.uk

Tech stack

AWS: regular
Azure: regular
GCP: regular
API: regular
Terraform: regular
Agile: regular
Python: regular
