
Site Reliability Engineer
DevOps


46 - 62 USD/month, net per month - B2B
Type of work: Full-time
Experience: Senior
Employment type: B2B
Operating mode: Hybrid

Tech stack

    Linux / Unix (regular)
    Prometheus (regular)
    Grafana (regular)
    GCP (regular)
    Java (regular)
    Splunk (regular)

Job description

Site Reliability Engineer

📍 Kraków (Hybrid – minimum 2 days/week in the office)

💼 Employment type: B2B


Are you looking for an opportunity to join a high-impact project in a global financial institution that invests heavily in cloud, AI, and DevOps? We're building a new Site Reliability Engineering (SRE) team in Kraków to support a mission-critical Counterparty Credit Risk (CCR) platform, and we're looking for experienced engineers to join the journey.

As part of this role, you'll contribute to the stability, scalability, and observability of a high-volume, distributed platform operating on both Google Cloud Platform and on-prem infrastructure.


What you’ll do:

  • Ensure the reliability and high availability of production systems used in global credit risk management.
  • Monitor, detect, and troubleshoot incidents in distributed systems running in cloud and hybrid environments.
  • Implement observability tools (Grafana, Prometheus, Loki, etc.) and improve monitoring and alerting strategies.
  • Lead root cause analysis (RCA) and post-incident reviews to improve resilience and operational efficiency.
  • Collaborate with developers, DevOps engineers, and global support teams to implement SRE best practices.
  • Contribute to CI/CD automation, deployment pipelines, and security/vulnerability remediation.


What you need to succeed in this role:

  • 5+ years of experience supporting or developing distributed systems (Java-based environments preferred).
  • Hands-on experience with monitoring and logging tools: Grafana, Prometheus, Loki, Splunk, etc.
  • Solid understanding of Unix/Linux systems, cloud infrastructure (GCP preferred), and databases (RDBMS).
  • Experience with CI/CD tooling (such as Ansible, Jenkins, or GitHub Actions) and with vulnerability management.
  • Familiarity with job scheduling tools (e.g., Control-M or equivalent).
  • Strong communication skills and ability to drive technical discussions with multiple support teams.
  • Experience working in Agile/Scrum teams.


What we offer:

  • The chance to build and shape a new SRE team supporting a critical platform for global risk management.
  • Work with a modern technology stack: Java, GCP, Apache Beam, Spring Boot, and DevOps tooling.
  • Hybrid working model with at least 2 days/week in our Kraków office.
  • Flexible form of cooperation (B2B or Employment Contract).
  • Stable, long-term project with excellent opportunities for growth and learning.

📩 Interested? Apply now and take the next step in your career with a team that’s redefining reliability at a global scale.


To learn more about Antal, please visit www.antal.pl

 
