Site Reliability Engineer SRE

4 133 - 8 266 USD gross per month - Permanent

Prosta 20, Warsaw

Cantor Fitzgerald

Full-time

Permanent

Senior

Hybrid

Job description

Company Overview:

Cantor Fitzgerald is a leading global financial services firm specializing in investment banking, capital markets, institutional equity and fixed income sales and trading, commercial real estate, and prime brokerage. With a legacy of over 75 years of financial innovation and integrity, Cantor operates across major financial centres worldwide, delivering excellence and trusted expertise to its clients.

About the Role

We are seeking a skilled and proactive Reliability Engineer to join our Messaging team, responsible for the stability, performance, and scalability of enterprise messaging platforms built on Solace PubSub+ software and appliances.

This role focuses on maintaining highly available, low‑latency messaging infrastructure supporting mission‑critical systems across both production and non‑production environments. The successful candidate will play a key role in operational reliability, observability, capacity planning, and continuous improvement, while also gaining exposure to proprietary messaging APIs and platforms.

Key Responsibilities

Administer, maintain, and support Solace PubSub+ appliances and software brokers across on‑premises and cloud environments
Provide production support for messaging‑related incidents, including root cause analysis and permanent remediation
Monitor system performance and availability using Prometheus, InfluxDB, and Grafana, proactively identifying and resolving issues
Configure, optimise, and support Solace deployments across WAN environments, ensuring secure, low‑latency message delivery
Collaborate closely with development, application support, and infrastructure teams to troubleshoot message flow and integration issues
Own capacity planning, scaling, and performance tuning of the messaging platform
Automate routine operational tasks and contribute to continuous improvement of reliability processes
Build and maintain monitoring dashboards, alerts, and metrics to provide deep visibility into messaging systems
Produce and maintain high‑quality documentation, including runbooks, topology diagrams, and configuration baselines
Support proprietary messaging APIs and components using C++, Java, Python, and C#
Provide support for proprietary caches and gateways integrating applications with the messaging layer

Skills & Experience Required

At least 3 years of hands‑on experience administering Solace PubSub+ messaging systems in an enterprise environment
Strong background in production support, ideally within a 24x7 or high‑availability environment
Solid understanding of distributed systems, WAN networking, latency management, and failover strategies
Proven experience with Prometheus and Grafana for monitoring and alerting
Strong troubleshooting skills related to message delivery, persistence, and topic routing
Experience with capacity management, performance tuning, and scalability of distributed platforms
Good knowledge of Linux/Unix operating systems
Scripting and automation skills using Bash and/or Python
Excellent analytical and problem‑solving skills with strong attention to detail
Clear and effective communicator, comfortable working with multiple technical teams

Desirable Skills & Experience

Experience with containerisation technologies such as Docker and Kubernetes
Familiarity with other messaging platforms (Kafka, RabbitMQ, IBM MQ)
Exposure to DevOps practices and CI/CD pipelines
Experience with cloud platforms such as AWS, Azure, or GCP, including cloud‑native Solace deployments

Personal Attributes

Highly motivated, proactive, and ownership‑driven
Comfortable working in a high‑availability, mission‑critical environment
Strong collaborator who works well across teams
Methodical, organised, and capable of handling multiple priorities
Curious and eager to learn new systems and technologies
Calm and effective under pressure

Why Join Us?

Work on low‑latency, high‑throughput messaging systems supporting mission‑critical trading and enterprise platforms
Join a highly skilled, multi‑disciplinary engineering team
Opportunity to work with a broad and modern technology stack
Further develop both infrastructure reliability and programming skills in a complex environment

Tech stack

English
Bash: regular
Linux / Unix: regular
Prometheus: regular
Grafana: regular
Ansible: regular
Solace PubSub+: regular
CI/CD: nice to have
Docker: nice to have
Kubernetes: nice to have
Kafka: nice to have
