Site Reliability Engineer
Ref: 13949
Some careers shine brighter than others.
If you’re looking for a career that will help you stand out, join HSBC, and fulfil your potential. Whether you want a career that could take you to the top, or simply take you in an exciting new direction, HSBC offers opportunities, support and rewards that will take you further.
Your career opportunity
HSBC is the largest bank in Europe and is also one of the largest investment banks in the world. MSS IT provides technology solutions for its Global Banking and Markets business worldwide.
Counterparty Credit Risk (CCR) management is a business-critical function within the bank. The CCR IT team provide the technology to enable the Counterparty Credit Risk exposure calculation across thousands of HSBC clients globally every day.
We are part way through a multi-year plan to build the next generation of Counterparty Credit Risk engines, which involves migrating to cloud platforms and replacing vendor software with an in-house-developed analytics library. The new counterparty credit risk engine comprises microservices built on the latest open-source infrastructure and runs on Google Cloud Platform and on-premises infrastructure.
Technologies used by the team include Java SE, Spring Boot, Spring Cloud, Apache Beam, Apache Flink, GCP, Redis, REST APIs, Ansible, Jenkins.
HSBC and Traded Risk are investing heavily in an Agile culture, adopting DevOps processes, CI/CD pipelines and cloud technologies. Traded Risk CCR IT is looking to establish a brand-new development team in Krakow in 2023 as part of a long-term strategy to develop and support its platform in Europe.
This is an exciting opportunity to join a team in its early stages and make a key contribution.
What you’ll do
- Manage application support operations, focusing on resiliency, availability, and monitoring system health and performance.
- Coordinate resolution of production incidents, conducting post-mortem/RCA to identify root causes and improve processes.
- Investigate, triage, and resolve production incidents with a focus on technical signals and root cause analysis.
- Document post-incident recovery steps, contributing to process improvements, identifying deviations, and creating a Knowledge Base.
- Actively participate in the service management community, engaging in Incident Management, Problem Management, and Service Delivery.
- Define and deliver tactical and strategic service improvements across the technical and process landscape.
- Apply SRE principles to continuously improve platform reliability, capacity, and performance, reducing toil and enhancing observability.
- Develop observability tools and techniques for monitoring, alerting, incident detection, response, capacity management, and release safety.
What you need to have to succeed in this role
- 4+ years of experience developing and supporting distributed systems written in Java.
- Experience of Disaster Recovery methods and processes.
- A methodical approach to troubleshooting and problem-solving.
- Experience with application lifecycle management tooling: JIRA/Confluence, Ansible, and vulnerability remediation.
- Experience implementing and managing logging, monitoring and alerting frameworks for hybrid cloud using tools such as Geneos, Grafana, Prometheus, Splunk, Loki or similar.
- Understanding of relational databases (RDBMS), cloud technology, Unix/Linux, and job scheduling tools such as Control-M or AutoSys.
- Ability to lead technical conversations with various technical groups.
- Excellent communication skills and experience working in Agile methodology.
What we offer
- Competitive salary
- Annual performance-based bonus
- Additional bonuses for recognition awards
- Multisport card
- Private medical care
- Life insurance
- One-time reimbursement of home office set-up (up to 800 PLN)
- Corporate parties & events
- CSR initiatives
- Nursery discounts
- Financial support for training and education
- Social fund
- Flexible working hours
- Free parking
If your CV meets our criteria, you should expect the following steps in the recruitment process:
- Online behavioural test (for external candidates only)
- Telephone screen (for external candidates only)
- Job interview with the hiring manager
We are looking to hire as soon as possible, so don’t wait: apply now!
You'll achieve more when you join HSBC.