Solution Architect – Site Reliability (SRE) & Observability
plac Trzech Krzyży 10, Warszawa +1 Location
ERGO Technology & Services
About Us
ERGO Technology & Services S.A. (ET&S S.A.) was established in January 2021 following the integration of ERGO Digital IT and Atena into one entity, leveraging both companies’ strengths and best practices. As a part of ERGO Technology & Services Management AG, the technology holding of ERGO Group AG, we support millions of internal and external customers with state-of-the-art IT solutions to everyday problems.
In October 2022, ET&S S.A. expanded its scope of operations by creating a Business Services unit to contribute in a new way to the growth of ERGO’s business. Acting as a co-partner and internal consultant, it adds non-IT value and supports the development of the entire ERGO Group, currently offering skills in reporting, analysis, actuarial, and input management. We are committed to fostering innovation and meeting the evolving needs of our clients worldwide.
Discover how we implement AI, IoT, Voice Recognition, Big Data science, advanced mobile solutions, and business-related services to anticipate and address our customers’ future needs.
About the role
As a Solution Architect, you will be responsible for defining the strategic direction of the Site Reliability Engineering (SRE) service, including observability and monitoring. This role focuses on architectural decisions, integration design, and best practices, as well as on advising SRE engineers and consulting customer teams on how to automate their service operations and use observability tools (e.g. Datadog) effectively.
How you will get the job done
defining the strategic vision for site reliability engineering, observability, and platform engineering, and planning tactical steps for implementation
leading the design and governance of automated service operations and observability tooling, ensuring scalability, security, and cost efficiency
scouting and analysing new observability features – matching them to business needs and notifying the engineers about potential improvements
designing collaboration, automation and integration models
defining standards and best practices for automated service operations and the observability framework, including alerting, SLOs, and distributed tracing across digital products
configuring, integrating, administering, and maintaining observability for all relevant digital products, using Infrastructure as Code (IaC)
ensuring comprehensive monitoring coverage across digital products
supporting, advising, and coaching SRE engineers on the best ways to automate service operations and use observability tools
supporting SRE engineers in troubleshooting and optimizing monitoring configurations
guiding and mentoring engineers in implementing provisioning and configuration of observability tools using Infrastructure as Code
engaging with the observability tool vendors to discuss complex technical issues and feature enhancements
answering technical questions from product teams
negotiating technical aspects of observability tools during procurement discussions to ensure optimal setup
Skills and experience you will need
fluency in English
strong Site Reliability Engineering (SRE), Platform Engineering and Observability Architecture experience
expertise in observability tools (architecture, governance, integrations, APM, security best practices) and automating service operations
strong Infrastructure as Code (IaC) knowledge and experience (e.g. Terraform)
experience designing log management, APM, infrastructure monitoring, and synthetic testing solutions
knowledge of distributed tracing, metrics, and telemetry collection
familiarity with cloud environments (Azure, Kubernetes, Databricks)
strong strategic thinking and vision-setting for observability and reliability
excellent stakeholder communication and coaching abilities
experience negotiating with vendors and external service providers
ability to lead and mentor engineers, ensuring effective implementation of observability tooling
Nice to have
German language proficiency
Perks & Benefits
Let's be healthy
Medical package, sports card, and numerous sports sections – these are some of the benefits that help our employees stay in good shape.
Let's be balanced
Work-life balance is a key aspect of a healthy workplace. We offer our employees flexible working hours, a confidential employee assistance program, as well as the possibility of remote working. However, staying at home with our in-office gaming room and dog-friendly office in Warsaw won’t be easy.
Let's be smart
We organize numerous workshops and training courses. Thanks to hackathons and meetups, our specialists share their expertise with others. Additionally, we have a wide range of digital learning platforms and language courses.
Let's be responsible
Each year, we participate in several CSR activities, during which, together with our colleagues, we do our best to create a better future.
Let's be fun
Company-wide bike races and soccer matches, film marathons in our cinema room, or other engaging team-building activities – we've got it covered!
Let's be diverse
Every team member is valued, regardless of gender, nationality, religious beliefs, disability, age, or sexual orientation or identity. Your qualifications, experience, and mindset are our greatest benefit!