Relout is a place created by ambitious people with a passion for technology. We work on international projects with clients from various industries, helping startups, software houses, and enterprises transform and scale their businesses. We’re a boutique consulting (https://relout.team) & technology (https://relout.cloud) partner that builds the foundation to scale for our clients’ success. Our mission is to connect best-in-class, passionate engineers with fast-growing digital & technology companies.
We're looking for a Site Reliability Engineer to join a long-term project with one of our clients, a well-established provider of safety solutions for the logistics sector, operating mainly in the US and Canada. The SRE will join the Cloud Engineering & Infrastructure team, work with a medium-sized R&D organization (around 100 engineers), and be responsible mainly for the core platform running a SaaS product on top of Azure and Kubernetes.
We're seeking a highly skilled individual with technical expertise in SRE as well as strong communication skills and the ability to drive initiatives. The candidate will play a crucial role in bridging the gap between operations and business stakeholders, managing discussions related to observability and reliability, and driving SRE initiatives. The ability to effectively implement SLOs and SLIs, build useful dashboards, work closely with development teams on implementation, and supervise escalations to L1/L2 teams will be key to fulfilling this role.
Our client operates in US time zones (EST) but remains flexible in terms of working hours, requiring only a 2-3 hour overlap for business meetings and syncs (between 15:00 and 18:00 Polish time).
🔷Responsibilities:
- Drive the SRE discipline: Take ownership of SRE practices, serving as the primary advocate and driving force behind the company reliability strategy.
- Implement SLIs and SLOs: Collaborate with stakeholders to define and implement service-level indicators, objectives, and error budgets.
- Build centralized dashboards: Develop tools to enhance visibility into system performance, identify trends, and improve overall observability.
- Oversee incident management processes: Collaborate with the 24/7 incident support team, assist in drafting and refining SOPs, ensure RCA processes are followed, and handle escalations as needed.
- Drive cross-team collaboration: Act as the bridge between Cloud Engineering, Infrastructure Development, and other technical teams to ensure alignment on objectives and seamless execution of SRE initiatives.
- Collaborate with senior management: Prepare and deliver clear, business-oriented presentations that justify investments, outline strategic priorities, and demonstrate measurable progress on reliability metrics.
🎯Requirements:
- Minimum 3 years of experience in Site Reliability Engineering (SRE) or similar roles
- Proficiency in utilizing monitoring and observability tools such as Prometheus, Grafana, Elasticsearch and OpsGenie
- Ability to design and implement centralized dashboards for system performance monitoring and analysis.
- Experience in drafting and executing SOPs and RCA processes
- Experience in working with large-scale distributed systems, microservices, and cloud-native architectures
- Proven ability to guide cross-functional engineering teams and collaborate with stakeholders at various organizational levels.
- Nice to have: Proven track record in monitoring cloud infrastructures, including Kubernetes and Azure (PaaS and IaaS)
- Familiarity with monitoring databases like MongoDB, CosmosDB, MySQL or SQL Server.
- Exceptional verbal and written communication skills in English
- Communication skills to present technical ideas and plans in a way that aligns with organizational goals
⭐What we offer:
- Payment in either USD or PLN (we have a branch in Poland)
- Regular company events and team-building meetups
- Recurring Fun budget to spend on anything that makes you happy (team activities encouraged!)
- Educational budget to spend on certifications, training, and conference attendance
- Ability to access & use coworking office spaces in every major city in Poland
- Attractive referral programs
- Unlimited legal advice & support with B2B partnership and self-employment
- Missing anything you like? Luxmed, Multisport? Ask us about it!
Apply now! Join the Relout Team today! 📢