DCG
Hi, this is DCG! We are recruitment experts, and the IT industry is our specialty. That said, we also gladly apply our skills to recruitment projects for positions in other sectors, such as finance, HR, marketing, and many more. We value good relationships and will always be on your side. We will prepare you for the interview with our Client so that you can show yourself at your best. Let's take the next step in your career together. We can't wait to meet you! 😊
DCG is a modern technology company that brings together IT professionals. Due to our continuous growth and the large number of recruitment projects we carry out for our Partners, we are looking for a person for the position of SRE Manager.
DESCRIPTION
A Site Reliability Engineering (SRE) Sr. Manager leads a team of SRE engineers, focusing on designing, implementing, and maintaining highly reliable and scalable systems. They emphasize operational excellence, monitoring, automation, and collaboration with development teams to ensure the stability and performance of critical applications and infrastructure across the organization. Combining strong technical expertise with leadership skills, they drive continuous improvement and maintain high service availability.
RESPONSIBILITIES
Team Leadership:
- Recruit, hire, and develop a high-performing SRE team.
- Provide mentorship and coaching to junior FTE SRE engineers within the client's teams.
- Set clear goals and expectations for the team.
- Develop and track objectives and key results (OKRs) for the team as a whole and for individual team members.
Technical Strategy:
- Define and implement SRE best practices, standards, and processes.
- Own end-to-end availability and performance of key services and build automation to prevent problem recurrence.
- Deliver end-to-end automation using Terraform within Google Cloud for tasks such as creating a new project, adding a user to an existing project, requesting access to a new service, and enabling a new Google service in an existing project.
- Expert knowledge of IAM, roles, and permissions within Google Cloud.
- Design various user roles that consider both security and user experience.
- Design and build monitoring systems to identify potential issues proactively.
- Establish Service Level Objectives (SLOs) for all offered services.
- Manage cloud expenses within the budget.
Collaboration:
- Bring a customer-obsessed attitude and create a seamless user experience for any team requesting infrastructure services.
- Work closely with development teams to identify potential reliability issues early in the development cycle.
- Collaborate with security teams to maintain system security and compliance.
- Excellent written and verbal communication skills.
Performance Optimization:
- Analyze system metrics to identify performance bottlenecks and opportunities for improvement.
- Implement capacity planning strategies to ensure system resilience under high load.
- Continuously monitor and optimize system performance.
REQUIREMENTS
OFFER