DevOps Engineer (Observability)
The Opportunity
Join a high-performing, international team of six DevOps experts. This is not a "maintenance-only" role. You will have a seat at the table in designing, building, and scaling our next-generation observability and logging solutions from the ground up.
We believe in "Attitude First." If you are an ambitious engineer who thrives on collaboration, knowledge sharing, and solving complex distributed systems challenges, we want to grow with you.
Key Responsibilities
Architect & Build: Design and implement end-to-end observability solutions, including metrics, logging, tracing, and advanced alerting.
Platform Excellence: Operate and optimize high-scale monitoring platforms (Prometheus, Mimir, Grafana) and ELK stack logging infrastructure.
Infrastructure as Code: Define and maintain all observability systems using Terraform and Terragrunt.
Reliability Engineering: Ensure the scalability and performance of our systems while supporting incident detection and root cause analysis (RCA).
Collaborate: Work across domains with a team that values mentoring, transparency, and collective problem-solving.
Your Technical Core
Observability Expert: Solid hands-on experience with Prometheus, Grafana, and scaling tools like Thanos or Mimir.
Logging Architect: Proven experience managing enterprise-grade logging platforms (ELK stack or Loki).
IaC Ninja: Strong proficiency in managing infrastructure with Terraform and Terragrunt.
Cloud Native: Deep understanding of Kubernetes and the complexities of metrics/logs/traces in distributed systems.
Language: Full proficiency in English for seamless global collaboration.
Stand Out From The Crowd (Nice to Have)
Coding: Ability to automate and integrate using Python or Go.
CI/CD: Exposure to GitHub Actions and automated workflows.
Configuration Management: Experience with Puppet.
SRE Mindset: Understanding of Service Level Indicators (SLIs), Service Level Objectives (SLOs), and error budgets.