- 💰 Salary: €7,900 – €9,000/month
- 📍 Location: 100% Remote
- 🕒 Type: Full-time
- ☑️ Contract type: B2B
Are you a passionate Site Reliability Engineer? We’re hiring for a company specializing in distributed systems, content delivery, and video streaming at scale. This fast-growing tech company is transforming in-transit entertainment with an intelligent caching platform that lets airlines and cruise lines deliver personalized, high-quality video content, even without internet access. Join a global team building the next-generation content delivery system for aircraft and maritime environments.
- 3+ years of experience in Site Reliability Engineering, DevOps, or Cloud Infrastructure roles
- Strong hands-on experience with:
  - Kubernetes (Helm, Operators, workload and networking management)
  - Terraform and other Infrastructure-as-Code tools
  - Containers and orchestration at scale
  - CI/CD systems (e.g., GitLab CI, Argo CD)
  - Observability tools like Prometheus, Grafana, Loki, Alertmanager
- Familiarity with service meshes such as Istio or Linkerd
- Deep understanding of networking protocols (TCP/IP, HTTPS, DNS, QUIC)
- Experience with distributed systems principles (consistency, fault tolerance, horizontal scaling)
- Ability to diagnose and resolve production issues effectively in high-availability systems
- Bachelor’s degree in Computer Science or equivalent professional experience
- Experience with performance tuning in Go, Rust, or C/C++
- Background in content delivery, video/media platforms, or caching technologies
- Proven contributions to reliability engineering or developer platform improvements
- Design, deploy, and maintain Kubernetes-based infrastructure using Terraform and Infrastructure-as-Code principles
- Lead software deployment efforts for two new international content delivery sites
- Build and optimize observability systems (metrics, logging, alerting) to monitor service health and performance
- Collaborate with engineering teams to develop and automate CI/CD pipelines (GitLab CI, Argo CD, etc.)
- Operate and improve service mesh technology (e.g., Istio) to ensure secure, reliable service-to-service communication
- Troubleshoot production systems with a focus on distributed services and networking (HTTP/S, DNS, QUIC)
- Contribute to post-incident reviews, root cause analyses, and long-term stability initiatives
- Participate in on-call rotations for incident response and site uptime
We are a technology consulting company and recruitment agency, delivering software solutions to clients in Europe and the US. We work 100% remotely in an international team, with people from Asia, London, and San Francisco. We employ people with experience in international corporations as well as students from the best technical and business universities.
Find out more: https://devsdata.com