Senior DevOps / SRE (Platform Reliability Engineer)

DevOps
Centrum, Lisbon

emagine Polska

Full-time
Any
Senior
Remote

Job description

We are looking for a Senior DevOps / Site Reliability Engineer (SRE) to ensure the reliability, scalability, performance, and security of our platform and cloud infrastructure. You will play a key role in building and operating cloud-native systems, improving observability, automating operations, implementing SRE best practices (SLOs/SLIs), and supporting development teams to deliver highly available services.

Key Responsibilities

  • Design, implement, and maintain highly available and scalable infrastructure on AWS.

  • Own and improve the reliability of production systems using SRE principles (SLO, SLI, error budgets).

  • Build and manage CI/CD pipelines to support fast and safe software delivery.

  • Develop and maintain Infrastructure as Code (IaC) using Terraform, Ansible, CloudFormation, etc.

  • Manage and optimize container orchestration platforms (Kubernetes, Docker, Helm).

  • Implement and maintain monitoring, logging, and alerting solutions (Prometheus, Grafana, ELK, Datadog, Splunk).

  • Lead incident response, perform root cause analysis, and write postmortems to drive continuous improvement.

  • Improve system performance, capacity planning, scaling strategies, and disaster recovery processes.

  • Collaborate closely with development teams to improve deployment strategies and system resilience.

  • Implement security best practices (IAM, secret management, vulnerability scanning, patching).

  • Define operational standards, runbooks, documentation, and best practices for platform reliability.

  • Participate in on-call rotation and provide senior-level support for critical production issues.

Key Requirements

  • 5+ years of experience in DevOps / SRE / Cloud Infrastructure / Platform Engineering.

  • Strong expertise in Linux systems administration and troubleshooting.

  • Proven experience with Kubernetes in production environments.

  • Strong experience with CI/CD tools (GitLab CI, Jenkins, GitHub Actions, Azure DevOps).

  • Solid knowledge of Infrastructure as Code (Terraform highly preferred).

  • Experience with cloud platforms: AWS, Azure, or Google Cloud.

  • Strong understanding of networking fundamentals (TCP/IP, DNS, load balancing, reverse proxies).

  • Experience with observability tools: monitoring, metrics, logging, tracing.

  • Strong scripting skills (Bash, Python, or similar).

Nice to Have

  • Experience with additional cloud platforms (Azure, GCP).


Tech stack

  • English: B1

  • Microsoft Azure: advanced

  • Security: advanced

  • Cloud: advanced

  • Jenkins: advanced

  • CI/CD: advanced

  • TCP/IP: advanced

  • Splunk: advanced

  • Python: advanced

  • Operations: advanced

  • Microsoft Platform: advanced
