Head of DevOps / SRE + AI Ops

DevOps

Powstańców Warszawy 6, Sopot

Autopay S.A.

Full-time
Permanent, B2B
Senior
Hybrid

Job description

About the company

Autopay Global is the newest member of the Autopay family. It aims to bring the group's state-of-the-art payment integration and payment data technologies to the international market, providing seamless integration with local PSPs, support for multiple currencies and compliance with local frameworks. We take a forward-looking approach to our products, and we value creativity, passion and the drive to turn the newest achievements in technology to our advantage.

 

To support our dynamic expansion, we are looking for a new Head of Operations (DevOps / SRE + AI Ops) for a full-time, hybrid position in Warsaw or Gdańsk.

About the role

The Head of DevOps / SRE + AI Ops owns production reliability, delivery infrastructure, and operational governance across AWS and GCP, with a specific focus on low-latency decisioning and high-integrity payment flows.

Your responsibilities will be to:

  • define and operationalize SLOs for critical services,
  • own incident management: on-call structure, alerting standards, runbooks, postmortems, and corrective action tracking,
  • build and maintain the multi-cloud runtime platform (EKS/GKE or equivalent),
  • establish a paved road for engineers: standardized service templates, CI/CD pipelines, IaC modules, environment management, and production readiness checks,
  • implement cloud-native observability across AWS and GCP, including metrics, logs, alarms and dashboards, using each platform's native tools,
  • establish unified cross-cloud telemetry conventions,
  • integrate and use the Dynatrace platform where suitable,
  • drive governance, monitoring, and optimization of integrations using MuleSoft Anypoint, enabling secure and reliable connectivity with external services,
  • harden security posture: least-privilege IAM, key management, perimeter controls and secure CI/CD,
  • own multi-cloud cost controls and capacity planning for peak campaign traffic and paid media bursts,
  • lead AI Ops: implement model/agent versioning, evaluation suites, rollout and rollback; monitor model/agent quality and enforce guardrails,
  • partner with Engineering, Data, Security, and Compliance to meet SOC 2 and PCI-aligned operational controls,
  • hire and lead a high-performing AI Ops organization.

What tools will you be working with?

  • Technology: AWS (EKS, ECR, MSK, CloudWatch, X-Ray), GCP (GKE, GCS, BigQuery, Pub/Sub), OpenTelemetry, Dynatrace, ML/LLM inference or agent runtimes,
  • Nice to have: experience with MuleSoft Anypoint and multi-cloud data movement (AWS and GCP).

Requirements and skills we are looking for in a person hired for this role:

  • 10+ years in DevOps/SRE/platform engineering with startups, including at least 3-5 years leading SRE/DevOps teams for multi-service production systems in payments or banking,
  • proven experience in IT audits and compliance, including industry standards (e.g., PCI DSS) and regulatory requirements from banking/payment authorities,
  • hands-on expertise running Kubernetes in production (EKS and/or GKE) with strong networking fundamentals (VPC design, private connectivity, TLS, DNS),
  • deep experience with cloud-native observability (metrics, logs, tracing) and building actionable alerting and on-call hygiene,
  • proven implementation of SLOs, error budgets, capacity planning, and DR for high-availability services,
  • strong security engineering instincts: IAM design, secrets management, encryption, secure CI/CD, and audit logging,
  • experience operating real-time data systems (Kafka/MSK/Confluent, Pub/Sub, stream processing) and API-heavy platforms,
  • operating ML/LLM inference or agent runtimes in production: rollout/rollback, safe configuration, monitoring and alerting,
  • implementing evaluation gates for models/agents (offline regression, golden sets, canaries) and closing the loop with production feedback signals,
  • monitoring for drift and regression (data drift, embedding drift, tool-call failure rates, latency regressions) and establishing kill switches for rapid containment,
  • predictive capacity and resource optimization leveraging AI/ML.

What do we offer?

  • a leadership role in a fast-growing, global fintech company,
  • the opportunity to work with cutting-edge tools and technologies,
  • independence in decision-making,
  • friendly working environment, team support, no dress code.

 

Join us and let's head together where no one has gone before!

 

Tech stack

  • DevOps: advanced
  • AWS: advanced
  • Machine Learning: advanced
  • Kubernetes: advanced
  • CI/CD: advanced
  • AI: advanced
  • Kafka: advanced
