AI Service Reliability Engineer (SRE)
Krakow, Kraków
DCV Technologies
📌 On behalf of our client, an American multinational company that develops and sells network equipment, we are looking for an AI Service Reliability Engineer (SRE).
📄 Contract: B2B
📍 Location: Hybrid, based in Krakow: either 3 days per week in the office, or occasional office visits 1–3 times per month.
We’re looking for a Senior Platform Engineer / AI Service Reliability Engineer to drive the secure enablement, onboarding, and support of AI assistant tools. You’ll work at the intersection of infrastructure, security, and developer enablement to support platforms like Cursor, Windsurf, GitHub Copilot, and OpenAI Codex, ensuring they are accessible, reliable, and compliant at scale.
In addition to managing core infrastructure and observability platforms, this role will play a critical part in onboarding users, resolving support issues, and expanding adoption of AI tools across business units, partnering closely with InfoSec, Legal, and IT.
What You’ll Do
Lead the enterprise enablement of AI coding assistants: manage onboarding flows, user provisioning, security reviews, and platform integration.
Act as the primary point of contact for support and issue resolution, ensuring AI tools are reliable and usable across different user groups.
Own the reliability, scalability, and security of multi-tenant Kubernetes/OpenShift clusters (dev/stage/prod).
Design and manage internal platforms and workflows: Operators, Helm/Kustomize, GitOps (e.g., Argo CD), ingress/egress, quotas, and RBAC.
Operate and optimize Redis, PostgreSQL, and MongoDB with HA, backup/restore, replication, and tuning.
Build end-to-end observability stacks: Prometheus, Grafana, Splunk, alerting, SLOs/SLIs, and runbooks.
Automate operations and developer workflows using Bash, Python, Terraform, and Ansible.
Define and enforce security controls.
What You’ll Bring
6+ years in SRE, Platform, or DevOps roles with production support and automation experience.
Deep experience with Kubernetes and/or OpenShift and GitOps principles.
Hands-on experience with Redis, PostgreSQL, and MongoDB in production environments.
Experience building observability and support tooling for large-scale systems.
Strong scripting skills in Bash and Python; experience with CI/CD tools like GitHub Actions or Jenkins.
Background in security and compliance, especially around onboarding third-party tools and data privacy controls.
Excellent problem-solving and communication skills for handling user support, cross-team collaboration, and stakeholder alignment.
Proven ability to manage internal enablement efforts, onboarding large numbers of users and driving tool adoption across diverse teams.
Nice to Have
Experience operating within environments or similar enterprise networks.
Familiarity with enterprise egress controls, CASB, DLP, and proxy management.
Experience with PostgreSQL HA (Patroni), Redis Sentinel/Cluster, and MongoDB Atlas.
Background in supporting air-gapped or VPN-constrained data migration scenarios.
Exposure to Click/Typer, Argo CD, Terraform, Ansible, and policy-as-code frameworks like OPA.
📩 If you’re interested and meet the qualifications, please send your CV to Alina Pchelnikova at alina.pchelnikova@dcvtechnologies.co.uk