Operations Specialist Storage
Key Responsibilities:
Provide Tier-3 operational ownership for Storage Products for Local Production.
Handling of complex incidents, deep troubleshooting, and root cause analysis; drive permanent fixes and preventive measures.
Ensuring operational readiness for storage changes
Monitoring/alerting coverage, performance baselines, hardening, patch strategy, rollback and recovery procedures, runbooks.
Executing and improving standard operational procedures through automation (reduce toil, improve MTTR and stability).
Automation of standard operational tasks (capacity checks, validation procedures, provisioning workflows where applicable).
Ensure operational readiness for deployments
Validation of deployment artifacts from an operations perspective.
Defining and enforcing quality assurance measures (e.g. required documentation of standard operating procedures, successful test reports, …) to ensure the high quality of delivered products and services.
Ensuring rollback strategies and operational monitoring (observability) are in place for production deployments.
Ensure operational stability and responsiveness for the managed Kubernetes platform
Monitoring system health, performance metrics, and service availability across multi-tenant environments.
Identifying, analyzing, and resolving incidents, minimizing service disruption.
Triggering root cause analysis and implementation of corrective and preventive actions.
Reduce operational toil and improve service reliability
Addressing recurring operational issues by automating remedial standard operating procedures.
Validating all automated procedures following the established software development lifecycle, including staging, testing, and validation reviews.
Ensure platform operations adhere to security and compliance standards
Implementing monitoring and logging strategies to support audit and compliance requirements.
Performing routine security scans and remediating identified vulnerabilities.
Profile Requirements:
Willingness to travel on business trips to Germany for up to 50% of the time.
5+ years in IT storage operations / service delivery / platform operations with demonstrated leadership in mission-critical environments.
Proven experience implementing/leading Incident, Problem, Change, and Release governance in production.
Experience supporting platform workloads that rely on shared storage services.
Expertise with storage types: File Storage, Block Storage, Object Storage.
Expertise with protocols/services: NFS; object storage operations (S3-like concepts).
Experience with Kubernetes storage integration: CSI driver concepts and troubleshooting (PV/PVC lifecycle understanding).
Virtualization (Storage): Experience operating storage virtualization in enterprise environments.
Expertise within ITSM: Jira Service Management (JSM), Jira, Confluence.
Fundamental understanding of core operations processes (incident management, change management, problem management, IT Service Management) as well as SRE concepts.
Experience in gathering operational insights from monitoring or observability including SLI/SLA/SLO management and tracking.
Hands-on experience in documenting procedures properly and enforcing clear runbooks or playbooks.
Observability: hands-on experience with monitoring and logging tools (e.g., Prometheus, Grafana, Datadog, Mimir, Loki).
Familiarity with enterprise DevOps toolchains is a plus (GitLab, JFrog Artifactory, Backstage, Harness).
Strong understanding of modern platform operations (Kubernetes/containers, automation, observability), sufficient to govern specialists.
Platform delivery concepts: GitOps and IaC awareness (Terraform/OpenTofu, ArgoCD, Helm) to govern deployment/readiness standards.
Proficiency in both speech and writing in English (at least C1).
Proficiency in both speech and writing in German (at least C1).
Preferred experience:
Experience operating in regulated / high-availability industries (banking, telco, public sector, healthcare).
Experience with SRE practices (SLOs/SLIs, error budgets) and reliability management.
Experience operating storage services that integrate with Kubernetes platforms.
Familiarity with IaC-based provisioning and GitOps-driven operational patterns.
Perks and benefits:
Comprehensive benefits - enjoy Udemy for Business, private medical care, Multisport card, veterinary package, language lessons, and shopping vouchers.
Career growth - access opportunities for professional development and learning, including perks related to our official partnerships with global IT giants: Microsoft, AWS, Snowflake, Salesforce & more.
Global collaboration - work with a diverse, international team.
Innovative environment - be part of a forward-thinking and growth-oriented workplace.
Engaging community - work with passionate professionals and participate in team-building events, hackathons, and CSR initiatives to make an impact beyond work.
Team-building events including our company tradition (annual company event in Mazury).
A pleasant surprise to start your journey with us in the form of a welcome pack.