We're building a Platform Team responsible for all Kubernetes-related work at AVSystem, including installing and supporting Kubernetes-based platforms in our clients' production environments.
You will be a member of the Platform Team at AVSystem, providing guidelines and support for the efficient use of Kubernetes across the whole company. Because this is an ever-evolving field, we decided it is best to have a dedicated team of experts in infrastructure-as-code principles and observability.
Responsibilities
- Manage Kubernetes infrastructure, including deployment, configuration, and scaling
- Design and maintain high-availability and reliability strategies
- Develop observability tools and monitoring solutions
- Drive automation through Infrastructure as Code practices
- Provide expert support for Kubernetes-related incidents
- Develop and deploy Custom Resources and Operators
- Implement security best practices for Kubernetes clusters
- Continuously improve platform performance and efficiency
- Conduct training and facilitate knowledge sharing
- Integrate Kubernetes with cloud services and service mesh solutions
- Collaborate on capacity planning and scaling strategies
- Participate in incident response and root cause analysis
Requirements
- 2+ years of experience with Kubernetes (ideally including Custom Resources and Operators)
- 3+ years of experience in a Software Engineering, Site Reliability Engineering, or DevOps role
- Strong working knowledge of Linux and containerisation
- Troubleshooting skills that span applications, networking, and systems
- Solid knowledge of time series data collection, processing, storage, and visualisation, using tools such as Prometheus and Thanos
- Experience in environments that apply infrastructure-as-code principles using provisioning, configuration management, and GitOps tools (Terraform, Ansible, ArgoCD, Flux, Rancher Fleet)
Nice to have
- Proficiency in one or more of the following: Go, Python, Shell scripting
- Practical experience building infrastructure on a cloud provider (GCP preferred) and/or on bare metal
- Experience with a service mesh such as Istio or Cilium Mesh
What we offer
- Freedom and responsibility – a high level of autonomy to make critical decisions, directly impacting the success of our platform and clients
- A culture that encourages innovation, where your ideas for process improvement and tool development are valued
- A team of highly skilled and humorous colleagues
- Collaborative and supportive work environment, embracing agile methodologies for effective teamwork
- Access to the best tools and equipment available on the market
- A MacBook Pro / ThinkPad with 2 monitors
- Company events and team building activities
- Flexible working hours and remote work when you need it
- Training and conferences
- Multisport card
- Medical care
- Kitchen full of snacks and treats (including Good Lood ice cream)
- Car parking area and bike room
- A relaxed work atmosphere – no dress code, no open space