Senior DevOps Engineer (HPC)

DevOps

Grunwaldzka 472E, Olivia Prime A and Grunwaldzka 472F, Olivia Prime B, Gdansk

EPAM Systems

Full-time
Any
Senior
Remote

Job description

We are seeking a Senior DevOps Engineer to enhance our high-performance computing services and collaborate closely with the scientific community to optimize research computing.

Join our team to build and operate cutting-edge HPC capabilities using automation and infrastructure-as-code. Apply now to contribute to innovative computational solutions in a dynamic environment.

Responsibilities

  • Design, implement, and maintain robust platform infrastructure using Infrastructure as Code tools such as Terraform
  • Develop, deliver, and operate research computing services and applications
  • Apply Site Reliability Engineering principles to manage HPC service deployment, monitoring, and incident response
  • Solve complex technical problems related to HPC services and user applications
  • Manage large-scale HPC, HTC, or BC computing environments for optimal performance
  • Collaborate with scientific users to tailor HPC resources to research needs
  • Automate deployment processes to ensure consistency across HPC infrastructure
  • Maintain and administer large-scale cluster and server computing software such as Slurm, LSF, or Grid Engine
  • Develop and maintain monitoring dashboards using tools like Grafana and Prometheus
  • Work within a DevOps team environment following agile methodologies
  • Operate and utilize virtualized private cloud resources such as OpenStack
  • Administer large-scale parallel filesystems including Weka, GPFS, or Lustre
  • Use configuration management tools like Ansible, Salt, or Puppet to manage IT operations
  • Develop scripts and tools for HPC and DevOps platform operations using Bash and Python

Requirements

  • 3+ years of experience with DevOps processes and automation using Infrastructure as Code tools such as Terraform
  • Hands-on experience operating or engineering large-scale HPC or similar computing environments
  • Proven expertise in Linux system administration including TCP/IP networking and storage subsystems
  • Experience administering large-scale cluster management software such as Slurm, LSF, or Grid Engine
  • Knowledge of configuration management tools like Ansible, Salt, or Puppet
  • Experience working in agile DevOps teams
  • Ability to develop and maintain monitoring tools such as Grafana and Prometheus
  • Experience with scripting languages such as Bash and Python for automation and tool development
  • Strong experience managing virtualized private cloud environments like OpenStack
  • Scientific degree or equivalent experience in computationally intensive scientific data analysis
  • Proven ability to manage relationships with third-party suppliers
  • Upper-intermediate proficiency in English (B2+)

Nice to have

  • Experience with container technologies such as LXD, Singularity, Docker, or Kubernetes
  • Operation and configuration experience with public cloud platforms like AWS, Azure, or GCP
  • Experience with HashiCorp tools such as Vault, Consul, and Nomad
  • Development experience with programming languages such as Java, C++, Python, Ruby, or Perl
  • Experience with parallel filesystems like Weka, GPFS, or Lustre

We offer

  • We gather like-minded people:
    • Engineering community of industry professionals
    • Friendly team and enjoyable working environment
    • Flexible schedule and opportunity to work remotely within Poland
    • Chance to work abroad for up to 60 days annually
    • Business-driven relocation opportunities
  • We provide growth opportunities:
    • Outstanding career roadmap
    • Leadership development, career advising, soft skills, and well-being programs
    • Certification (GCP, Azure, AWS)
    • Unlimited access to LinkedIn Learning, getAbstract, and A Cloud Guru
    • English classes
  • We cover it all:
    • Stable income (Employment Contract or B2B)
    • Participation in the Employee Stock Purchase Plan
    • Benefits package (health insurance, multisport, shopping vouchers)
    • Strategically located offices featuring entertainment and relaxation zones, table tennis and football, free snacks, fantastic coffee, and more
    • Referral bonuses
    • Corporate, social and well-being events
  • Please note:
    • The set of bonuses might vary based on the role you apply for – specifics will be discussed with our recruiter during the general interview.
    • We will reach out to selected candidates exclusively.

EPAM is a leading global provider of digital platform engineering and development services. We are committed to having a positive impact on our customers, our employees, and our communities. We embrace a dynamic and inclusive culture. Here you will collaborate with multi-national teams, contribute to a myriad of innovative projects that deliver the most creative and cutting-edge solutions, and have an opportunity to continuously learn and grow. No matter where you are located, you will join a dedicated, creative, and diverse community that will help you discover your fullest potential.

Tech stack

  • English: B2
  • HPC: master
  • Linux: master
  • SLURM: advanced
  • Python: advanced
  • Bash: advanced
  • Terraform: advanced
  • OpenStack: advanced
  • Grafana: advanced
  • Ansible: advanced
  • Prometheus: advanced

Offer valid until 30.06.2026 (11 days left)