Network Reliability Engineer - Senior

54 - 68 USD net per hour - B2B
Architecture

Prosta 20, Warsaw

Margo

B2B Contract
Senior
Remote

Job description

Growth is driving the Client to strengthen its SRE team to support and scale its production environments.

Your mission will be to build and maintain reliable, observable, and secure infrastructure in order to ensure optimal service availability for our customers around the world.

#HPC #AI #GPU #CLUSTERS


!!! Client is in CALIFORNIA, USA !!!
Working hours: start at approximately 18:00 CEST
Long-term project of at least one year

YOUR DAILY ROUTINE:

- Build a large AI infrastructure with monitoring, diagnosis, and remediation of production incidents

- Troubleshoot high-impact production issues in collaboration with other engineering teams

- Participate in an on-call rotation to handle incidents and ensure service continuity

- Implement and maintain observability solutions to monitor AI infrastructure and application health

- Contribute to AI infrastructure lifecycle management across different environments and countries

- Promote and apply best practices in terms of stability, resiliency, scalability, and security

- Maintain clear technical documentation for tools and procedures

- Contribute to system and tool evolution based on production feedback

- Collaborate closely with development teams to ensure infrastructure readiness

- Participate in team rituals and knowledge-sharing initiatives

SOFT SKILLS:

- Proactive and solution-oriented mindset

- Passion for automation and continuous improvement

- Strong collaboration and communication skills

- Ability to work independently and in a team

- Willingness to mentor and share knowledge

HARD SKILLS:

- Experience with Go or Python

- Strong scripting skills (Bash, Python)

- Hands-on experience with Linux systems (Ubuntu/Debian)

- Hands-on experience with GPU and HPC infrastructure (preferred)

- Knowledge of networking (LAN/VLAN, TCP/IP, DNS, BGP, load balancing, IPv6, etc.)

- Experience with network boot systems such as PXE

- Familiarity with monitoring and logging tools (Prometheus, Grafana, Elastic, etc.)

- Comfortable with Infrastructure-as-Code (Ansible, Salt, AWX, etc.)

- Experience managing relational databases (MariaDB)

- Understanding of CI/CD pipelines (GitLab)

- Comfortable with English (written and spoken)

Tech stack

    English: C1
    Ubuntu: advanced
    Linux (Debian): advanced
    MariaDB: advanced
    Networking: advanced
    Prometheus: regular
    Grafana: regular
    Ansible: regular
    GitLab: regular
    Python: regular
    GPU: regular

Office location

Prosta 20, Warsaw

Application deadline: 09.07.2026