Senior DL Performance Infrastructure and MLOps Engineer
NVIDIA

Warszawa
Type of work: Full-time
Experience: Senior
Employment Type: Permanent
Operating mode: Remote

Tech stack

    PyTorch: advanced
    JAX: advanced
    GPU: advanced
    CI/CD: advanced
    GitLab: advanced
    Docker: advanced
    C++: advanced
    Python: advanced
    CUDA: advanced
    DL: advanced

Job description

We are now looking for a Senior DL Performance Infrastructure & MLOps Engineer.

NVIDIA is seeking engineers who love building world-class infrastructure, from automated command-line scripting to full-blown CI/CD systems running on some of the world's largest clusters, to support our work to accelerate training of deep neural networks like Stable Diffusion or ChatGPT via hardware and software innovations. If you have that itch whenever the mechanical aspects of code development, performance analysis, and data processing consume any more human time than necessary, we'd like to hear from you. If you are passionate about accelerating all existing workflows in a diverse team while also envisioning next-gen opportunities to enable new forms of hardware/software analysis and development we haven't even thought of, this is the place for you.


What you'll be doing:

  • Improve all tooling and automation in use in the team, from simple data collection scripts to datacenter-scale ML CI/CD systems.
  • Understand and internalize workflows for GPU performance analysis and optimization so you can help us re-invent them.
  • Build Python-based machinery hooking into common Deep Learning software like PyTorch or JAX to support performance analysis work.
  • Ruthlessly discover and chase down workflow- and tool-related inefficiencies in the team's daily work, and dream up and implement ways to eliminate them.


What we need to see:

  • MS degree in CS or adjacent fields or equivalent experience
  • 3+ years of relevant work experience
  • Background in deep learning fundamentals and common deep learning software, especially PyTorch/JAX
  • Experience in GPU computing, i.e. fundamental understanding of heterogeneous multi-node accelerated computing systems
  • Background in analyzing and optimizing application performance
  • Familiarity with containerized CI/CD flows, e.g. GitLab + Docker
  • Programming skills in C++, Python, and CUDA
  • Deep passion for tools, scripts, and automation


NVIDIA is widely considered to be one of the technology world’s most desirable employers. We have some of the most forward-thinking and hardworking people in the world working for us. Are you creative and autonomous? Do you love a challenge? If so, we want to hear from you! Come, join our DL Architecture team and help build the real-time, cost-effective AI computing platform driving our success in this exciting and quickly growing field.