
    Founding ML Engineer (CUDA, ROCm, C++)

    DevsData LLC
    Location: Warszawa
    Type of work: Full-time
    Experience: Senior
    Employment type: B2B
    Operating mode: Hybrid

    DevsData is a premium recruitment and software development agency specializing in custom software, artificial intelligence, and Big Data solutions. We work 100% remotely, so we can change the world from anywhere on Earth.

    Tech stack

    • C++: advanced
    • CUDA: advanced
    • C: advanced
    • Python: advanced
    • GPU: advanced

    Job description

    Online interview

    Founding ML Engineer


    • 💰 Salary: $100,000 USD + equity (highly negotiable for the right candidate)
    • 🌎 Hybrid role: 2-3 days a week at an office in Warsaw/Gdansk
    • 🕦 Full-time position
    • 📝 B2B contract or contract of employment, negotiable
    • ☑️ Home office budget and relocation/travel costs included


    A rapidly scaling startup, recently emerged from stealth mode and backed by a top-tier venture capital fund, is on a mission to democratize AI across any hardware platform.


    Our client's R&D team is building a highly efficient engine for deploying genAI models. This entails a wide array of tasks, ranging from fine-tuning GPU kernels to optimizing system performance. The Founding ML Engineer will play a pivotal role in driving significant enhancements in GPU performance while spearheading innovative AI and machine learning initiatives.


    To tackle this mission, they are seeking an expert-level engineer in kernel, compiler, or runtime optimization, with a robust background in CUDA, ROCm, or Triton kernel optimization.


    This role presents an exceptional opportunity to shape the technical direction of the company and contribute to groundbreaking advancements in AI technology.


    Requirements:


    • Deep understanding and experience in GPU performance optimizations.
    • Proven track record of kernel optimizations on CUDA, ROCm, or other accelerators.
    • Proficiency in programming languages such as C/C++ and Python.
    • Experience with the training and deployment of ML models.
    • Familiarity with distributed systems development or distributed ML workloads.
    • Bachelor's, Master’s, or PhD degree in Computer Science, Electrical Engineering, or a related field.
    • Strong command of English, with good communication and collaboration skills.


    An exceptional candidate will also have:


    • Familiarity with OSS projects like FlashAttention, mlc-llm, and vLLM.
    • Experience with machine learning compilers or frameworks such as TVM, MLIR, PyTorch, TensorFlow, ONNX Runtime, or TensorRT.


    You would be:


    • Analyzing the bottlenecks in ML training and inference
    • Developing and optimizing computing kernels in CUDA, Triton or ROCm
    • Working on GPU optimizations to maximize performance


    Get to know DevsData


    We are a technology consulting company and a recruitment agency, delivering software solutions to clients from Europe and the US. We work 100% remotely in an international team, with people from Asia, London, and San Francisco. We employ people with experience in international corporations as well as students from the best technical and business universities.


    Find out more: https://devsdata.com