Founding ML Engineer (CUDA, ROCm, C++)
DevsData LLC

DevsData is a premium recruitment and software development agency specialized in developing unique software, artificial intelligence, and Big Data solutions. We're working 100% remotely so that we can change the world from every place on Earth.

Job description

Online interview

Founding ML Engineer

  • 💰 Salary: $100,000 USD + equity (highly negotiable for the right candidate)
  • 🌎 Hybrid role: 2-3 days at an office in Warsaw/Gdansk
  • 🕦 Full-time position
  • 📝 B2B contract or Contract of Employment, negotiable
  • ☑️ Home office budget & relocation/travel costs included

A rapidly scaling startup, recently out of stealth mode and backed by a top-tier venture capital fund, is on a mission to democratize AI across any hardware platform.

Our client's R&D team is building a highly efficient engine for deploying genAI models. The work spans a wide range of tasks, from tuning GPU kernels to optimizing system performance. The Founding ML Engineer will play a pivotal role in improving GPU performance while leading AI and machine learning initiatives.

To tackle this mission, they are seeking an expert-level engineer in kernel, compiler, or runtime optimization, with a strong background in CUDA, ROCm, or Triton kernel optimization.

This role presents an exceptional opportunity to shape the technical direction of the company and contribute to groundbreaking advancements in AI technology.


The ideal candidate will have:

  • Deep understanding of, and experience in, GPU performance optimization.
  • Proven track record of kernel optimizations on CUDA, ROCm, or other accelerators.
  • Proficiency in programming languages such as C/C++ and Python.
  • Experience with the training and deployment of ML models.
  • Familiarity with distributed systems development or distributed ML workloads.
  • Bachelor's, Master's, or PhD degree in Computer Science, Electrical Engineering, or a related field.
  • Strong command of English, with strong communication and collaboration skills.

An exceptional candidate will also have:

  • Familiarity with OSS projects like FlashAttention, mlc-llm, and vLLM.
  • Experience with machine learning compilers or frameworks such as TVM, MLIR, PyTorch, TensorFlow, ONNX Runtime, and TensorRT.

You would be:

  • Analyzing bottlenecks in ML training and inference.
  • Developing and optimizing compute kernels in CUDA, Triton, or ROCm.
  • Working on GPU optimizations to maximize performance.

Get to know DevsData

We are a technology consulting company and a recruitment agency, delivering software solutions to clients in Europe and the US. We work 100% remotely in an international team, with people based in Asia, London, and San Francisco. We employ people with experience at international corporations as well as students from the best technical and business universities.
