Senior Data Engineer - Machine Learning
Data
SAVENTIC HEALTH sp. z o.o.

Warszawa
Type of work
Full-time
Experience
Senior
Employment Type
B2B
Operating mode
Hybrid

Tech stack

    ETL: advanced
    Python: advanced
    Big Data: regular

Job description

Saventic Health is an international scale-up focused on innovation in medicine. We create AI-based algorithms that support the diagnosis of rare diseases. Our product is built on classical NLP models and a range of additional machine learning models. We are currently developing our own multilingual LLM and addressing database prompting issues.


We are looking for an experienced Data Engineer to develop our analytical capabilities. We plan to implement a new architecture that combines data extraction from medical centers, data quality verification, creation of a feature store, and loading of the data into the database. We also aim to develop a platform whose functionalities will include detecting clinical symptoms in patients whose medical descriptions do not contain specific phrases. In later stages, we plan to develop a large model that will serve as the core for finding rare diseases.


Your responsibilities

  • Planning, construction, and testing of various ETL pipelines for medical data processing.
  • Design and implementation of an architecture integrating databases and interfaces.
  • Development and deployment of an AI platform for the diagnosis of rare diseases on multiple remote servers around the world.
  • Reviewing code developed by other team members, providing feedback and insights.
  • Analysing, processing and modelling data, and interpreting the results.
  • Close cooperation with the medical, data science and business teams.


Our requirements

  • Master's degree in Data Engineering, Computer Science, Math, Statistics or a related field
  • 5+ years of relevant experience (can be mixed industry and academic)
  • Proven track record of developing or maintaining ML pipelines
  • Experience in:
      • ETL (Kedro, Airflow)
      • Big Data (e.g. Spark)
  • Very good coding skills
  • Ability to communicate complex research ideas and solutions in both verbal and written form.
  • Good English language skills