Saventic Health is a mission-driven health-tech startup based in Warsaw, dedicated to transforming the diagnosis of rare diseases using cutting-edge Artificial Intelligence. We are developing innovative solutions that have the potential to significantly shorten diagnostic timelines and improve patient outcomes. As a startup, we thrive on innovation, agility, and a collaborative spirit where every team member makes a tangible impact. We work with complex medical data, requiring robust and reliable data infrastructure.
The Opportunity:
We are looking for a skilled and enthusiastic Data Engineer to join our team and play a vital role in building and maintaining the data infrastructure that powers our AI platform. This is a fantastic opportunity to develop your expertise in data engineering within a meaningful healthcare context, working with complex data and modern tools.
As a Data Engineer, you will collaborate with senior engineers, data scientists, and MLOps engineers to implement, manage, and optimize our data pipelines and storage solutions.
You'll gain hands-on experience across the data lifecycle, ensuring our systems are efficient, reliable, and capable of handling sensitive medical data securely, primarily within our on-premises environment while also interacting with cloud services.
Your Impact:
As an important member of our technical team, you will:
- Contribute to AI Enablement: Build and maintain the essential data pipelines that provide clean, reliable data for our AI models.
- Help Ensure Data Reliability: Implement processes and monitoring to support data quality and system uptime.
- Support Our Data Foundation: Contribute to the development and maintenance of our data storage solutions and overall data infrastructure.
- Assist in Optimization: Help identify and implement improvements to make our data processes more efficient and scalable.
Key Responsibilities:
- Build and Maintain Data Pipelines: Implement, automate, schedule, and maintain scalable ETL/ELT pipelines using tools like Apache Airflow.
- Implement Data Solutions: Work with senior team members to implement data models and database schemas.
- Manage Data Storage: Assist in managing and optimizing our data storage solutions, including our current PostgreSQL databases.
- Data Quality Checks: Implement and monitor data quality rules and checks within pipelines.
- Performance Tuning: Assist in monitoring and tuning the performance of data pipelines and SQL queries under guidance.
- Utilize Tools: Work effectively with our data engineering stack, including Python, Airflow, SQL databases, and potentially cloud services.
- Collaboration: Work closely within the team, collaborating with Data Scientists, MLOps Engineers, and Senior Data Engineers.
- Follow Security Practices: Ensure data security best practices are followed in implemented solutions.
- Documentation: Document data pipelines, processes, and configurations clearly.
Who You Are:
- Experienced Data Engineer: Proven practical experience as a Data Engineer, with a demonstrated ability to build and maintain data systems.
- Python Proficient: Strong Python skills for scripting and pipeline development are required.
- SQL & Database Skills: Solid understanding of SQL and practical experience working with relational databases (including PostgreSQL).
- Pipeline Orchestration Experience: Hands-on experience building and maintaining data pipelines, with practical experience using Apache Airflow required.
- Data Concepts Aware: Understanding of core data engineering concepts, ETL/ELT processes, and data modeling principles.
- Cloud Familiarity: Basic familiarity with cloud platforms (AWS, GCP, Azure) and their data services is beneficial.
- Problem Solver: Good analytical and problem-solving skills, with an eagerness to tackle technical challenges.
- Team Player & Learner: Collaborative, communicative, and keen to learn and apply new data engineering techniques.
- Educated: Bachelor's degree in Computer Science, Engineering, or a related technical field, or equivalent practical experience.
Nice to Haves:
- Experience working in healthcare or another regulated industry.
- Experience with other data processing tools (e.g., Spark, Pandas for larger datasets).
- Experience with NoSQL databases.
- Familiarity with data warehousing concepts.
- Understanding of data governance principles.
- Experience with containerization (Docker).
- Familiarity with Infrastructure-as-Code (IaC) tools (e.g., Terraform, Ansible).
What We Offer:
- An opportunity to make a real-world impact by contributing to solutions that help diagnose rare diseases.
- A chance to work on challenging data engineering problems with complex medical data using modern tools like Airflow.
- Significant opportunities for learning and professional growth within a supportive team.
- Exposure to the application of AI in healthcare.
- A dynamic, innovative, and collaborative startup culture.
- Competitive salary and benefits package.
Ready to build the data backbone for healthcare AI?
If you are a Data Engineer passionate about building reliable data systems and eager to apply your skills (including Python and Airflow) in a meaningful domain, we encourage you to apply!