Saventic Health is a mission-driven health-tech startup based in Warsaw, dedicated to transforming the diagnosis of rare diseases using cutting-edge Artificial Intelligence. We are developing innovative solutions that have the potential to significantly shorten diagnostic timelines and improve patient outcomes. As a startup, we thrive on innovation, agility, and a collaborative spirit where every team member makes a tangible impact. We work with complex medical data, requiring robust and reliable data infrastructure.
The Opportunity:
We are looking for a skilled and enthusiastic Data Engineer to join our team and play a vital role in building and maintaining the data infrastructure that powers our AI platform. This is a fantastic opportunity to develop your expertise in data engineering within a meaningful healthcare context, working with complex data and modern tools.
As a Data Engineer, you will collaborate with senior engineers, data scientists, and MLOps engineers to implement, manage, and optimize our data pipelines and storage solutions.
You'll gain hands-on experience across the data lifecycle, ensuring our systems are efficient, reliable, and capable of handling sensitive medical data securely, primarily within our on-premises environment but also interacting with cloud services.
Your Impact:
As an important member of our technical team, you will:
- Contribute to AI Enablement: Build and maintain the essential data pipelines that provide clean, reliable data for our AI models.
- Help Ensure Data Reliability: Implement processes and monitoring to support data quality and system uptime.
- Support Our Data Foundation: Contribute to the development and maintenance of our data storage solutions and overall data infrastructure.
- Assist in Optimization: Help identify and implement improvements to make our data processes more efficient and scalable.
Key Responsibilities:
- Build and Maintain Data Pipelines: Implement, automate, schedule, and maintain scalable ETL/ELT pipelines using tools like Apache Airflow.
- Implement Data Solutions: Work with senior team members to implement data models and database schemas.
- Manage Data Storage: Assist in managing and optimizing our data storage solutions, including our current PostgreSQL databases.
- Safeguard Data Quality: Implement and monitor data quality rules and checks within pipelines.
- Performance Tuning: Assist in monitoring and tuning the performance of data pipelines and SQL queries under guidance.
- Utilize Tools: Work effectively with our data engineering stack, including Python, Airflow, SQL databases, and potentially cloud services.
- Collaboration: Work closely within the team, collaborating with Data Scientists, MLOps Engineers, and Senior Data Engineers.
- Follow Security Practices: Ensure data security best practices are followed in implemented solutions.
- Documentation: Document data pipelines, processes, and configurations clearly.
Who You Are:
- Experienced Data Engineer: Proven practical experience working as a Data Engineer, demonstrating the ability to build and maintain data systems.
- Python Proficient: Proficiency in Python is required for scripting and pipeline development.
- SQL & Database Skills: Solid understanding of SQL and practical experience working with relational databases (including PostgreSQL).
- Pipeline Orchestration Experience: Hands-on experience building and maintaining data pipelines, with practical experience using Apache Airflow required.
- Data Concepts Aware: Understanding of core data engineering concepts, ETL/ELT processes, and data modeling principles.
- Cloud Familiarity: Basic familiarity with cloud platforms (AWS, GCP, Azure) and their data services is beneficial.
- Problem Solver: Good analytical and problem-solving skills, with an eagerness to tackle technical challenges.
- Team Player & Learner: Collaborative, communicative, and keen to learn and apply new data engineering techniques.
- Educated: Bachelor's degree in Computer Science, Engineering, or a related technical field, or equivalent practical experience.
Nice to Haves:
- Experience working in healthcare or another regulated industry.
- Experience with other data processing tools (e.g., Spark, Pandas for larger datasets).
- Experience with NoSQL databases.
- Familiarity with data warehousing concepts.
- Understanding of data governance principles.
- Experience with containerization (Docker).
- Familiarity with Infrastructure-as-Code (IaC) tools (e.g., Terraform, Ansible).
What We Offer:
- An opportunity to make a real-world impact by contributing to solutions that help diagnose rare diseases.
- A chance to work on challenging data engineering problems with complex medical data using modern tools like Airflow.
- Significant opportunities for learning and professional growth within a supportive team.
- Exposure to the application of AI in healthcare.
- A dynamic, innovative, and collaborative startup culture.
- Competitive salary and benefits package.
Ready to build the data backbone for healthcare AI?
If you are a Data Engineer passionate about building reliable data systems and eager to apply your skills (including Python and Airflow) in a meaningful domain, we encourage you to apply!