Saventic Health is a mission-driven health-tech startup based in Warsaw, dedicated to transforming the diagnosis of rare diseases using cutting-edge Artificial Intelligence. We are developing innovative solutions that have the potential to significantly shorten diagnostic timelines and improve patient outcomes. As a startup, we thrive on innovation, agility, and a collaborative spirit where every team member makes a tangible impact. We work with complex medical data, requiring robust and reliable data infrastructure.
The Opportunity:
We are looking for a skilled and enthusiastic Data Engineer to join our team and play a vital role in building and maintaining the data infrastructure that powers our AI platform. This is a fantastic opportunity to develop your expertise in data engineering within a meaningful healthcare context, working with complex data and modern tools.
As a Data Engineer, you will collaborate with senior engineers, data scientists, and MLOps engineers to implement, manage, and optimize our data pipelines and storage solutions.
You'll gain hands-on experience across the data lifecycle, ensuring our systems are efficient, reliable, and capable of handling sensitive medical data securely, primarily within our on-premises environment while also interacting with cloud services.
Your Impact:
As an important member of our technical team, you will:
- Contribute to AI Enablement: Build and maintain the essential data pipelines that provide clean, reliable data for our AI models.
- Help Ensure Data Reliability: Implement processes and monitoring to support data quality and system uptime.
- Support Our Data Foundation: Contribute to the development and maintenance of our data storage solutions and overall data infrastructure.
- Assist in Optimization: Help identify and implement improvements to make our data processes more efficient and scalable.
Key Responsibilities:
- Build and Maintain Data Pipelines: Implement, automate, schedule, and maintain scalable ETL/ELT pipelines using tools like Apache Airflow.
- Implement Data Solutions: Work with senior team members to implement data models and database schemas.
- Manage Data Storage: Assist in managing and optimizing our data storage solutions, including our current PostgreSQL databases.
- Data Quality Checks: Implement and monitor data quality rules and checks within pipelines.
- Performance Tuning: Assist in monitoring and tuning the performance of data pipelines and SQL queries under guidance.
- Utilize Tools: Work effectively with our data engineering stack, including Python, Airflow, SQL databases, and potentially cloud services.
- Collaboration: Work closely within the team, collaborating with Data Scientists, MLOps Engineers, and Senior Data Engineers.
- Follow Security Practices: Ensure data security best practices are followed in implemented solutions.
- Documentation: Document data pipelines, processes, and configurations clearly.
Who You Are:
- Experienced Data Engineer: Proven practical experience as a Data Engineer, with a demonstrated ability to build and maintain data systems.
- Python Proficient: Strong Python skills for scripting and pipeline development are required.
- SQL & Database Skills: Solid understanding of SQL and practical experience working with relational databases (including PostgreSQL).
- Pipeline Orchestration Experience: Hands-on experience building and maintaining data pipelines, with practical experience using Apache Airflow required.
- Data Concepts Aware: Understanding of core data engineering concepts, ETL/ELT processes, and data modeling principles.
- Cloud Familiarity: Basic familiarity with cloud platforms (AWS, GCP, Azure) and their data services is beneficial.
- Problem Solver: Good analytical and problem-solving skills, with an eagerness to tackle technical challenges.
- Team Player & Learner: Collaborative, communicative, and keen to learn and apply new data engineering techniques.
- Educated: Bachelor's degree in Computer Science, Engineering, or a related technical field, or equivalent practical experience.
Nice to Haves:
- Experience working in healthcare or another regulated industry.
- Experience with other data processing tools (e.g., Spark, Pandas for larger datasets).
- Experience with NoSQL databases.
- Familiarity with data warehousing concepts.
- Understanding of data governance principles.
- Experience with containerization (Docker).
- Familiarity with Infrastructure-as-Code (IaC) tools (e.g., Terraform, Ansible).
What We Offer:
- An opportunity to make a real-world impact by contributing to solutions that help diagnose rare diseases.
- A chance to work on challenging data engineering problems with complex medical data using modern tools like Airflow.
- Significant opportunities for learning and professional growth within a supportive team.
- Exposure to the application of AI in healthcare.
- A dynamic, innovative, and collaborative startup culture.
- Competitive salary and benefits package.
Ready to build the data backbone for healthcare AI?
If you are a Data Engineer passionate about building reliable data systems and eager to apply your skills (including Python and Airflow) in a meaningful domain, we encourage you to apply!