Senior Data Engineer, Clinical Data Platform
Project overview
You will work on a platform that processes clinical and real-world data (EHRs, labs, registries, trial data) and powers analytics, reporting, and data products for a healthcare / clinical research client.
Position overview
We are looking for a Senior Data Engineer to build and operate a clinical data platform on Databricks, with a strong focus on robust data pipelines, data models, and data quality.
Technology stack
The platform is built on Databricks (Spark, Delta Lake) and includes reusable pipelines, a shared data model, and automated data quality checks.
Responsibilities
Design, build, and maintain end-to-end Databricks data pipelines (ingestion, transformation, publishing) for production use
Work with data models (staging, curated, canonical, or dimensional) and help evolve them together with architects and analysts
Embed data quality and data governance rules into all pipelines (checks, validation, monitoring, alerting)
Optimize Databricks jobs for performance and cost (cluster configuration, partitioning, caching, file layout)
Collaborate with data architects, analysts, and domain experts to clarify requirements and refine technical solutions
Requirements
5+ years of experience in data engineering, DWH, or big data, including production data pipelines
Strong hands-on experience with Databricks: Spark (PySpark/Scala), Delta Lake, Databricks Jobs / Workflows
Proven experience designing and operating end-to-end pipelines on Databricks for batch or near-real-time data
Experience with CI/CD for data pipelines
Practical experience with data modeling (layered models, canonical or dimensional models) for analytics and reporting
Experience embedding data quality and data governance rules into pipelines (schema checks, business rules, SLOs, monitoring)
Good communication skills, upper-intermediate or higher English proficiency, and the ability to work closely with stakeholders in distributed teams and communicate directly with clients
Nice to have
Experience designing and delivering PoC solutions on Databricks to quickly validate ideas using real data
Experience with ontologies or a semantic layer (business concepts, metrics, mappings) on top of analytical data