Project information:
- Industry: Insurance
- Rate: up to 185 PLN/h net + VAT, B2B
- Location: Warsaw – hybrid
- Project languages: Polish, English
About the role:
As a Data Engineer, you will design, build, and maintain Data Hubs that integrate multiple data sources for analytical, reporting, operational, and Generative AI use cases.
Responsibilities:
- Data Hub Development – Build scalable and efficient Data Hubs for enterprise-wide use.
- Data Pipelines – Develop and optimize ETL/ELT processes for structured and unstructured data.
- Real-Time Processing – Enable real-time ingestion and model updates for streaming analytics.
- Data Quality & Monitoring – Implement validation, anomaly detection, and performance tracking.
- Anomaly Response – Actively monitor data quality, investigate anomalies, and take corrective actions.
- Automation & CI/CD – Automate workflows and deployments using best DevOps practices.
- Collaboration – Work with cross-functional teams to align solutions with business needs.
- Documentation – Maintain clear technical documentation of data models and pipelines.
Requirements:
- Programming – Strong Python and SQL skills for data engineering.
- Cloud Data Services – Experience with Azure Data Factory, ADLS, and Azure SQL.
- ETL/ELT & Streaming – Hands-on experience with batch and real-time data processing.
- Databricks & Spark – Expertise in Databricks (primary tool) and Apache Spark.
- Automation & DevOps – Knowledge of CI/CD, Terraform, Docker, Kubernetes/AKS.
- Data Governance – Understanding of data security, compliance, and governance best practices.
- Monitoring & Quality Control – Experience in data quality monitoring and anomaly detection.
- Collaboration & Agile – Ability to work in cross-functional Agile teams.
- Documentation – Clear and structured technical writing skills.
- Language Skills – Proficient in English (spoken and written), minimum B2 level.
Technology Stack:
- Data Platform – Databricks, Apache Spark, Delta Lake.
- Cloud & Data Services – Azure Data Factory, ADLS, Azure SQL, Azure DevOps.
- Streaming & Real-Time – Azure Stream Analytics, Azure Event Hubs, Azure Synapse.
- Development Tools – Python, SQL, GitHub, Terraform, Docker, Kubernetes/AKS.
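For context on how the tools above typically fit together in this kind of role, here is a minimal batch ETL sketch in PySpark, assuming a Databricks runtime with Delta Lake available; the ADLS path, column names, and target table are hypothetical and purely illustrative.
```python
# Minimal illustrative sketch, not part of the role description.
# Assumes a Databricks cluster with Delta Lake; the ADLS path, columns,
# and target table below are hypothetical.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

# Ingest raw claims data from ADLS (hypothetical container/path).
raw = spark.read.format("json").load(
    "abfss://raw@examplestorage.dfs.core.windows.net/claims/"
)

# Basic data-quality step: drop records without an identifier, deduplicate,
# and stamp the ingestion time for downstream monitoring.
clean = (
    raw.filter(F.col("claim_id").isNotNull())
       .dropDuplicates(["claim_id"])
       .withColumn("ingested_at", F.current_timestamp())
)

# Persist as a Delta table for analytical and reporting consumers.
clean.write.format("delta").mode("append").saveAsTable("data_hub.claims")
```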