Project information:
- Industry: insurance and IT services
- Rate: up to 160 PLN/h net + VAT, B2B
- Location: Warsaw – hybrid work model for the first 2-3 months, then fully remote
- Project language: Polish, English
As a Data Engineer, you will design, build, and maintain Data Hubs that integrate data from multiple sources to support analytical, reporting, operational, and Generative AI use cases. The role is key to creating a scalable and efficient data infrastructure that enables real-time data availability and model updates. You will collaborate closely with data architects, AI engineers, and business teams, working with tools such as Databricks, Azure Data Factory, and Azure SQL.
Responsibilities:
- Data Hub Development – Design and implement scalable Data Hubs to support enterprise-wide data needs.
- Data Pipeline Engineering – Build and optimize ETL/ELT pipelines for efficient data ingestion, transformation, and storage.
- Logical Data Modeling – Structure Data Hubs to ensure efficient access patterns and support diverse use cases.
- Real-Time Analytics – Enable real-time data ingestion and model updates to support streaming and real-time analytics.
- Data Quality & Monitoring – Develop data validation, anomaly detection, and monitoring features to ensure high data reliability.
- Performance Optimization – Optimize data processing and storage for large-scale datasets.
- Automation & CI/CD – Implement CI/CD pipelines to automate data workflows and deployments.
- Collaboration – Work with data architects, AI engineers, and business teams to align data solutions.
- Monitoring & Maintenance – Continuously improve data infrastructure for scalability and reliability.
- Agile Practices – Work within Scrum/Agile methodologies to deliver high-quality data solutions.
- Documentation – Create and maintain clear, structured documentation for data models, pipelines, and technical decisions.
Requirements:
- Strong Python skills for data engineering.
- Experience with Azure Data Factory, ADLS, and Azure SQL.
- Hands-on experience in ETL/ELT development.
- Experience with real-time data processing.
- Understanding of AI/ML data processing.
- Proficiency in SQL.
- Knowledge of CI/CD and infrastructure-as-code (Terraform).
- Understanding of data governance and compliance.
- Experience with Databricks and Apache Spark.
- Familiarity with containerization (Docker, Kubernetes).
- Ability to produce high-quality technical documentation.
- Minimum B2 level in English.
Nice to have:
- Experience gained at large consulting firms.
- A degree from a prestigious university.