Please note: this role is currently remote but will later become hybrid, so only candidates from Warsaw and the surrounding area will be considered.
About:
We are seeking a highly motivated, self-driven data engineer for our growing data team, someone who can work and deliver both independently and as part of a team. In this role, you will play a crucial part in designing, building, and maintaining our ETL infrastructure and data pipelines.
Major Responsibilities:
● Design, develop, and deploy Python scripts and ETL processes with Prefect and Airflow to prepare data for analysis.
● Model dimensional and denormalized schemas for optimal performance reporting and discovery.
● Design AI-friendly DB schemas and ontologies.
● Architect cloud ops solutions for data topologies.
● Transform and migrate data with Python, DBT, and Pandas.
● Work with event-based/streaming technologies for real-time ETL.
● Ingest and transform structured, semi-structured, and unstructured data.
● Optimize ETL jobs for performance and scalability to handle big data workloads.
● Monitor and troubleshoot ETL jobs to identify and resolve issues or bottlenecks.
● Implement best practices for data management, security, and governance with Prefect, DBT, and Pandas.
● Write SQL queries, program stored procedures, and reverse engineer existing data pipelines.
● Perform code reviews to ensure fit to requirements, optimal execution patterns, and adherence to established standards.
● Assist with automated release management and CI/CD processes.
● Validate and cleanse data and handle error conditions gracefully.
Skills:
● 3+ years of Python development experience, including Pandas.
● 5+ years writing complex SQL queries with RDBMSes.
● 5+ years of experience developing and deploying ETL pipelines using Airflow, Prefect, or similar tools.
● Experience with cloud-based data warehouses in environments such as RDS, Redshift, or Snowflake.
● Experience with data warehouse design: OLTP, OLAP, Dimensions, and Facts.
● Experience with cloud-based data architectures, messaging, and analytics.
Pluses: Experience with
● Docker
● Kubernetes
● CI/CD automation
● AWS Lambda/Step Functions
● Data partitioning
● Databricks
● PySpark
● Cloud certifications
Net per month - B2B
Gross per month - Permanent