We are seeking a highly skilled and visionary Data Architect to lead the design and implementation of our global data model at InPost. This position is ideal for candidates passionate about data ecosystems, medallion architecture, and top-notch frameworks for processing large-scale datasets. The role spans both performance concerns (vast data volumes) and conceptual ones (naming conventions and their consistency).
- Defining and overseeing the implementation of data modeling strategies
- Defining and overseeing the data naming strategy (establishing conventions, harmonizing existing names of schemas, objects, columns, ensuring compliance)
- Training data engineers and data consultants on data modeling approaches (theory, tools, strategies)
- Creating technical specifications for data model development and data architecture documentation
- Ensuring effective collaboration between engineering and analytical teams
- Tracking trends and introducing innovations in data modeling approaches (tools, performance, naming)
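The naming-strategy responsibility above can be pictured with a minimal sketch. The rules below (lowercase snake_case plus a medallion-layer prefix) are hypothetical examples of a convention, not InPost's actual standard:

```python
import re

# Hypothetical convention: lowercase snake_case names with an
# approved prefix per medallion-architecture layer.
NAME_PATTERN = re.compile(r"^[a-z][a-z0-9_]*$")
LAYER_PREFIXES = ("bronze_", "silver_", "gold_")

def check_object_name(name: str) -> list[str]:
    """Return a list of convention violations for a schema/object name."""
    violations = []
    if not NAME_PATTERN.match(name):
        violations.append("not lowercase snake_case")
    if not name.startswith(LAYER_PREFIXES):
        violations.append("missing medallion layer prefix")
    return violations

# Harmonizing existing names means running checks like this at scale
print(check_object_name("gold_daily_parcels"))  # conforms: []
print(check_object_name("SilverParcels"))       # violates both rules
```

In practice such checks would be wired into CI or a dbt-style linting step so that compliance is enforced automatically rather than by review.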
- Minimum 6 years of experience in the field of data engineering / analytics engineering
- 2 years of experience as a Data Architect
- Expert in SQL
- Advanced knowledge of Python, Spark, Databricks, Azure, dbt + willingness to work with in-house frameworks for creating data objects
- Extensive experience in designing and implementing data warehouses (preferably modern big data warehouses or data lakehouses), long-term experience with data modeling in a star schema, and theoretical and practical knowledge of the Kimball approach
- Deep understanding of medallion architecture
- Proven track record of successfully delivering Big Data & Analytics solutions
- Proficiency in modern Data & Analytics technology stacks and architectural patterns
- Knowledge and experience in optimization of: queries in the Big Data stack, ETL processes, data storage solutions and partitioning strategies for high-performance analytics and processing
- Familiarity with the Parquet file format and experience working with Delta Lake, including data versioning, storage optimization, ensuring data integrity, and handling ACID transactions
- Understanding of the end-to-end development lifecycle of analytical data products
- High level of communication and cross-team collaboration skills
- Familiarity with CI/CD and git (preferred GitLab)
- Native Polish, fluent English
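As a minimal illustration of the star-schema modeling and Kimball approach mentioned above, here is a sketch with hypothetical fact and dimension tables, using in-memory SQLite purely for brevity (a real lakehouse would use Spark/Databricks SQL over Delta tables):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Kimball-style star schema: one fact table referencing conformed dimensions
cur.executescript("""
CREATE TABLE dim_locker (locker_key INTEGER PRIMARY KEY, city TEXT);
CREATE TABLE dim_date   (date_key   INTEGER PRIMARY KEY, iso_date TEXT);
CREATE TABLE fact_parcel (
    parcel_id  INTEGER PRIMARY KEY,
    locker_key INTEGER REFERENCES dim_locker(locker_key),
    date_key   INTEGER REFERENCES dim_date(date_key),
    weight_kg  REAL
);
INSERT INTO dim_locker VALUES (1, 'Warsaw'), (2, 'Krakow');
INSERT INTO dim_date   VALUES (20240101, '2024-01-01');
INSERT INTO fact_parcel VALUES (10, 1, 20240101, 2.5), (11, 1, 20240101, 1.0);
""")

# Typical analytical query: aggregate facts by a dimension attribute
cur.execute("""
SELECT d.city, COUNT(*), SUM(f.weight_kg)
FROM fact_parcel f JOIN dim_locker d USING (locker_key)
GROUP BY d.city
""")
print(cur.fetchall())  # [('Warsaw', 2, 3.5)]
```

The same shape (narrow integer surrogate keys in the fact table, descriptive attributes in the dimensions) is what keeps large-scale aggregations cheap.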
We will also appreciate:
- Knowledge of BI tools (Power BI preferred, or Tableau)
- Experience in establishing and enforcing data governance policies using Unity Catalog in Databricks for centralized data access control, lineage, and auditing
- Capability in defining and managing data quality frameworks, metadata management, and data cataloging
- Expertise in implementing role-based access control (RBAC) and data encryption (at rest and in transit) to ensure data security and compliance with regulations like GDPR, CCPA, and HIPAA
- Experience in building global data models