Designs, develops, and maintains scalable data pipelines for structured and unstructured data
Collaborates with analytics, business and production teams to improve data models (e.g. product parts, product variants/options, sensor data collection) that feed business intelligence tools, increasing data accessibility and fostering data-driven decision making across the organisation
Implements processes and systems to monitor data quality, ensuring production data is always accurate and available for key stakeholders and business processes that depend on it
Writes unit/integration tests, contributes to engineering wiki, and documents work
Performs the data analysis required to troubleshoot data-related issues and assists in their resolution
Works closely with a team of frontend and backend engineers, product managers, and analysts
Helps in defining and designing company data assets, data integrations and data quality framework
Designs and evaluates open source and vendor tools for data lineage and compliance requirements such as GDPR
Qualifications / Skills:
Sound knowledge of and experience with best practices in software architecture, development, and operations for always-up, always-available services, and with the 4 V's of big data (knowledge of Azure cloud data pipeline components such as Data Factory, and of technical SAP, is a significant plus)
Sound knowledge of software/warehouse/cloud development and programming (e.g. Apache Spark, Apache Hive, Apache Kafka, and big data technologies in general)
Sound knowledge of data lineage and access control frameworks (e.g. Apache Atlas, Apache Ranger)
Experience with or knowledge of Agile Software Development methodologies
Experience with self-service reporting tools (e.g. Power BI)
Excellent problem-solving and troubleshooting skills
Process-oriented, with good documentation skills
Excellent oral and written communication skills with a keen sense of customer service