Data Engineer with GCP
Languages: English (fluent); German is a plus
Start date for assignment: ASAP
Duration: 6 months, with possible extensions
Workload: Full-time
Location: 100% remote
Summary
The role focuses on enhancing data-driven personalization for strategic digital products. The primary objective is to design and implement scalable data architectures on Google Cloud that improve user engagement, retention, and monetization through relevant content delivery.
Main Responsibilities
Design and implement scalable data pipelines using Google Cloud technologies (e.g., BigQuery, Dataflow, Pub/Sub).
Develop and optimize both batch and streaming data pipelines for personalization use cases.
Transform and provision user and content-related data for machine learning applications.
Implement data quality checks and establish logging and monitoring mechanisms.
Optimize existing data processing workflows for improved performance.
Integrate machine learning models into production workflows.
Document data structures, transformation logic, and relevant interfaces.
Provide technical consulting on event structures, tracking, and data architecture within the personalization framework.
Key Requirements
Experience with Google Cloud Platform (GCP).
Proficient in Python and FastAPI.
Strong skills in dbt, SQL, and BigQuery.
Familiarity with data pipeline frameworks like Airflow.
Understanding of Kubernetes and Docker for containerization.
Nice to Have
Experience with Vertex AI for machine learning initiatives.
Knowledge of Terraform for infrastructure as code.
Experience with CI/CD practices, particularly in GitLab.
Familiarity with project tracking tools like JIRA and Confluence.
Other Details
Team: Personalization Team in a digital product context.
Environment: Data-driven and cloud-centric.
Collaboration Tools: Microsoft Teams; time recording via Projektron.