Media Data Engineer (Databricks)

Data

Full-time
Permanent
Mid
Hybrid

Job description

About the Role:

Join our dynamic Central Europe Backend Engineering team as a Media Data Engineer and play a crucial role in shaping our media analytics capabilities! We're seeking a highly skilled and hands-on professional, proficient in Databricks and data engineering, to design, develop, and implement robust data pipelines, specifically processing campaign data from social media platforms like TikTok, YouTube, and Instagram across Central Europe.

This is a high-impact opportunity to contribute to a significant scope: powering media digital products and serving 700 users across the region. Your work will directly enable insightful analytics, drive data-driven decision-making, and help us continuously innovate and optimize our media strategies.

If you thrive in a fast-paced environment, love solving complex data challenges, and are passionate about building scalable, efficient data solutions, we encourage you to apply!

Key Responsibilities:

As a Media Data Engineer, you will:

  • Develop Core Data Solutions: Design and develop high-quality code within Databricks (leveraging PySpark notebooks and SQL) to meet specific business requirements related to media campaign data. This work will make up at least 70% of your responsibilities.

  • Accelerate Development: Utilize existing AI capabilities, such as GitHub Copilot or internal tools like BMAD, to enhance productivity and accelerate development cycles.

  • Data Set Assembly: Assemble and prepare large, complex datasets from social media sources, ensuring they meet critical functional and non-functional business requirements for diverse analytics applications.

  • Architectural Collaboration: Partner with data asset managers, architects, and development leads to ensure all technical data solutions are fit for purpose, align with architectural blueprints, and deliver high-quality, reliable media campaign data.

  • Maintain Standards: Contribute to and actively leverage established coding standards and best practices, ensuring that all services and components are efficient, scalable, and reusable.

  • Cross-Functional Partnership: Collaborate effectively with front-end teams, embracing a "data as a product" mindset to ensure seamless data delivery and integration for media insights.

  • Adhere to Best Practices: Consistently apply sound development practices and adhere to agreed-upon architectural designs throughout the development lifecycle on Google Cloud Platform.

  • Technical Debt Reduction: Proactively identify and define infrastructure revamp initiatives aimed at reducing technical debt and enhancing system longevity.

  • Agile Delivery: As an integral member of a Scrum team, deliver data engineering projects efficiently and in alignment with business priorities and agile methodologies.

  • Operational Support: Provide timely L3 support for existing data processes, thoroughly analyzing bugs and incidents to ensure system stability and performance of media data pipelines.

  • Process Improvement: Identify, design, and implement continuous internal process improvements to streamline and automate backend operations.

  • Continuous Learning: Stay abreast of industry trends, emerging technologies, and best practices in data engineering and media analytics, applying this knowledge to drive innovation, foster improvement, and contribute to team-wide knowledge sharing initiatives.

Job Qualifications

We are looking for a candidate with:

  • PySpark Expertise: Strong proficiency in PySpark for efficient data processing, transformation, and analysis.

  • Databricks Proficiency: Proven hands-on experience with Databricks, including cluster management, notebook development, and job scheduling.

  • SQL Mastery: Advanced proficiency in SQL for complex data manipulation, querying, and performance tuning.

  • Google Cloud Platform (GCP) Expertise: Demonstrable hands-on experience and strong understanding of GCP services relevant to data engineering (e.g., BigQuery, Cloud Storage, Dataflow, Pub/Sub).

  • Data Pipeline Experience: Solid experience in designing, implementing, and optimizing robust data pipelines and ETL/ELT processes using PySpark, Databricks, and GCP data services.

  • Data Modeling Knowledge: Familiarity with data modeling, data warehousing concepts, and dimensional modeling techniques.

  • Data Architecture Understanding: A clear understanding of data integration patterns, data lake architectures, and best practices for ensuring data quality.

  • Social Media Data Experience: Experience working with or integrating data from social media platforms (e.g., TikTok, YouTube, Instagram) is highly desirable.

  • Workflow Orchestration (Plus): Experience with Databricks Workflow management and the orchestration of data pipelines (e.g., using Airflow on GCP) is a significant advantage.

We offer

  • P&G-sized projects and access to world-leading IT partners and technologies from Day 1.

  • Wide range of self-development possibilities (training and certifications paths).

  • Competitive starting salary and benefits program (private health care, P&G stock, saving plans, sport cards).

  • Regular salary increases and possible promotions - in line with your results and performance.

  • Opportunity to change role every few years to be in the best place for you and best for P&G.

At Procter & Gamble we embrace a hybrid work model that combines the flexibility of remote work with the collaborative benefits of in-office engagement. Employees can enjoy the option to work from home two days a week while also spending time in the office to foster teamwork and enhance communication.

Watch this video to learn more about our full recruiting process: https://www.youtube.com/watch?v=0bicvbpy0gI

Kindly be advised that at P&G, employment is exclusively extended on the basis of an "Umowa o Pracę" (Full-time Employment Contract). Apply only if you agree to these conditions.

About us

We produce globally recognized brands and we grow the best business leaders in the industry. With a portfolio of trusted brands as diverse as ours, it is paramount that our leaders can lead with courage across the vast array of brands, categories and functions. We serve consumers around the world with one of the strongest portfolios of trusted, quality, leadership brands, including Always®, Ariel®, Gillette®, Head & Shoulders®, Herbal Essences®, Oral-B®, Pampers®, Pantene®, Tampax® and more. Our community includes operations in approximately 70 countries worldwide.

Visit http://www.pg.com to learn more.

We are an equal opportunity employer and value diversity at our company. We do not discriminate against individuals on the basis of race, color, gender, age, national origin, religion, sexual orientation, gender identity or expression, marital status, citizenship, disability, HIV/AIDS status, or any other legally protected factor.

Tech stack

  • English: C1

  • Databricks: master

  • Google Cloud Platform: advanced

  • PySpark: advanced

  • SQL: advanced

Office location

Zabraniecka 20, Warsaw

About the company

Procter & Gamble

Procter & Gamble is a global corporation where business and technology converge to make impactful advances. Their Central Europe Technology Hub in Poland, one of the largest IT hubs globally for P&G, employs over 1,100 I...

By applying, I consent to the processing of my personal data for the purpose of conducting the recruitment process. Please note that the data controller is Procter & Gamble, with its registered office in Warsaw, ul. Zabraniecka 20 (hereinafter the "controller"). ...