Data Engineer (Scala, Spark & Azure)

Location: Trójmiasto
Type of work: Full-time
Experience: Senior
Employment type: B2B
Operating mode: Remote

Tech stack

    Spark: advanced
    JSON: advanced
    SQL: advanced
    Scala: advanced
    Microsoft Azure: regular
    Python: regular

Job description

Online interview

We are seeking a highly skilled Senior Data Engineer with deep expertise in Scala, Spark, and Microsoft Azure to join our dynamic team. This role offers an exciting opportunity to lead data engineering initiatives, optimize complex pipelines, and collaborate with cross-functional teams to deliver high-quality data solutions.


Technical Requirements

7+ years of professional experience in Data Engineering.

6-7 years of hands-on experience with Scala, Spark, and Azure in Data Engineering projects.

Proficient in programming languages such as Python and Scala.

Expertise in data pipeline tools and processes.

Strong knowledge of SQL and both relational and non-relational databases.

Familiarity with GitHub for version control and CI/CD workflows.

Solid understanding of Microsoft Azure data services.

Experience with JSON-based configurations for managing multiple data zones.

Knowledge of containerization and orchestration tools (e.g., Docker, Kubernetes) is a plus.

Strong problem-solving skills, fluent English, and excellent communication abilities.


Main Responsibilities


Data Pipeline Maintenance: Continuously monitor and maintain data pipelines for ingesting and transforming data using Scala and SQL on Spark. Diagnose and resolve errors and performance bottlenecks, addressing data discrepancies, ambiguities, and inconsistencies as needed.
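
For illustration only, a minimal sketch of the kind of Scala-on-Spark ingest-and-cleanse job this responsibility describes; the storage paths and column names (event_id, event_ts) are placeholders, and on Azure Synapse the SparkSession is normally supplied by the runtime rather than built by hand:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

object RawToClean {
  def main(args: Array[String]): Unit = {
    // On Azure Synapse the session is usually provided by the Spark runtime;
    // building one here keeps the sketch self-contained.
    val spark = SparkSession.builder().appName("raw-to-clean").getOrCreate()

    // Ingest raw JSON events from a placeholder landing path.
    val raw = spark.read.json("abfss://raw@yourstorage.dfs.core.windows.net/events/")

    // Basic cleansing: drop duplicates, discard rows without a key, normalise the timestamp.
    val cleaned = raw
      .dropDuplicates("event_id")
      .filter(col("event_id").isNotNull)
      .withColumn("event_ts", to_timestamp(col("event_ts")))

    cleaned.write
      .mode("overwrite")
      .parquet("abfss://clean@yourstorage.dfs.core.windows.net/events/")

    spark.stop()
  }
}
```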


Technical Support and Version Control: Provide technical support for data analysis while managing source code and configuration artifacts via GitHub. Deploy code artifacts through GitHub Workflows/Actions.


Technical Leadership: Offer hands-on technical guidance and leadership in developing Spark-based data processing applications using Scala, with a focus on Microsoft Azure Synapse Spark Runtime.


Pipeline Optimization: Design and enhance data pipelines to streamline processing across various stages of the Medallion architecture using Azure Synapse Pipelines.
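
As a rough sketch of how the bronze/silver/gold stages of a Medallion layout can be expressed in Spark and Scala (the actual pipelines here are orchestrated with Azure Synapse Pipelines; the table and column names below are invented):

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions._

object MedallionStages {
  // Bronze: land the raw files as-is, tagging each row with ingestion metadata.
  def bronze(spark: SparkSession, sourcePath: String): DataFrame =
    spark.read.json(sourcePath).withColumn("ingested_at", current_timestamp())

  // Silver: cleanse and conform the bronze data.
  def silver(bronzeDf: DataFrame): DataFrame =
    bronzeDf.dropDuplicates("order_id").filter(col("order_id").isNotNull)

  // Gold: aggregate into a business-level view ready for analytics.
  def gold(silverDf: DataFrame): DataFrame =
    silverDf.groupBy("customer_id")
      .agg(sum("amount").as("total_amount"), count(lit(1)).as("order_count"))
}
```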


Data Management: Oversee data ingestion processes, enforce data quality checks using tools like DQ, and manage validation and error-handling workflows.
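
The posting does not spell out the DQ tooling, so purely as a generic illustration, a quality gate of this kind can be sketched directly on Spark DataFrames (rule names and columns are hypothetical):

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions._

// A hypothetical quality rule: a name plus a function counting violating rows.
final case class QualityRule(name: String, violations: DataFrame => Long)

object QualityChecks {
  // Example rules; the column names are placeholders.
  val rules: Seq[QualityRule] = Seq(
    QualityRule("non_null_id", df => df.filter(col("id").isNull).count()),
    QualityRule("positive_amount", df => df.filter(col("amount") <= 0).count())
  )

  // Run every rule and fail fast, so downstream zones never receive bad data.
  def validate(df: DataFrame): Unit =
    rules.foreach { rule =>
      val bad = rule.violations(df)
      if (bad > 0)
        throw new IllegalStateException(s"Quality rule '${rule.name}' failed for $bad rows")
    }
}
```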


Configuration Management: Develop and manage configuration settings using JSON-based configurations (e.g., ApplicationConfig, TableConfig) for multiple data zones.
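
The exact shape of ApplicationConfig and TableConfig is not given in the posting; as a sketch only, such JSON configurations could be modelled as Scala case classes and decoded with a JSON library such as circe (an assumption, not a stated part of the stack):

```scala
import io.circe.generic.auto._
import io.circe.parser.decode

// Hypothetical shapes; the real ApplicationConfig/TableConfig fields will differ.
final case class TableConfig(name: String, zone: String, format: String, path: String)
final case class ApplicationConfig(environment: String, tables: List[TableConfig])

object ConfigLoader {
  // Decode an ApplicationConfig from a JSON string loaded from storage or the repo.
  def load(json: String): Either[io.circe.Error, ApplicationConfig] =
    decode[ApplicationConfig](json)
}

// Example usage with an inline JSON document:
// ConfigLoader.load("""{"environment":"dev","tables":[{"name":"orders","zone":"silver","format":"delta","path":"/silver/orders"}]}""")
```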


Collaboration: Work closely with data scientists, analysts, and cross-functional teams to ensure smooth integration of data engineering efforts with marketing and business strategies.


Logging and Auditing: Implement and manage logging, auditing, and error-handling practices to maintain data processing integrity, leveraging tools like Azure Log Analytics and KQL queries where applicable.
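
On the application side, audit information is often emitted as structured log lines that Azure Log Analytics can later query with KQL; the record shape below is an assumption for illustration, not the team's actual schema:

```scala
import org.slf4j.LoggerFactory
import java.time.Instant

// A hypothetical audit record for one pipeline step; field names are placeholders.
final case class AuditRecord(pipeline: String, stage: String, rowCount: Long,
                             status: String, at: Instant = Instant.now())

object AuditLog {
  private val logger = LoggerFactory.getLogger("pipeline-audit")

  // Emit one structured, easily queryable line per processing step.
  def record(entry: AuditRecord): Unit =
    logger.info(s"pipeline=${entry.pipeline} stage=${entry.stage} " +
      s"rows=${entry.rowCount} status=${entry.status} at=${entry.at}")
}
```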


Testing and Quality Assurance: Conduct unit testing with tools like ScalaTest and maintain rigorous data quality checks to ensure dependable processing outcomes.
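
A minimal example of the kind of ScalaTest unit test this implies, using a local SparkSession and an invented deduplication transform:

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.col
import org.scalatest.funsuite.AnyFunSuite

class DeduplicateSpec extends AnyFunSuite {

  // A local SparkSession so the test runs without a cluster.
  private val spark = SparkSession.builder()
    .master("local[1]")
    .appName("dedup-test")
    .getOrCreate()

  import spark.implicits._

  // The transformation under test: drop duplicate and null order ids.
  private def deduplicate(df: DataFrame): DataFrame =
    df.dropDuplicates("order_id").filter(col("order_id").isNotNull)

  test("duplicate and null order ids are removed") {
    val input = Seq(("o1", 10.0), ("o1", 10.0), (null, 5.0)).toDF("order_id", "amount")
    assert(deduplicate(input).count() == 1)
  }
}
```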


Required Technical Skills

  • Data Engineering
  • Python, Scala, Spark
  • SQL
  • Microsoft Azure
  • GitHub
  • JSON