Data Engineer
Dotcommunity
Kraków
Type of work: Undetermined
Experience: Mid
Employment Type: Permanent
Operating mode: Office

Tech stack

    SQL: master
    Data processing: advanced
    Data storage: advanced
    English: advanced
    Big Data: regular
    Spark: nice to have
    AWS: nice to have
    Yarn: nice to have
    Airflow: nice to have
    Microservices: nice to have

Job description

We are looking for a candidate to join our client's team as a Data Engineer. The client is a software house operating in the advertising and media industry, with an office on Armii Krajowej street.

Advertising Solutions is a relatively new area in the Media organisation, housing the engineering teams for the back-office systems used by various sales organisations at the company. These systems include Rose, which is used to book advertising campaigns, and Vantage, which provides campaign reporting.


ABOUT THE TEAM

They are now looking to establish a team to own and operate the Advertising API that underpins these products, along with others within Schibsted. You will be off to a running start: with the help of their established teams, you will learn the ropes of the existing systems, then plan and execute the hand-over from the current team in London. Expect initial travel to London and/or hosting the London team members locally.

Once you have learned the system, you and your teammates will continuously work on its technical evolution, scaling, and simplification. You will be expected to take an active part in deciding how to implement new features together with the neighbouring teams that depend on you for their work.


SKILLS & REQUIREMENTS

They handle more than 250,000 campaigns, 100,000 advertisers, and more than 140 publishers across 20 countries. About 1.5 TB of data is processed every day by more than 100 Spark jobs.

Their data pipeline is built on AWS EMR, Spark, Yarn, and Airflow, with microservices based on Twitter's Finatra framework. Apache Avro and Parquet are used for data serialization and for schema definition and evolution.
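To make that stack concrete, here is a minimal sketch (not the client's actual code) of the kind of batch step such a pipeline might run on EMR: a Scala Spark job that reads Avro events and writes an aggregated Parquet output. The bucket, paths, and column names are hypothetical, and reading Avro assumes the spark-avro package is on the classpath.

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions.{col, count}

    object DailyCampaignStats {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("daily-campaign-stats")
          .getOrCreate()

        // Hypothetical S3 path and column names, for illustration only.
        val events = spark.read
          .format("avro") // requires the spark-avro package
          .load("s3://example-bucket/ad-events/dt=2024-01-01/")

        val stats = events
          .groupBy(col("campaign_id"))
          .agg(count("*").as("impressions"))

        // Parquet keeps the schema alongside the data for downstream jobs.
        stats.write
          .mode("overwrite")
          .parquet("s3://example-bucket/campaign-stats/dt=2024-01-01/")

        spark.stop()
      }
    }

In a setup like the one described, a job of this shape would typically be one task in an Airflow DAG, scheduled per day-partition.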

They don't expect you to have experience with all of the technologies they use, but it would be good if you know at least some of them or have worked with similar ones.


  • You should be deeply interested in data processing and data storage in general. You need to be well-versed in databases, mainly SQL: query optimization, schema design, and indexing.
  • You should understand big data concepts such as MapReduce, the CAP theorem, and Bigtable (see the sketch after this list).
  • This role does not require cooperation with data science, at least at this point, and their scale is substantial but not huge.
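For the MapReduce concept mentioned above, the classic illustration is word count. This small Scala sketch (paths hypothetical, written against Spark's RDD API rather than any particular framework the client uses) shows the map and reduce phases explicitly:

    import org.apache.spark.sql.SparkSession

    object WordCount {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder().appName("wordcount").getOrCreate()
        val sc = spark.sparkContext

        val counts = sc.textFile("s3://example-bucket/docs/") // hypothetical input
          .flatMap(_.split("\\s+"))   // map: split each line into words
          .map(word => (word, 1))     // map: emit a (word, 1) pair per word
          .reduceByKey(_ + _)         // reduce: sum the counts per key

        counts.saveAsTextFile("s3://example-bucket/word-counts/") // hypothetical output
        spark.stop()
      }
    }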