
Senior Data Engineer / Data Architect
Category: Data

Location: Kraków
Type of work: Undetermined
Experience: Senior
Employment type: B2B
Operating mode: Remote

Tech stack

    Databricks: advanced
    Spark: advanced
    Azure: advanced
    Python: advanced
    Databases: regular
    Kafka: regular
    Event Hub: regular
    Kubernetes: regular

Job description

Online interview

Description

We are on a mission to make science open so everyone can live healthy lives on a healthy planet.


Who we are

Frontiers is an award-winning open science platform and leading open access scholarly publisher.

We are one of the largest and most cited publishers globally. To date, our 200,000 freely available research articles have received more than 1 billion views and downloads and 2 million citations. Our journals span science, health, humanities and social sciences, engineering, and sustainability. And we continue to expand into new academic disciplines so more researchers can publish open access.

Be part of the publishing revolution and help us transform the way research is published, evaluated, and communicated to the world.

The Role

To empower scientists and radically improve how science is published, evaluated and disseminated to researchers, innovators and the public, we have built our own state-of-the-art Artificial Intelligence Review Assistant (AIRA). Data is at the heart of AIRA in the form of AIRA Knowledge – a rich graph of academic knowledge such as scientific publications, citation relationships between those publications, as well as authors, institutions and fields of research. This serves as the basis of all the AI/ML models used by our reviewer recommendation service and our quality checks.
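
The AIRA Knowledge schema itself is not described here, but as a rough illustration, a graph like this is commonly modeled as node and edge tables. A minimal Spark SQL sketch with hypothetical database, table, and column names:

    # Hypothetical node/edge layout for a citation graph; every name below is
    # illustrative and not the actual AIRA Knowledge schema.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("knowledge-graph-sketch").getOrCreate()
    spark.sql("CREATE DATABASE IF NOT EXISTS graph")

    # Node table: one row per publication.
    spark.sql("""
        CREATE TABLE IF NOT EXISTS graph.publications (
            publication_id STRING,
            doi STRING,
            title STRING,
            field_of_research STRING
        ) USING DELTA
    """)

    # Edge table: one row per citation relationship between publications;
    # authors and institutions would get analogous node and edge tables.
    spark.sql("""
        CREATE TABLE IF NOT EXISTS graph.citations (
            citing_id STRING,
            cited_id STRING
        ) USING DELTA
    """)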

We are now looking for a passionate Senior Data Engineer / Data Architect to join our growing team and help us evolve AIRA Knowledge.


Key Responsibilities

As a Senior Data Engineer, you will be responsible for optimizing, or even re-designing, AIRA Knowledge’s data architecture to support our next generation of product features and data initiatives. You will expand and optimize our data pipeline architecture, as well as data flow and collection for AIRA.

The ideal candidate is an experienced data pipeline builder and data wrangler who enjoys optimizing existing data systems or building them from the ground up. You will work together with other data engineers, software developers, data analysts, and data scientists on data initiatives, and will ensure that the data delivery architecture remains optimal and consistent across ongoing projects.


  • Design and develop scalable end-to-end processes and pipelines to consume, integrate, and analyze complex data from different data sources;
  • Assemble large, complex data sets that meet functional and non-functional business requirements;
  • Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, and so on;
  • Build the solutions required for optimal extraction, transformation, and loading of data from a wide variety of data sources using Azure Big Data, AI, ML, and Analytics technologies (a sketch of such a pipeline follows this list);
  • Apply a strong engineering mindset to the design and development of automated monitoring, alerting, and self-healing features;
  • Work together with other data engineers, software developers, data analysts, and data scientists to strive for greater functionality in the platform;
  • Proactively identify opportunities to improve data management standards, guidelines, and policies.
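
For illustration, a minimal PySpark sketch of the kind of batch ETL pipeline these responsibilities describe, runnable on Databricks. The storage path, schema, and table names are assumptions for illustration only:

    # Ingest raw publication records from a landing zone, normalize them, and
    # load them into a curated Delta table. All names below are hypothetical.
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("publications-etl").getOrCreate()

    # Read newline-delimited JSON from an Azure Data Lake container (hypothetical path).
    raw = spark.read.json("abfss://landing@examplelake.dfs.core.windows.net/publications/")

    # Normalize key fields and drop duplicate records keyed on DOI.
    curated = (
        raw.select(
            F.col("doi"),
            F.trim(F.col("title")).alias("title"),
            F.col("published_at").cast("date").alias("published_at"),
        )
        .dropDuplicates(["doi"])
    )

    # Write as a Delta table so downstream consumers get ACID guarantees.
    curated.write.format("delta").mode("overwrite").saveAsTable("curated.publications")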

About You

  • Demonstrated experience in designing and developing data ingestion, data processing, and analytical pipelines for big data, relational database, NoSQL, and data warehouse solutions.
  • Proven experience with Enterprise Data Platform architecture, event-driven architecture, data streaming, software design patterns, and best practices.
  • Experience building and optimizing ‘big data’ pipelines, architectures, and data sets.
  • Strong analytical skills related to working with unstructured datasets.
  • Experience building processes supporting data transformation, data structures, metadata, dependency management, and workload management.
  • A successful history of manipulating, processing, and extracting value from large, disconnected datasets.
  • Working knowledge of message queuing, stream processing, and highly scalable ‘big data’ data stores is nice to have.
  • Robust analytical, critical-thinking, and creative problem-solving skills.
  • Comfortable in fast-paced environments with simultaneous high-priority tasks.
  • Good written and verbal communication skills, with the ability to clearly articulate ideas to both technical and non-technical audiences (the working language is English).
  • Technologies:
    • Hands-on experience implementing data migration, streaming, and processing using cloud services, preferably Azure or AWS.
    • Advanced experience with analytics platforms: Databricks and Spark.
    • Experience with Azure Data Lake, Azure Delta Lake, Azure Data Factory, Azure Functions, Azure Synapse, Azure SQL, Azure Stream Analytics, Azure Analysis Services, and Azure ML Studio.
    • Experience with relational SQL and NoSQL databases.
    • Knowledge of stream-processing systems: Event Hub, Kafka, and Confluent (see the streaming sketch after this list).
    • Knowledge of object-oriented, functional, and scripting languages such as Python, R, Scala, and C++.
    • Knowledge of reporting tools: Power BI and Tableau.
    • Knowledge of CI/CD tooling and infrastructure: Azure DevOps and Kubernetes.
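
To illustrate the stream-processing side, here is a minimal Spark Structured Streaming sketch that reads from Event Hubs through its Kafka-compatible endpoint (Event Hubs speaks the Kafka protocol on port 9093). The namespace, topic, table name, and connection-string handling are assumptions, and the job needs the Spark Kafka connector, which Databricks bundles:

    # Consume events from Azure Event Hubs via its Kafka endpoint and land them
    # in a raw Delta table. Namespace, topic, and table names are hypothetical.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("citations-stream").getOrCreate()

    stream = (
        spark.readStream.format("kafka")
        # Event Hubs exposes a Kafka-compatible endpoint on port 9093.
        .option("kafka.bootstrap.servers", "example-namespace.servicebus.windows.net:9093")
        .option("subscribe", "citation-events")
        .option("kafka.security.protocol", "SASL_SSL")
        .option("kafka.sasl.mechanism", "PLAIN")
        # Event Hubs authenticates with the literal username "$ConnectionString";
        # in practice the secret would come from a secret scope, not a literal.
        .option(
            "kafka.sasl.jaas.config",
            'org.apache.kafka.common.security.plain.PlainLoginModule required '
            'username="$ConnectionString" password="<event-hubs-connection-string>";',
        )
        .load()
    )

    # Keep the raw payload and arrival time; downstream jobs parse the JSON body.
    query = (
        stream.selectExpr("CAST(value AS STRING) AS payload", "timestamp")
        .writeStream.format("delta")
        .option("checkpointLocation", "/tmp/checkpoints/citation-events")
        .toTable("raw_citation_events")
    )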