Head of Data Engineering

Offer expired

Data

SelfDecode

London

Type of work: Undetermined
Experience: Senior
Employment Type: Permanent
Operating mode: Remote

Tech stack

Python (advanced)
AWS (advanced)
Scala (advanced)
Airflow (advanced)
Apache Spark (advanced)
Apache Hadoop (regular)
Apache Kafka (regular)
NoSQL (regular)
HBase (regular)
Cassandra (regular)

Job description

Online interview

Friendly offer

About Us:

SelfDecode is a well-funded biotech startup in the personalized health space. We build software to help interpret people's genetics, lab tests, and symptoms in order to give personalized health recommendations. Our primary goal is to give people the tools they need to live healthier and better lives.

We are a flat organization and prioritize efficiency.
We work as a team and every input and suggestion is taken into account, no matter who it comes from.
We thrive on open communication and dedication. We are a meritocracy: people who demonstrate strong skills can move up in the organization quickly and earn raises.
We expect people to work full-time without side gigs.
We expect applicants to seek a long-term relationship with our company.
We expect employees to be proactive and autonomous.
We do not micromanage.
Dishonesty is not tolerated at all, and we thrive on trust.
When you're working, we expect you to work.
We emphasize skills & abilities rather than formal education.

Job Description:

As Head of Data Engineering, you will lead a team of software engineers responsible for the design, development, and operation of all back-end services. This includes data integration and ingestion, processing, and the application of machine learning models and algorithms to large, complex biological data sets. Our product engineering teams use these back-end services to build and deliver cutting-edge genomic analysis to our customers. We have decided to modernize and re-architect our core data platform and move from a batch-based processing system to a continuous event-based streaming system. The core technologies of the platform will include AWS, Airflow, HDFS/Hadoop, Spark, Kafka, NoSQL (HBase/Cassandra), and ClickHouse. We primarily use Python.

The role reports to the CTO and gives you the opportunity to make a significant impact on the company's success at a critical stage in our growth. This is an incredible opportunity to discover the world of real-time data processing and the use of artificial intelligence at scale. You will play an active role in leading the design, development, and deployment of our next-generation platform, which is critical to our success.

We are looking for a strong engineering leader who is a strategic and innovative problem-solver, someone with a passion for applying technology to solve real-world customer problems at scale. The ideal candidate is passionate about building high-performance teams that are focused on quality and innovation, and demonstrates excellent organizational and communication skills with other engineers and leaders throughout the company.

Responsibilities:

Lead and develop a team of talented data/software engineers to design, plan, develop, and deploy improvements to back-end platform services related to data ingestion, data processing, and analytics.
Create a culture of working with big and sensitive data.
Design the architecture and then lead the implementation of scalable data processing systems.
Plan the development of a data platform as a SaaS product.
Collaborate broadly across the organization and with senior leadership to drive team and individual performance focused on clear outcomes and team OKRs.
Evaluate resource costs, determine the composition of the required team, define the top-level roadmap, and perform project risk assessments.
Foster the adoption of best engineering practices across all aspects of software development to build, deploy, test, and release large-scale services with quality and agility, while maintaining our current platform to continue to meet customer commitments.
Facilitate the overall technology strategy and the quarterly and yearly goals, drive engineering best practices, and take ownership of delivering on core outcomes.

Required Skills & Experience:

7+ years of extensive experience in Data technologies across streaming and batch-oriented realms, cutting across data acquisition, storage, processing, and consumption patterns in operational and analytical domains, as well as expertise in cloud-related data services (AWS / Azure / GCP).
5+ years leading highly technical, high-performance engineering teams, with experience in people management (hiring and layoffs) and performance management (coaching and mentoring). Proven track record of leading the technical architecture, design, and delivery of complex Big Data and Cloud Data solutions (AWS, Azure, GCP) across multiple projects to solve problems at scale, especially distributed data platforms (Hadoop/Kafka).
Expert in distributed data processing frameworks like Spark, Storm, and Flink across batch and streaming realms, and in columnar storage formats like Parquet; expert in programming languages, preferably Scala, with Python secondary; and expert in distributed messaging/streaming frameworks like Kafka, Pulsar, Google Pub/Sub, Azure Event Hubs, and AWS Kinesis.
Experience with NoSQL databases (Cassandra/HBase/MongoDB/Elasticsearch/Neo4j) and scalable, analytical data stores like Snowflake, BigQuery, Redshift, and Teradata.
Professional experience with workflow management (Nextflow, Snakemake, Airflow, etc.).
Deep knowledge of scalable data models, queries, and operations that address various consumption patterns, including random-access and sequential-access, and necessary optimisations like bucketing, aggregating, and sharding.
Experience in performance-tuning, optimization, and scaling solutions from a storage/processing standpoint.
Experience setting up data engineering practices across architecture, design, coding, quality assurance, and deployment, using industry-standard DevOps practices for CI/CD and leveraging tools like Jenkins/Bamboo, Maven, JUnit, SonarQube, Terraform (one-click infrastructure setup), Kubernetes, and containerisation.
Solid understanding of Data Governance, Data Security, Data Cataloguing, and Data Lineage concepts (experience with tools like Collibra in these areas is preferred).
Passion for recruiting, developing, mentoring, and retaining a world-class engineering team.
Lean-thinking mindset, comfortable with Agile planning and estimation rituals, flexible, and able to thrive in a fast-paced, innovative young company.
Excellent written and verbal English-language communication skills, with the ability to adapt the level of detail to various audiences, and able to concisely explain technical concepts to business stakeholders.

Plusses:

Knowledge of statistical techniques
Bioinformatics knowledge
Strong math ability

Important: Share your LinkedIn profile. Having an up-to-date LinkedIn profile will make you a more competitive applicant! If you're up for the challenge, then we invite you to apply!

Questions?

If you have any questions, you can email us at recruiting@selfdecode.com

Note: Please complete the application and pre-screening within one week of starting.
