    Head of Data Engineering

    SelfDecode

    London

    Type of work: Undetermined
    Experience: Senior
    Employment type: Permanent
    Operating mode: Remote

    Tech stack

    • Python: advanced
    • AWS: advanced
    • Scala: advanced
    • Airflow: advanced
    • Apache Spark: advanced
    • Apache Hadoop: regular
    • Apache Kafka: regular
    • NoSQL: regular
    • HBase: regular
    • Cassandra: regular

    Job description

    Online interview
    Friendly offer
    About Us:

    SelfDecode is a well-funded biotech startup in the personalized health space. We build software to help interpret people's genetics, lab tests, and symptoms in order to give personalized health recommendations. Our primary goal is to give people the tools they need to live healthier and better lives.

    • We are a flat organization and prioritize efficiency.
    • We work as a team and every input and suggestion is taken into account, no matter who it comes from.
    • We thrive on open communication and dedication. We are a meritocracy: people who demonstrate strong abilities and skills can advance quickly in the organization and earn raises.
    • We expect people to work full-time without side gigs.
    • We expect applicants to commit to a long-term relationship with our company.
    • We expect employees to be proactive and autonomous.
    • We do not micromanage.
    • Dishonesty is not tolerated at all, and we thrive on trust.
    • When you're working, we expect you to work.
    • We emphasize skills & abilities rather than formal education.

    Job Description:

    As Head of Data Engineering, you will lead a team of software engineers responsible for the design, development, and operation of all back-end services. This includes data integration and ingestion, processing, and the application of machine learning models and algorithms to large, complex biological data sets. Our product engineering teams use these back-end services to build and deliver cutting-edge genomic analysis to our customers. We have decided to modernize and re-architect our core data platform, moving from a batch-based processing system to a continuous, event-based streaming system. The core technologies of the platform will include AWS, Airflow, HDFS/Hadoop, Spark, Kafka, NoSQL (HBase/Cassandra), and ClickHouse. We primarily use Python.
     
    The role reports to the CTO, and you will have the opportunity to make a significant impact on the company's success at a critical stage in our growth. This is an incredible opportunity to discover the world of real-time data processing and the use of artificial intelligence at scale. You will play an active role in leading the design, development, and deployment of our next-generation platform, which is critical to our success.
     
    We are looking for a strong engineering leader who is a strategic and innovative problem-solver, someone with a passion for applying technology to solve real-world customer problems at scale. The ideal candidate is passionate about building high-performance teams focused on quality and innovation, and demonstrates excellent organizational and communication skills when working with other engineers and leaders throughout the company.

    Responsibilities:

    • Lead and develop a team of talented data/software engineers to design, plan, develop, and deploy improvements to back-end platform services related to data ingestion, data processing, and analytics.
    • Create a culture of working with big and sensitive data.
    • Design the architecture and then lead the implementation of scalable data processing systems.
    • Plan the development of a data platform as a SaaS product.
    • Collaborate broadly across the organization and with senior leadership to drive team and individual performance focused on clear outcomes and team OKRs.
    • Evaluate resource costs, determine the composition of the required team, define the top-level roadmap, and perform project risk assessments.
    • Foster the adoption of best engineering practices across all aspects of software development to build, deploy, test, and release large scale services with quality and agility, while maintaining our current platform to continue to meet customer commitments.
    • Facilitate the overall technology strategy and the quarterly and yearly goals, drive engineering best practices, and take ownership of delivering on core outcomes.

    Required Skills & Experience:

    • 7+ years of extensive experience in Data technologies across streaming and batch-oriented realms, cutting across data acquisition, storage, processing, and consumption patterns in operational and analytical domains, as well as expertise in cloud-related data services (AWS / Azure / GCP).
    • 5+ years leading highly technical, high-performance engineering teams, with experience in people management (hiring and layoffs) and performance management (coaching and mentoring). Proven track record of leading the architecture, design, and delivery of complex Big Data and Cloud Data solutions (AWS, Azure, GCP) across multiple projects to solve problems at scale, especially distributed data platforms (Hadoop/Kafka).
    • Expert in distributed data processing frameworks like Spark, Storm, and Flink, and in columnar storage formats like Parquet, across batch and streaming realms; expert in programming languages, preferably Scala, with Python secondary; and expert in distributed messaging/streaming frameworks like Kafka, Pulsar, Google Pub/Sub, Azure Event Hubs, and AWS Kinesis.
    • Experience with NoSQL databases (Cassandra/HBase/MongoDB/Elasticsearch/Neo4j) and scalable analytical data stores like Snowflake, BigQuery, Redshift, and Teradata.
    • Professional experience with workflow management (Nextflow, Snakemake, Airflow, etc.).
    • Deep knowledge of scalable data models, queries, and operations that address various consumption patterns, including random-access and sequential-access, and necessary optimisations like bucketing, aggregating, and sharding.
    • Experience in performance-tuning, optimization, and scaling solutions from a storage/processing standpoint.
    • Experience setting up data engineering practices across architecture, design, coding, quality assurance, and deployment, using industry-standard DevOps practices for CI/CD and tools like Jenkins/Bamboo, Maven, JUnit, SonarQube, Terraform (one-click infrastructure setup), Kubernetes, and containerisation.
    • Solid understanding of Data Governance, Data Security, Data Cataloguing, and Data Lineage concepts (experience with tools like Collibra in these areas is preferred).
    • Passion for recruiting, developing, mentoring, and retaining a world-class engineering team.
    • Lean-thinking mindset, comfortable with Agile planning and estimation rituals, flexible, and able to thrive in a fast-paced, innovative young company.
    • Excellent written and verbal English-language communication skills, with the ability to adapt the level of detail to various audiences, and able to concisely explain technical concepts to business stakeholders.

    Plusses:

    • Knowledge of statistical techniques
    • Bioinformatics knowledge
    • Strong math ability

    Your Time Zone:

    • Any

    Important: Share your LinkedIn profile. Having an up-to-date LinkedIn profile will make you a more competitive applicant! If you're up for the challenge, then we invite you to apply!
     
    Questions?
    If you have any questions, you can email us at recruiting@selfdecode.com
     
    Note: Please complete the application and pre-screening within one week of starting.
