Senior Data Engineer with Databricks and PySpark, Exposure Platform
Project overview
The Exposure Platform is a long-running enterprise data transformation initiative focused on replacing legacy SQL-based logic with modern, scalable data processing solutions. The platform handles large data volumes and supports multiple business domains through a shared and business-critical codebase. The work emphasises maintainability, performance, and strong software engineering practices across the organisation.
Team
You will work in a small cross-functional data engineering team consisting of senior data engineers and software engineers. The team collaborates closely through code reviews, shared design discussions, and agreed engineering standards while contributing to a common codebase. The team is currently expanding and values structured collaboration and technical ownership.
Position overview
We are looking for a Senior Data Engineer to support a large-scale transformation from SQL Server-based systems to a Databricks and Delta Lake platform. You will focus on enterprise-grade data engineering and software development, building maintainable and scalable data processing solutions used by multiple teams. This role is not focused on analytics or reporting, but on core data transformation and platform development.
Technology stack
Databricks, Delta Lake, Python, PySpark, SQL Server, Azure Data Factory, Azure DevOps, Git, CI/CD pipelines
Responsibilities
Read, understand, and reason about complex SQL stored procedures and embedded business logic
Redesign and implement existing SQL logic as clean and maintainable Python and PySpark code in Databricks
Develop production-grade transformation code using reusable packages, modules, and components
Apply software engineering best practices, including clean code, object-oriented design, modularisation, and refactoring
Design and evolve data models across Bronze, Silver, and Gold layers
Work with very large data volumes and highly parallel event-driven data transformations
Participate actively in code reviews and technical design discussions
Contribute to the stability, scalability, and long-term maintainability of the shared data engineering codebase
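To illustrate the kind of work described above, here is a minimal, hypothetical sketch of extracting a business rule embedded in a SQL stored procedure into a pure, unit-testable Python function. The function name, thresholds, and labels are illustrative assumptions, not logic from the actual Exposure Platform codebase.

```python
# Hypothetical sketch: a CASE expression from a SQL stored procedure,
# re-implemented as a pure Python function so the rule has a single,
# tested definition. All names and thresholds are illustrative.

def classify_exposure(amount: float, is_collateralised: bool) -> str:
    """Mirror of a SQL CASE expression such as:

    CASE WHEN is_collateralised = 1 THEN 'SECURED'
         WHEN amount >= 1000000   THEN 'LARGE_UNSECURED'
         ELSE 'STANDARD' END
    """
    if is_collateralised:
        return "SECURED"
    if amount >= 1_000_000:
        return "LARGE_UNSECURED"
    return "STANDARD"
```

In PySpark, the same rule could be expressed as a column expression with `F.when(...).otherwise(...)`, or, for more complex branching, reused via a UDF so the tested Python function remains the single source of truth.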
Requirements
Strong experience with Python and PySpark in production data engineering environments
Hands-on experience working with Databricks and Delta Lake
Strong SQL skills with the ability to read, analyse, and translate complex stored procedures
Experience working in large shared codebases beyond notebook-based development
Solid understanding of object-oriented programming and software engineering principles
Experience applying clean code practices, refactoring, and maintainable design
Strong background in data modelling, including transactional and analytical models
Experience working with layered data architectures such as Bronze, Silver, and Gold
Ability to analyse existing code line by line and explain technical and business logic clearly
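As a rough illustration of the layered architecture mentioned above, the following sketch models Bronze, Silver, and Gold stages with plain Python structures standing in for Delta tables. Field names and transformation rules are assumptions for illustration only, not the platform's actual schema.

```python
# Hypothetical Bronze/Silver/Gold sketch: raw records land in Bronze,
# Silver cleans and validates them, Gold aggregates to a business view.
from collections import defaultdict


def to_silver(bronze_rows: list[dict]) -> list[dict]:
    """Silver: validated records; drop rows missing key fields, cast types."""
    return [
        {**r, "amount": float(r["amount"])}
        for r in bronze_rows
        if r.get("account_id") and r.get("amount") is not None
    ]


def to_gold(silver_rows: list[dict]) -> dict:
    """Gold: business-level aggregate, e.g. total exposure per account."""
    totals: dict = defaultdict(float)
    for r in silver_rows:
        totals[r["account_id"]] += r["amount"]
    return dict(totals)


# Bronze: raw ingested records, kept as-landed
bronze = [
    {"account_id": "A1", "amount": "100.0"},
    {"account_id": "A1", "amount": "50.5"},
    {"account_id": None, "amount": "999"},  # invalid: dropped in Silver
    {"account_id": "B2", "amount": "10.0"},
]
```

In a Databricks setting these stages would typically be PySpark DataFrames written to Delta tables per layer; the point of the sketch is the separation of concerns between ingestion, validation, and aggregation.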
Nice to have
Experience with Power BI for data consumption or validation
Exposure to enterprise-scale data platforms in complex organisational environments