Databricks Data Engineer
Responsibilities:
Developing robust batch and Structured Streaming workflows.
Partnering with BI and Analytics teams on data modeling tasks.
Managing and optimizing data structures (partitioning, Delta tables, performance tuning).
Handling large-scale data processing using PySpark and Spark SQL.
Integrating data from various sources, including APIs, databases, and file systems.
Delivering ETL/ELT pipelines within the Databricks environment.
Overseeing cost efficiency and performance of Databricks clusters.
Applying Lakehouse and Delta Lake architectural solutions.
Managing CI/CD pipelines across DEV, TEST, and PROD stages.
Ensuring high standards through data quality tests and monitoring.
Requirements:
4+ years of professional experience in Data Engineering.
Advanced knowledge of Spark SQL, PySpark, and Apache Spark.
Expertise in Delta Lake and the Lakehouse architecture.
Practical, hands-on experience with the Databricks platform.
Proven track record in ETL/ELT design and implementation.
Strong proficiency in SQL and version control (Git).
Background in processing and managing massive datasets.
Familiarity with cloud providers (AWS, Azure, or GCP).
Understanding of Data Quality and Governance principles.