Data Engineer - PySpark
Józefa Kustronia 4, Kraków
Hirexa
Experience:
• 5–10 years of hands-on experience in the data analytics space as a data engineer. Familiar with ETL, DQ, DM, and reject and recycling concepts.
• A significant portion of this experience should involve building data analytics solutions in a big data environment, using Hadoop clusters or a cloud environment.
Technical Skills:
• Extensive experience in building data pipelines using Spark, particularly PySpark.
• Candidates must have a minimum of 3 years of hands-on experience coding PySpark applications using RDDs, DataFrames, and Datasets, not Spark SQL.
• Candidates should have developed numerous Spark applications for various use cases involving large volumes of data, applied performance tuning, and worked extensively on complex transformations such as grouping and window operations.
• Experience participating in a PySpark coding hackathon.
• Please apply only if you are confident in writing decent PySpark code during the interview.
• Strong proficiency in Python.
• Write clean, efficient, and reusable Python code.
• Identify, troubleshoot, and fix bugs in programs to ensure code quality.
• Create scripts and tools to automate tasks and processes.
Note: This role requires advanced Python coding skills, not just basic knowledge or simple coding experience. Candidates will be required to demonstrate their Python skills during the interview.
Familiarity with or exposure to tools such as Airflow, Databricks, and Azure is a plus. The primary focus is on Spark, PySpark, and data engineering.
Additional Skills:
• Strong problem-solving and analytical abilities.
• Ability to comprehend business requirements and translate them into technical solutions.
• Good communication and collaboration skills to work effectively with team members.
• Familiarity with the software development lifecycle, including CI/CD pipelines.
• Experience working in an Agile environment.