Senior/Lead Data Software Engineer (Python, Spark, Azure)
We are seeking a Senior/Lead Data Software Engineer to join our team working on a scalable, ML-ready platform that enhances portfolio model development and deployment with advanced data governance and AI capabilities.
You will play a key role in migrating from an IaaS Big Data platform to Azure-native Databricks, optimizing data workflows and improving data quality. Join us to contribute to innovative solutions that boost client services and regulatory compliance.
Responsibilities
Migrate over 500 data jobs and optimize them using Azure Databricks tuning techniques
Manage and process 12 TB of data efficiently across platforms
Tune machine learning models for Azure environments using Java Spark and Delta tables
Update and maintain libraries to address security vulnerabilities
Develop and maintain ETL/ELT pipelines using PySpark and related technologies
Collaborate with cross-functional teams to integrate GenAI capabilities into data workflows
Monitor data quality and implement improvements to ensure accuracy and reliability
Automate deployment and operational tasks using Terraform and GitLab CI/CD
Support data governance initiatives to comply with regulatory standards
Troubleshoot and resolve performance issues in data processing systems
Document system processes and provide technical guidance to junior engineers
Implement best practices for code quality and data security
Participate in code reviews and knowledge sharing sessions
Optimize costs associated with data storage and processing
Requirements
Proficiency in Python and Spark, with at least 3 years of experience in data engineering roles
Strong experience with Azure Databricks and PySpark
Proven expertise in designing and implementing ETL/ELT solutions
Experience migrating big data platforms to Azure-native services
Proficiency with Delta tables for model tuning
Knowledge of data governance and regulatory compliance frameworks
Familiarity with Docker, Kubernetes (AKS), and Terraform for infrastructure automation
Ability to manage large data volumes with high efficiency
Excellent problem-solving and analytical skills
Strong communication and collaboration abilities
English proficiency at B2 level or higher
We offer/Benefits
We gather like-minded people:
Engineering community of industry professionals
Friendly team and enjoyable working environment
Flexible schedule and opportunity to work remotely within Poland
Chance to work abroad for up to 60 days annually
Business-driven relocation opportunities
We provide growth opportunities:
Outstanding career roadmap
Leadership development, career advising, soft skills, and well-being programs
Certification (GCP, Azure, AWS)
Unlimited access to LinkedIn Learning, getAbstract, and A Cloud Guru
English classes
We cover it all:
Stable income (Employment Contract or B2B)
Participation in the Employee Stock Purchase Plan
Benefits package (health insurance, multisport, shopping vouchers)
Strategically located offices featuring entertainment and relaxation zones, table tennis and football, free snacks, fantastic coffee, and more
Referral bonuses
Corporate, social and well-being events
Please note:
The set of bonuses might vary based on the role you apply for – specifics will be discussed with our recruiter during the general interview.
We will contact selected candidates only.
EPAM is a leading global provider of digital platform engineering and development services. We are committed to having a positive impact on our customers, our employees, and our communities. We embrace a dynamic and inclusive culture. Here you will collaborate with multi-national teams, contribute to a myriad of innovative projects that deliver the most creative and cutting-edge solutions, and have an opportunity to continuously learn and grow. No matter where you are located, you will join a dedicated, creative, and diverse community that will help you discover your fullest potential.