Senior Data Engineer / Data Scientist
Project info:
We are seeking highly skilled Senior Data Engineers / Data Scientists to join our client's team and strengthen the Data Mastering and Entity Resolution capabilities of the client's platform. The ideal candidate has deep expertise in semantic matching, data normalization, entity resolution, and large-scale data processing. The role is critical to improving mastering accuracy across data ingestion, matching, merging, and rollup processes while ensuring scalability and data quality across enterprise platforms.
Responsibilities:
Data Mastering & Entity Resolution
* Design, develop, and optimize data mastering frameworks and entity resolution solutions.
* Improve matching accuracy across ingestion, merge, and rollup layers using advanced algorithms and AI/ML techniques.
* Build and enhance semantic matching capabilities for structured and unstructured data sources.
* Develop strategies for data standardization, normalization, deduplication, and golden record creation.
Machine Learning & AI
* Leverage machine learning, NLP, embeddings, and AI-based approaches to improve semantic matching and record linkage.
* Develop and deploy models for entity identification, clustering, similarity scoring, and data quality enhancement.
* Evaluate and optimize model performance using precision, recall, F1-score, and other relevant metrics.
Data Engineering & Platform Development
* Design and build scalable, production-grade data pipelines for processing large volumes of data.
* Implement data ingestion, transformation, validation, and mastering workflows.
* Ensure data quality, governance, lineage, and observability across the platform.
* Collaborate with engineering teams to operationalize AI/ML models within production environments.
Cross-Functional Collaboration
* Work closely with product, engineering, architecture, and business teams to define mastering requirements and success metrics.
* Drive continuous improvements in data quality, matching accuracy, and platform scalability.
* Contribute to technical design reviews, architecture discussions, and best practices.
Job requirements:
8+ years of experience in Data Engineering, Data Science, Machine Learning, or related disciplines.
Strong hands-on experience building and maintaining scalable data pipelines and data processing solutions in production environments.
Proven expertise in Data Mastering, Entity Resolution, Record Linkage, Data Matching, or related data quality domains.
Experience applying Machine Learning, NLP, embeddings, and AI-based techniques to solve semantic matching, deduplication, and data normalization challenges.
Strong proficiency in Python and SQL, with experience developing data-intensive applications and analytical solutions.
Experience working with large-scale structured and unstructured datasets.
Hands-on experience designing, developing, and optimizing matching algorithms, similarity scoring models, and golden record creation processes.
Experience implementing data ingestion, transformation, validation, and mastering workflows.
Solid understanding of data quality frameworks, data governance, lineage, and observability principles.
Experience developing, evaluating, and deploying ML models, including performance measurement using metrics such as precision, recall, and F1-score.
Familiarity with distributed data processing frameworks and modern data platform technologies.
Strong analytical and problem-solving skills, with the ability to translate business requirements into scalable technical solutions.
Experience collaborating effectively with cross-functional teams, including Product, Engineering, and Business stakeholders.
Preferred requirements:
Experience with cloud platforms such as AWS, Azure, or GCP.
Experience with Spark, Databricks, or similar large-scale data processing technologies.
Familiarity with vector embeddings, semantic search, similarity matching, or knowledge graph concepts.
Experience operationalizing ML models and supporting MLOps practices in production environments.
Experience working with customer, product, supplier, or reference data mastering use cases.
Must hold a valid permit to work in Poland.
Benefits:
General benefits (depending on the form of employment):
Hybrid work model & remote work
Attractively located office with collaboration spaces
Onsite parking space for employees
Referral program with financial bonus
Life Insurance
Development budget (including language courses and other training) and a clear career path with the opportunity to gain experience in an international environment
Access to an internal Learning Platform with a range of training courses focused on professional growth
Lifestyle benefits:
Access to MyBenefit platform (Multisport included)
Team Building activities
Charity initiatives
Working environment promoting diversity and inclusion
Health benefits:
Private medical care - Platinum Package

Tenarai
Tenarai is a leading global provider of technology solutions and services, specializing in digital transformation, software engineering, cloud services, and enterprise software solutions to empower businesses across vari...