Senior Data Engineer
PROJECT DETAILS:
Contract: B2B
Project length: 3 months + possibility of an extension
Start date: ASAP
Rate: up to 220 PLN/h net + VAT
Working model: hybrid, 3 days from the Cracow office, 2 days remotely
Substantial part-time engagement possible
AI / Data Engineer – Python, NLP, Spark, ML Pipelines
We are looking for an AI / Data Engineer to join an international project focused on building and improving AI-driven data solutions for large-scale web content processing, attribute extraction and market expansion.
In this role, you will work at the intersection of data engineering, machine learning and applied AI. You will help design, build and evaluate data pipelines, fine-tune lightweight ML models, and support the development of internal AI research agents used across different geographic markets and data domains.
Your responsibilities will include:
Building and optimising Spark pipelines for large-scale web content ingestion and processing.
Using Python, including Polars and/or Pandas, for data processing, analysis and pipeline development.
Fine-tuning lightweight ML models for task-specific attribute extraction.
Preparing training data, managing data quality and evaluating model performance end-to-end.
Working with NLP techniques to extract, classify and reason over information from web content.
Expanding an internal AI research agent to new geographic markets and adapting logic to local data conditions.
Supporting evidence collection and reasoning logic for new place-related attributes.
Evaluating ML systems across different locales, domains and data sources.
Working with pipeline orchestration, optimisation and multi-source ingestion processes.
Potentially using Scala and Spark in data engineering tracks.
What we are looking for:
Strong Python skills, especially with Polars and/or Pandas.
Experience with NLP and fine-tuning lightweight ML models.
Practical experience in designing, building and evaluating data pipelines.
Experience with Spark and, ideally, Scala.
Familiarity with agent frameworks, especially LangGraph.
Understanding of data quality, model evaluation and performance measurement.
Ability to adapt ML/data solutions to different countries, languages and data domains.
Experience with pipeline orchestration and optimisation for large-scale data ingestion.
A hands-on, problem-solving mindset and ability to work in a fast-moving environment.
Project details:
Onboarding: 2 weeks in Malmö (fully covered by the Client)
Notice period: ideally no more than about 1 week
Recruitment process: 1 technical and 1 non-technical interview, approximately 60 minutes each.