AI Data Engineer (Python & LLMs)

3 272 - 3 818 USD net per month - B2B
AI/ML

Książęca 4, Warszawa

Tax Insight

Full-time
B2B
Mid
Hybrid

Job description

We are automating tax reporting and accounting processes using AI. You will build pipelines that scrape data from complex external sources and extract high-precision structured data from financial documents (PDFs, invoices, tax forms) using Python and LLMs. 

You will join a tight-knit, fast-moving engineering team in a scaling company, where your code will have an immediate impact on our products. 

You will bridge the gap between law and code, working directly with tax and legal experts to transform complex tax regulations into precise, compliant algorithmic solutions. 

 

Requirements: 

Core (Must-Haves): 

- Python Mastery: 2+ years of professional experience writing clean, maintainable code. 

- Advanced Web Scraping: Proficiency with Playwright (preferred) or Selenium, including experience bypassing anti-bot measures (e.g. CAPTCHAs, rate limits, fingerprinting). Some experience with the BeautifulSoup and requests libraries is also required.

- Data Extraction: Strong hands-on experience with regex and other techniques for transforming messy, real-world data (raw HTML, malformed JSON files, OCR text) into structured formats.

- Data Cleaning: Proficiency with Regex, Pandas, and NumPy for data cleaning and preprocessing. 

- Database Fundamentals: Solid understanding of SQL or NoSQL databases. 

- LLM Integration: Experience in prompt engineering for LLM APIs, including few-shot prompting, defining output schemas, and handling/parsing responses programmatically.

- PDF Handling: Hands-on experience with libraries such as pdfplumber, Docling, or PyMuPDF for extracting data from text-based PDF files of varying quality.

- Mentorship: Ability to perform high-quality code reviews and guide junior developers when necessary. 

- Task Delegation: Ability to break down complex architectural features into clear, manageable tasks for the junior developer to execute. 

- Task Planning: Ability to decompose project stages into weekly sprints, ensuring the team stays unblocked and delivers features in a timely manner. 

- English Proficiency (B2+): Ability to analyze international financial documents and technical documentation, and to prompt LLMs effectively.

  

Preferred: 

- Vector Search & RAG: Experience with one of the vector databases (e.g., Qdrant, Milvus) and Retrieval-Augmented Generation workflows. 

- Async & Concurrency: Experience with asynchronous programming (asyncio) and concurrency, specifically for efficient scraping.

 

Nice to have: 

- LLM Optimization: Experience with Batch LLM APIs, concurrent requests, and caching strategies to optimize costs and latency.

- PostgreSQL: Specific experience with Postgres.

- Frontend Basics: Familiarity with JavaScript or TypeScript for building internal GUIs.

- Domain Expertise: Background in Fintech, Regtech, or TaxTech, with knowledge of local regulations (e.g., Polish VAT or KSeF). 

Tech stack

- English: B2
- Web scraping: advanced
- Python: advanced
- Regex: regular
- LLM / OpenAI API: regular
- PDF handling: junior
- Database fundamentals: junior
