AI Data Engineer (Python & LLMs)

3 372.04 - 3 934.04 USD net per month - B2B
AI/ML

Książęca 4, Warszawa

Tax Insight

Full-time
B2B
Mid
Hybrid

Job description

We are automating tax reporting and accounting processes using AI. You will build pipelines that scrape data from complex external sources and extract high-precision structured data from financial documents (PDFs, invoices, tax forms) using Python and LLMs. 

You will join a tight-knit, fast-moving engineering team in a scaling company, where your code will have an immediate impact on our products. 

You will bridge the gap between law and code, working directly with tax and legal experts to transform complex tax regulations into precise, compliant algorithmic solutions. 

 

Requirements: 

Core (Must-Haves): 

- Python Mastery: 2+ years of professional experience writing clean, maintainable code. 

- Advanced Web Scraping: Proficiency with Playwright (preferred) or Selenium, including experience bypassing anti-bot measures (e.g., CAPTCHAs, rate limits, fingerprinting). Some experience with the BeautifulSoup and requests libraries is also required.

- Data Extraction: Strong hands-on experience with Regex and other techniques for transforming messy, real-world data (raw HTML, malformed JSON, OCR text) into structured formats.

- Data Cleaning: Proficiency with Regex, Pandas, and NumPy for data cleaning and preprocessing. 

- Database Fundamentals: Solid understanding of SQL or NoSQL databases. 

- LLM Integration: Experience in prompt engineering for LLM APIs, including few-shot prompting, defining output schemas, and handling and parsing responses programmatically.

- PDF Handling: Hands-on experience with libraries like pdfplumber, docling, or pymupdf to extract data from text-based PDF files of varying quality.

- Mentorship: Ability to perform high-quality code reviews and guide junior developers when necessary. 

- Task Delegation: Ability to break down complex architectural features into clear, manageable tasks for the junior developer to execute. 

- Task Planning: Ability to decompose project stages into weekly sprints, ensuring the team stays unblocked and delivers features in a timely manner. 

- English Proficiency (B2+): Ability to analyze international financial documents and technical documentation, and to prompt LLMs effectively.

  

Preferred: 

- Vector Search & RAG: Experience with a vector database (e.g., Qdrant, Milvus) and Retrieval-Augmented Generation workflows.

- Async & Concurrency: Experience with asynchronous programming (asyncio) and concurrency, specifically for efficient scraping.

 

Nice to have: 

- LLM Optimization: Experience with batch LLM APIs, concurrent requests, and caching strategies to optimize cost and latency.

- PostgreSQL: Specific experience with Postgres.

- Frontend Basics: Familiarity with JavaScript or TypeScript for building internal GUIs.

- Domain Expertise: Background in Fintech, Regtech, or TaxTech, with knowledge of local regulations (e.g., Polish VAT or KSeF). 

Tech stack

- English: B2
- Python: advanced
- Web scraping: advanced
- LLM / OpenAI API: regular
- Regex: regular
- Database fundamentals: junior
- PDF handling: junior

Published: 17.02.2026
