Senior Data Engineer (AI Consumer Intelligence Platform)
Our client is hiring Data Engineers
Join a rapidly growing AI consumer intelligence platform delivering insights for the world's biggest brands.
Hiring Company Background
We're an AI-powered consumer intelligence platform that processes 50+ billion data points monthly - Google searches, social conversations, product reviews, and videos - to deliver actionable consumer insights for Fortune 500 brands in days instead of months. Our clients include global leaders in beverages, personal care, and consumer packaged goods.
The Role
As a Senior Data Engineer, you'll architect and scale production data pipelines that power our NLP and ML systems processing billions of multilingual data points daily.
Reporting to our newly appointed CTO, you'll own the complete data lifecycle—from ingestion and transformation through deployment and observability—while defining infrastructure standards for a growing engineering team.
This is a high-ownership role at an early stage: no legacy code politics and no entrenched hierarchies. You'll convert MVPs into scalable products, establish DataOps/DevOps standards, and design governance mechanisms that prevent technical debt. Your architectural decisions will directly impact how Fortune 500 companies access real-time consumer intelligence.
Tech Stack
- Core: Python, PySpark, SQL
- Cloud & Infrastructure: Azure ecosystem, Databricks
- Deployment: Kubernetes, containerization, observability tooling
- NLP/ML: Large Language Models, LLM APIs, spaCy/NLTK/CoreNLP/TextBlob
- Data: Robust pipelines for multi-language text at scale
What You'll Do
- Design, build, and maintain production data pipelines processing 10M+ text records daily across multiple languages
- Architect scalable NLP data infrastructure using PySpark, Databricks, and Azure services
- Integrate Large Language Model APIs into production pipelines for text analysis and enrichment
- Establish DataOps standards including CI/CD, testing frameworks, and deployment automation
- Implement observability and alerting for pipeline health, data quality, and system performance
- Collaborate with data scientists to productionize ML models and NLP systems
- Define data governance frameworks and quality SLAs for enterprise client delivery
- Mentor team members and contribute to technical hiring as the team scales
Required Qualifications
Experience
- 5+ years building and maintaining production ETL/ELT data pipelines
- 2+ years working with text/NLP data (tokenization, embeddings, multilingual processing)
- 3+ years shipping data products to production that serve active business users
Technical Skills
- Python: 4+ years in production environments (required)
- PySpark: 2+ years (or 1+ years of Spark combined with 4+ years of strong Python)
- SQL: 3+ years including complex queries and performance optimization
- Databricks: 1+ years production use (notebooks, Delta Lake, job scheduling)
- Cloud Platform: 2+ years with Azure (preferred) or equivalent AWS/GCP experience
- Containers/Kubernetes: Experience deploying containerized applications to Kubernetes
Education
Bachelor's degree in Computer Science, Data Science, Engineering, or related quantitative field.
Preferred Qualifications
- 3+ years Databricks experience, including Delta Lake architecture and Unity Catalog (strongly preferred)
- Azure ecosystem depth: Data Factory, Databricks, Blob Storage, DevOps
- LLM integration experience: OpenAI, Anthropic, or Azure OpenAI API integration in production
- LLM fine-tuning experience
- Experience with observability tools: DataDog, Grafana, or Azure Monitor
- Processing experience at scale: 1B+ records
- Multilingual text processing: 3+ non-English languages with Unicode and tokenization handling
- NLP libraries: familiarity with spaCy, NLTK, CoreNLP, or TextBlob
- Consumer insights, CPG/FMCG, or advertising technology experience
- Experience at early-stage, high-growth companies
What We Offer
- Competitive compensation (with performance bonus and equity opportunities)
- Fully remote (4+ hour overlap with U.S. Eastern Time desired)
- Modern stack: work with cutting-edge NLP/ML technologies at scale
- High ownership and impact: directly shape architecture decisions for a platform serving Fortune 500 clients
- Growth trajectory: join as a foundational team member with a clear path to technical leadership