Data Engineer (NLP) – Production Data Systems (Remote)
Kratos Growth's client is hiring Data Engineers (NLP) – Production Data Systems
Join a rapidly growing AI Consumer Intelligence Platform Delivering Insights for the World’s Biggest Brands
Hiring Company Background
Led by industry veterans from Unilever and Coca-Cola, our platform synthesizes massive-scale data (billions of Google searches, social conversations, product reviews, and videos) to deliver actionable consumer insights for Fortune 500 clients in days instead of months.
Our clients include global leaders in beverages, personal care, and consumer packaged goods.
With a newly appointed CTO building our engineering team, this is your opportunity to shape data engineering standards, define infrastructure architecture, and establish pipeline best practices at a high-growth company, while working remotely from anywhere in the world as we scale in 2026 and beyond.
No legacy code politics, no entrenched hierarchies, no technical debt from someone else's decisions that you're powerless to change.
This is your chance to build production data systems where your impact is immediate and visible.
Your Mission
As a Senior Data Engineer, you'll architect and scale production data pipelines that process hundreds of billions of data points for NLP and ML systems.
You'll own the complete data lifecycle—from ingestion and transformation through deployment and observability—while shaping our infrastructure strategy to optimize delivery speed, cost efficiency, and data quality.
This role goes beyond implementing prototypes. You'll convert MVPs into scalable products, establish DataOps/DevOps standards, and design governance mechanisms to eliminate technical debt across the stack. Your decisions will directly impact how Fortune 500 companies access billions of data points in real time.
What You'll Work With
• Core Stack: Python, PySpark, SQL
• Cloud & Infrastructure: Azure ecosystem, Databricks (deep expertise required)
• Production Deployment: Kubernetes, containerization, observability tools
• NLP/ML: Large Language Models, LLM API integration, modern NLP libraries (spaCy, NLTK, CoreNLP, TextBlob)
• Data Engineering: Building robust, scalable pipelines for multi-language text processing
What We're Looking For
• 5+ years of data engineering experience, with a strong NLP focus
• Strong experience solving real business problems through data engineering and data science
• Python expertise (PySpark proficiency preferred, or solid Python + Spark fundamentals)
• Production deployment experience in enterprise environments (containers, Kubernetes)
• Track record of shipping data products that customers actually use
• Understanding of non-English text processing
• Experience with Large Language Models in coding environments, including pipeline integration and fine-tuning (a plus)
• Computer Science degree
What We Offer
• Competitive compensation
• Fully remote: Work from anywhere
• Cutting-edge problems: Real-world challenges at scale with the latest ML/NLP technologies
• Ownership & impact: Shape architecture decisions for a fast-growing platform
• Autonomy: We trust you to deliver exceptional work on your terms