Senior Data Engineer
Overview
We are looking for a Senior Data Engineer to support a long-term engagement focused on simulation data processing for Autonomous Vehicle (AV) development. This role sits at the intersection of large-scale data engineering, real-world sensor analysis, and safety-critical automotive systems.
You will work with high-volume multimodal sensor data collected from a test AV fleet and help transform raw inputs into simulation-ready datasets used to develop and validate advanced autonomous driving features. The work directly contributes to next-generation vehicle safety in collaboration with a leading global automotive OEM.
What You’ll Work On
You’ll handle complex real-world driving data and scenarios such as:
Obstacle detection
Path planning
Complex traffic environments (e.g., tunnels, unusual vehicles, temporary network issues)
Edge cases critical for safe autonomous driving
Sensor inputs include 8–12 cameras, LiDAR, and radar, generating up to about 1 TB of data per hour.
Key Responsibilities
Analyze large-scale real-world sensor datasets to identify edge cases (e.g., hard braking, close-proximity vehicles, unusual road behavior)
Design and write advanced SQL, Python, and Spark/PySpark queries for data filtering, transformation, and preparation
Work with internal platforms for data search, labeling, and auto-labeling workflows
Process structured and semi-structured data, including object detection and perception outputs
Select and prepare relevant data for AV simulation environments and ML pipelines
Contribute to improvements in data discovery and curation processes
Build and maintain data mining scripts and ETL pipelines
Develop internal tools to enhance analytics capabilities and streamline engineering workflows
Collaborate closely with engineers and researchers to support development and validation of safety-critical AV features
Requirements
4+ years of experience in Data Engineering or a similar role
Strong software engineering mindset (not a traditional DBA profile)
Advanced SQL (ability to write complex queries)
Advanced Python
Advanced Spark / PySpark
Hands-on experience with Databricks
Experience working with large-scale or complex datasets
Experience in advanced data analytics, including time series analysis
Understanding of ML workflows (data preparation for training/validation)
Solid understanding of data pipelines and distributed data processing
Availability for a daily 1-hour overlap with the US team (around 6 PM CST)
Nice to Have
Experience in the Autonomous Vehicle (AV) or ADAS domain
Exposure to sensor data (camera, LiDAR, radar)
Understanding of real-world driving edge cases
University degree in Computer Science or a related technical field

Spyrosoft
Spyrosoft is a leading technology company specializing in software development and IT services. The company provides a wide range of expertise including artificial intelligence, cloud services, cybersecurity, digital pro...