As a Data Engineer, you will play a crucial role in building and optimizing our data pipelines, which are central to our platform's reliability and performance. You will be responsible for architecting and maintaining efficient data streams and integrating new technologies, with a focus on complex event processing (CEP), to support our strategic goals and build a robust data engineering ecosystem.
Data Engineering & Stream Processing
- Architect, develop, and maintain scalable data pipelines using AWS services such as Kinesis, Firehose, Glue, Lambda, and Redshift; a minimal Kinesis producer sketch follows this list.
- Implement and optimize stream processing solutions with Apache Flink and Apache Kafka to support real-time data ingestion, analytics, and complex event processing (CEP).
- Design and build CEP systems that detect and respond to patterns of interest in real-time data streams; see the Flink CEP sketch after this list.
- Develop, optimize, and maintain the streaming pipeline to ensure efficient, reliable, and scalable data processing.
- Ensure data quality and reliability by implementing robust data validation, error handling, and monitoring frameworks; a side-output validation sketch follows this list.
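
For a flavor of the ingestion work, here is a minimal Kinesis producer sketch in Kotlin using the AWS SDK for Java v2; the stream name, partition key, and payload are hypothetical placeholders, not our actual configuration:

```kotlin
import software.amazon.awssdk.core.SdkBytes
import software.amazon.awssdk.regions.Region
import software.amazon.awssdk.services.kinesis.KinesisClient
import software.amazon.awssdk.services.kinesis.model.PutRecordRequest

fun main() {
    // Region and stream name are placeholders; real values come from deployment config.
    val client = KinesisClient.builder().region(Region.US_EAST_1).build()

    val request = PutRecordRequest.builder()
        .streamName("events")            // hypothetical stream name
        .partitionKey("user-42")         // routes this record to a shard
        .data(SdkBytes.fromUtf8String("""{"type":"click","userId":42}"""))
        .build()

    val response = client.putRecord(request)
    println("Wrote to shard ${response.shardId()} at sequence ${response.sequenceNumber()}")
    client.close()
}
```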
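Complex event processing is the centerpiece of the role. The sketch below shows the general shape of a Flink CEP job in Kotlin; the `LoginEvent` type, the brute-force rule (three consecutive failed logins within a minute), and the inline test source are illustrative assumptions, not our production logic:

```kotlin
import org.apache.flink.cep.CEP
import org.apache.flink.cep.functions.PatternProcessFunction
import org.apache.flink.cep.pattern.Pattern
import org.apache.flink.cep.pattern.conditions.SimpleCondition
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment
import org.apache.flink.streaming.api.windowing.time.Time
import org.apache.flink.util.Collector

// Hypothetical event type; the real schema comes from the pipeline.
data class LoginEvent(val userId: String, val success: Boolean)

fun main() {
    val env = StreamExecutionEnvironment.getExecutionEnvironment()

    // Toy inline source; production reads from Kafka or Kinesis.
    val logins = env.fromElements(
        LoginEvent("alice", false),
        LoginEvent("alice", false),
        LoginEvent("alice", false)
    )

    // Pattern of interest: three consecutive failed logins within one minute.
    val pattern = Pattern.begin<LoginEvent>("fail")
        .where(object : SimpleCondition<LoginEvent>() {
            override fun filter(e: LoginEvent) = !e.success
        })
        .times(3).consecutive()
        .within(Time.minutes(1))

    val alerts = CEP.pattern(logins.keyBy { it.userId }, pattern)
        .process(object : PatternProcessFunction<LoginEvent, String>() {
            override fun processMatch(
                match: MutableMap<String, MutableList<LoginEvent>>,
                ctx: Context,
                out: Collector<String>
            ) {
                out.collect("possible brute force for user ${match["fail"]!!.first().userId}")
            }
        })

    alerts.print()
    env.execute("cep-sketch")
}
```

The same `Pattern` API extends to richer sequences (`followedBy`, optional steps, timeout handling) as the patterns of interest grow.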
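One idiomatic way to meet the validation requirement is Flink's side outputs: valid records continue downstream while rejects land in a dead-letter stream that monitoring can watch. A minimal sketch, with a toy JSON-shape check standing in for real schema validation:

```kotlin
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment
import org.apache.flink.streaming.api.functions.ProcessFunction
import org.apache.flink.util.Collector
import org.apache.flink.util.OutputTag

// Anonymous subclass so Flink can capture the tag's type information.
val invalidTag = object : OutputTag<String>("invalid-records") {}

fun main() {
    val env = StreamExecutionEnvironment.getExecutionEnvironment()
    val raw = env.fromElements("""{"id":1}""", "not-json", """{"id":2}""")

    val validated = raw.process(object : ProcessFunction<String, String>() {
        override fun processElement(value: String, ctx: Context, out: Collector<String>) {
            // Toy rule: records must look like JSON objects; real validation
            // would check schema, types, and required fields.
            if (value.trim().startsWith("{") && value.trim().endsWith("}")) {
                out.collect(value)            // valid records continue downstream
            } else {
                ctx.output(invalidTag, value) // rejects go to the dead-letter stream
            }
        }
    })

    validated.print()                           // main output
    validated.getSideOutput(invalidTag).print() // dead-letter output for inspection/alerting
    env.execute("validation-sketch")
}
```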
Monitoring, Observability & Optimization
- Use Apache Flink's real-time stream processing and monitoring capabilities to handle complex event processing and data transformations effectively.
- Use tools like Prometheus and Grafana for observability into the health and performance of the data pipeline; a Flink-to-Prometheus metrics sketch follows this list.
- Continuously monitor, troubleshoot, and optimize data pipeline performance to handle billions of events per month.
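
As a sketch of how pipeline health becomes observable, the operator below registers a Flink counter; assuming the flink-metrics-prometheus reporter is enabled on the cluster, Prometheus can scrape it and Grafana can chart it. The class and metric names are hypothetical:

```kotlin
import org.apache.flink.api.common.functions.RichMapFunction
import org.apache.flink.configuration.Configuration
import org.apache.flink.metrics.Counter

// Counts records flowing through this operator. With the Prometheus metrics
// reporter configured, the counter is exposed for scraping automatically.
class CountingMapper : RichMapFunction<String, String>() {
    private var eventsProcessed: Counter? = null

    override fun open(parameters: Configuration) {
        eventsProcessed = runtimeContext.metricGroup.counter("eventsProcessed")
    }

    override fun map(value: String): String {
        eventsProcessed?.inc()
        return value // pass-through; this operator only observes
    }
}
```

Attaching it to a pipeline is a one-liner, e.g. `stream.map(CountingMapper())`.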
Collaboration & Integration
- Work closely with cross-functional teams, including front-end developers and data scientists, to build a robust data platform that meets current and future business needs.
- Participate in the design and architecture of scalable data systems, integrate new data sources, and optimize existing data processes.
- Write production-level code in Kotlin and Python to build data processing applications, automation scripts, and CEP logic.