Senior Big Data Engineer
We are seeking a Senior Big Data Engineer to join our team in building next-generation reporting pipelines and infrastructure capable of processing massive datasets. In this role, you will design, code, and verify scalable data jobs using the latest versions of Scala and Spark, ensuring the accuracy of the algorithms that drive strategic decisions for our stakeholders. You will work within a distributed team of leading big data engineers across Europe and overseas, taking ownership of both the data logic and the underlying infrastructure. You will be at the forefront of large-scale data engineering, working for a global leader in the mobile app ecosystem, which operates the world's largest application marketplace, to redefine how high-volume data insights are generated.
Responsibilities:
Design, develop, and optimize scalable big data processing pipelines using the latest versions of Scala and Apache Spark.
Implement and verify complex data jobs to guarantee that statistical algorithms return precise and reliable metrics for global stakeholders.
Work closely with infrastructure and cloud environments, managing, monitoring, and optimizing Spark workloads on Kubernetes clusters.
Orchestrate complex data workflows and scheduling pipelines using Apache Airflow.
Build and maintain robust data streaming architectures using Spark Streaming and Apache Kafka.
Collaborate seamlessly within a distributed international team, interfacing with top-tier big data engineers in Europe and overseas.
Design and query complex data models across diverse database engines including Oracle, Postgres, Teradata, and Cassandra.
Proactively integrate DevOps and CI/CD practices into the data lifecycle to manage backend infrastructure efficiently.
Minimum requirements:
Strong software development experience with Scala and deep proficiency in Apache Spark (including the DataFrame/SQL API).
Practical experience with Spark Streaming for real-time data processing.
Good understanding of Kubernetes, specifically how Spark runs on Kubernetes infrastructure.
Hands-on expertise with distributed storage and data lakehouse table formats such as Hive and Iceberg.
Proven experience in workflow orchestration using Apache Airflow.
Strong proficiency in SQL, with advanced knowledge of Spark SQL (including window functions and complex group-by cases).
Solid understanding of distributed computing technologies, architectural patterns, and hands-on experience with Hadoop.
Would be a plus:
Practical experience with Apache Flink for stream processing.
Familiarity with, or hands-on experience using, Claude (Anthropic) for engineering workflows or automation.
Proven track record of implementing Data Lakes, Data Warehousing, or advanced analytics systems.
Strong background in DevOps, infrastructure management, and building automated CI/CD pipelines at scale.
We offer:
Opportunity to work on bleeding-edge projects
Work with a highly motivated and dedicated team
Competitive salary
Flexible schedule
Benefits package: medical insurance, sports
Corporate social events
Professional development opportunities
Well-equipped office
About us:
Grid Dynamics (NASDAQ: GDYN) is a leading provider of technology consulting, platform and product engineering, AI, and advanced analytics services. Fusing technical vision with business acumen, we solve the most pressing technical challenges and enable positive business outcomes for enterprise companies undergoing business transformation. A key differentiator for Grid Dynamics is our 8 years of experience and leadership in enterprise AI, supported by profound expertise and ongoing investment in data, analytics, cloud & DevOps, application modernization and customer experience. Founded in 2006, Grid Dynamics is headquartered in Silicon Valley with offices across the Americas, Europe, and India.