Data/MLOps Engineer – CT&C
We are looking for an experienced and passionate Data/MLOps Engineer to join our CT&C Engineering team. In this role, you will bridge the gap between Data Science and Production Engineering, ensuring that machine learning solutions are scalable, reliable, secure, and production-ready.
You will play a key role in designing, building, maintaining, and optimizing our data platforms and ML infrastructure, enabling efficient data ingestion, transformation, storage, model deployment, and real-time analytics.
This position requires a strong understanding of machine learning concepts, hands-on MLOps expertise, and solid engineering skills across cloud platforms, data processing frameworks, and automation tooling.
Key Responsibilities
ML & Data Infrastructure
Manage and optimize the end-to-end machine learning lifecycle, including automated training, deployment, monitoring, and versioning.
Build and support core MLOps capabilities such as Feature Stores, Experiment Tracking platforms, and Model Registries.
Provision and manage scalable cloud infrastructure using Infrastructure as Code (IaC) solutions such as Terraform or AWS CloudFormation.
Design and implement robust CI/CD/CT (Continuous Training) pipelines to enable reliable and repeatable production releases.
Collaborate closely with Data Scientists to productionize machine learning models and workflows.
Data Engineering & Pipeline Optimization
Design and develop high-volume data ingestion and processing pipelines using Apache Spark, PySpark, and Python.
Build scalable ETL/ELT solutions supporting advanced analytics and machine learning workloads.
Implement optimized data models and storage strategies to support low-latency model inference and high-performance analytics.
Integrate automated data quality validation, monitoring, and observability capabilities across data platforms.
Governance, Monitoring & Security
Implement proactive monitoring for model performance, model drift, data quality issues, and system latency.
Ensure complete reproducibility through robust versioning of data, code, models, and artifacts.
Apply security best practices across the ML lifecycle, including access management, data privacy, and compliance requirements.
Support operational excellence through incident management, troubleshooting, and continuous improvement initiatives.
Agile Delivery & Collaboration
Work within Agile delivery teams, participating in sprint planning, backlog refinement, daily stand-ups, and retrospectives.
Translate business and data science requirements into scalable technical solutions.
Collaborate with Product Owners, Data Scientists, Data Engineers, and Platform Teams to deliver production-grade ML solutions.
Create and maintain technical documentation covering architecture, workflows, pipelines, and operational procedures.
What We're Looking For
Strong Python development experience
Hands-on experience with Apache Spark and PySpark
Solid understanding of machine learning lifecycle management and MLOps best practices
Experience with AWS services, particularly Amazon SageMaker, AWS Lambda, and AWS CDK
Experience building CI/CD pipelines for data and ML workloads
Strong SQL skills
Experience designing and implementing ETL/ELT pipelines
Knowledge of PyTorch and other machine learning frameworks
Experience with Infrastructure as Code (Terraform and/or CloudFormation)
Understanding of monitoring, observability, and production support practices
Experience working in Agile environments
Ability to design and implement scalable ML solutions using PySpark and Amazon SageMaker
Ability to balance software engineering best practices with practical machine learning implementation
Commitment to driving operational excellence across the entire ML lifecycle
Experience with Feature Stores and Model Registry platforms
Experience implementing Continuous Training (CT) pipelines
Knowledge of MLOps governance frameworks
Experience with real-time streaming architectures
Exposure to large-scale cloud-native data platforms