Senior Machine Learning Engineer (AI & Cloud Ops)
We are looking for an experienced Senior Machine Learning Engineer for a project for our client, one of the major retail companies. The project focuses on building enterprise-scale intelligent agents using the Google Cloud Agent Development Kit (ADK) and deploying AI-driven solutions on Vertex AI. You will join a highly specialized team and be instrumental in supporting a massive digital and physical footprint that serves millions of daily visitors, optimizing everything from advanced e-commerce capabilities to complex supply chain logistics.
Responsibilities:
Design, implement, and maintain secure and scalable AI agent infrastructure on Google Cloud using Vertex AI and Agentspace.
Develop and enforce identity and access management (IAM) frameworks, including role-based access control (RBAC), least-privilege configurations, and cross-cloud alignment.
Drive CI/CD automation, implement automated compliance gates, and maintain continuous monitoring workflows.
Review and stage custom Model Context Protocol (MCP) actions, ensuring safe, secure, and strictly compliant model execution boundaries.
Take ownership of AI Ops, automation, monitoring, and complete model lifecycle management.
Minimum requirements:
Strong programming skills in Python.
Proven hands-on experience with Google Cloud Platform, specifically focused on Vertex AI and building intelligent agents (Google Cloud ADK).
Deep expertise in Infrastructure as Code (Terraform) and CI/CD workflows to ensure tightly controlled deployments.
Solid understanding of enterprise security concepts: workload identity federation, short-lived tokens, OPA (Open Policy Agent), VPCs, and private endpoints.
Practical knowledge of observability, monitoring, and auditing tools (e.g., OpenTelemetry, Grafana, LangSmith).
Understanding of policy-as-code and real-time policy enforcement mechanisms.
Experience working with Elasticsearch and data workflows (e.g., Google Pub/Sub, Bigtable).
Excellent decision-making skills and technical credibility to autonomously drive complex operational processes.
Would be a plus:
Hands-on experience with serverless architectures and containerized environments (Kubernetes / GKE).
We offer:
Opportunity to work on bleeding-edge projects
Work with a highly motivated and dedicated team
Competitive salary
Flexible schedule
Benefits package: medical insurance, sports
Corporate social events
Professional development opportunities
Well-equipped office
About us:
Grid Dynamics (NASDAQ: GDYN) is a leading provider of technology consulting, platform and product engineering, AI, and advanced analytics services. Fusing technical vision with business acumen, we solve the most pressing technical challenges and enable positive business outcomes for enterprise companies undergoing business transformation. A key differentiator for Grid Dynamics is our 8 years of experience and leadership in enterprise AI, supported by profound expertise and ongoing investment in data, analytics, cloud & DevOps, application modernization and customer experience. Founded in 2006, Grid Dynamics is headquartered in Silicon Valley with offices across the Americas, Europe, and India.