Site Operations Manager
About the role
We are looking for a Site Operations Manager to lead the day-to-day operations of a high-density data center environment supporting GPU infrastructure for AI workloads.
This role combines operational leadership with ownership of critical facility infrastructure, including power, cooling, hardware operations, and site reliability. You will be responsible for managing on-site teams, maintaining operational continuity, and ensuring the data center runs safely, efficiently, and with minimal downtime.
Responsibilities
Lead, mentor, and manage a 24/7 site operations team, including facilities technicians, operations engineers, and support staff
Oversee day-to-day site operations and ensure high availability of critical infrastructure, including power, cooling, networking, and hardware environments
Manage preventive and corrective maintenance programs for electrical and mechanical systems, including UPS, generators, PDUs, cooling systems, fire suppression, and building management systems
Monitor and improve operational metrics such as PUE, WUE, energy consumption, and overall site efficiency
Lead incident response, emergency procedures, root cause analysis, and post-incident improvement activities related to facility operations
Work closely with DevOps/SRE, Networking, and AI/ML teams to support hardware deployments, maintenance windows, and infrastructure growth
Manage relationships with external vendors and contractors, including service delivery, SLAs, spare parts availability, and on-site execution
Ensure compliance with safety standards, environmental regulations, internal procedures, and security requirements
Manage the site operations budget, control operating costs, and identify opportunities to improve processes, automation, and monitoring
Support data center commissioning, infrastructure upgrades, and expansion projects as the organization grows
Requirements
8+ years of experience in data center operations, critical facilities, or site operations management
At least 4 years of experience in a people management or supervisory role
Proven experience managing mission-critical environments with high availability requirements
Strong technical knowledge of electrical and mechanical infrastructure used in data centers
Experience with power systems, backup systems, cooling infrastructure, and facility maintenance processes
Experience managing 24/7 shift-based teams and supporting on-call or incident response models
Familiarity with DCIM, BMS, and real-time facility monitoring tools
Ability to coordinate cross-functional operational work in a fast-paced production environment
Good organizational, leadership, and communication skills
Communicative English proficiency or better is required, as you will be working in an international team
Nice to have
Experience in AI infrastructure, GPU-focused data centers, HPC environments, or hyperscale facilities
Experience with liquid cooling systems
Familiarity with NVIDIA GPU hardware, high-performance networking, or AI infrastructure operating requirements
Experience with high-density environments, including deployments above 50 kW per rack
Certifications such as CDCP, CDCS, CMRP, or similar data center / facilities credentials
Experience working with EPC contractors during construction, fit-out, or commissioning phases
University degree in Engineering, Facilities Management, or a related technical field
What we offer
Benefits package
Opportunity to lead operations in a modern, high-performance data center environment
Real impact on the reliability, efficiency, and operational standards of infrastructure supporting advanced AI workloads
Collaboration with experienced teams across infrastructure, engineering, and operations
A dynamic environment with room for ownership, process improvement, and professional growth