Purpose of Role:
As a Platform Engineer, you will be part of a global team responsible for providing operational support for the Platform and for ensuring that the necessary tools, software, and processes are in place to allow the system to scale as operational load increases. You will be part of the Technical Services Department, which serves as the last line of escalation for technical and support issues escalated by Support Teams globally. You will be highly analytical and an excellent problem solver, able to uncover not only the problem but also its root cause.
You will have technical skills and be an outstanding communicator, working with the team on issues at times and also coaching others throughout the company.
Key Responsibilities:
- Perform initial troubleshooting for all services, including any necessary rollback and restore, to maintain high platform availability
- Maintain a good architectural understanding of deployed services in order to provide early feedback to the engineering team (via the Production Readiness checklist)
- Create, maintain, and enhance monitoring, alerting and debugging capabilities
- Collaborate with Engineering and your Technical Escalation team colleagues to implement performance improvements identified by tracking metrics such as service latency and CPU utilization
- Maintain effective communication with the necessary stakeholders and support layers, detailing any changes to services or functionality and when those changes are scheduled
- Collaborate with the Engineering team on measuring SLIs and SLOs
- Provide the necessary operational support for multiple platforms (both on-prem and on AWS), and participate in periodic 24x7 on-call duties
- Manage the capacity and performance of environments
- Take responsibility for the availability, latency, performance, efficiency, change management, monitoring, emergency response, and capacity planning of your services
Essential Skills and Qualifications:
- Ability to understand and troubleshoot customer issues
- Experience with AWS cloud infrastructure (EC2, CloudFormation, Lambda, DynamoDB, etc.)
- Some CI/CD experience with Jenkins or Bamboo
- Understanding of web security and DevSecOps principles
- Knowledge of microservice implementation technologies (both serverless and containers)
- Strong communication and collaboration skills
- Ability to work across global teams, with different cultures and across different time zones
- Good written and spoken English skills