System/Cloud Administrator
The System/Cloud Administrator will be in charge of all aspects of the configuration and maintenance of our
computer systems, both in the cloud and in the local data center. The successful candidate will focus on
improving the effectiveness, robustness, stability and performance of these systems. In addition, they will be
responsible for maintaining and improving the security posture of the systems, continuously
monitoring system health, and troubleshooting issues that arise.
Responsibilities
Responsibilities span cloud, data center, monitoring, and user support, and include, but are not limited to:
- Design, build and support a cloud infrastructure that automates the provisioning and monitoring of VMs on a cloud platform (GCP, AWS, etc.)
- Secure user access and data transfer on cloud VMs/instances
- Maintain the cloud infrastructure that supports development and implementation
- Optimize data storage across distributed systems (different cloud platforms, locally hosted servers)
- Maintain and improve the locally hosted compute farm to provide secure, robust and fast access for local users
- Build and upgrade the Docker system with GPU support
- Maintain authentication from within Docker containers
- Maintain the configuration of the host machine as well as the Docker system
- Troubleshoot open source and commercial applications running on cloud, local and personal machines
- Develop system backups and server restoration procedures (cloud and local)
- Oversee corporate IT security, forecasting and budgeting
Required Skills
- Advanced knowledge of Linux (e.g. building, maintaining, and troubleshooting systems)
- In-depth knowledge of popular cloud platforms (e.g. GCP, AWS)
- Strong shell scripting expertise
- Deep understanding of networking (troubleshooting, DNS, protocols, hardware and software)
- Hands-on experience with LDAP and/or Active Directory, as well as NFS
- Experience with Docker or other container technologies
- Experience with monitoring tools such as Nagios, Zabbix, Cacti or Ganglia
- Working knowledge of backup and disaster recovery solutions
- Excellent written and oral communication skills
- Strong problem-solving skills and the ability to work collaboratively with algorithm and application developers
- Ability to document system architecture and processes, as well as end-user instructions
Location: Campbell, CA and Warsaw, Poland