Senior Site Reliability Engineer
DevOps
Egnyte Poland

Poznań
Type of work: Undetermined
Experience: Senior
Employment Type: B2B
Operating mode: Remote

Egnyte is the secure multi-cloud platform for content security and governance that enables organizations to better protect and collaborate on their most valuable content. Established in 2008, Egnyte has democratized cloud content security for more than 22,000 organizations, helping customers improve data security, maintain compliance, prevent and detect ransomware threats, and boost employee productivity on any app, any cloud, anywhere. For more information, visit www.egnyte.com.

Tech stack

    Linux: master
    Python / Golang: advanced
    Terraform: advanced
    Google Cloud Platform: advanced
    Jenkins: advanced
    Reliability Engineering: advanced
    Kubernetes: regular
    Prometheus: regular
    Puppet: regular
    Tomcat: regular

Job description

Online interview


Egnyte, headquartered in the heart of Silicon Valley, is a provider of enterprise content governance and collaboration software. With thousands of companies relying on our platform in their everyday processes, Egnyte needs to be available, stable, and performant. Currently, we are looking for an experienced engineer to join our SRE team.

The opportunity:

You will be ensuring reliability for large-scale software: we’re talking 16k+ customers, over 6000 instances across geo-distributed data centers and cloud providers, and an average of 2k API requests per second according to New Relic. For us, people who own their work from start to finish are integral to Egnyte’s success. Our engineers are part of the whole process: from design through coding and testing to deployment, and back again for further iterations. We are looking for an experienced engineer who is eager to apply software development approaches to operations. You can, and will, touch every level of the infrastructure depending on the day and the project you are working on. This role requires you to take on complex problems and execute end-to-end solutions. You should also feel comfortable mentoring and sharing your knowledge with the rest of the team and new hires.

Your day-to-day at Egnyte:

  • Drive focused initiatives that improve operational efficiencies, reliability, and scalability of the platform and its applications
  • Participate in big projects like migrating solutions from self-hosted environments to the cloud, from virtual machines to Kubernetes, from monolith to microservices
  • Proactively propose and implement automation and observability solutions focusing on improving our core business
  • Address performance challenges, optimize and fine-tune production environments
  • Apply SRE best practices when making and documenting improvements to the infrastructure

About you:

  • 5+ years in an SRE/SysAdmin/DevOps/NOC, software development, or equivalent role
  • Experience deploying and maintaining Java and Python apps on application and web servers such as Tomcat and Nginx
  • Good coding skills in Python and Golang. Ability to solve complex problems with simple, elegant, and optimized code
  • Good knowledge of both self-hosted and cloud environments (preferably the Google Cloud Platform)
  • Troubleshooting skills to hunt down the root causes of issues and persistence in preventing them from happening again
  • Practical knowledge of CI/CD solutions: GitLab CI or similar (Jenkins, Travis CI, CircleCI, etc. preferred)
  • Practical knowledge of containers and container orchestration (Docker, Kubernetes)
  • Willingness to share knowledge and mentor more junior team members while continuing to learn and gain new skills
  • Good English skills (C1+) to effectively communicate about technical matters
  • Working knowledge of Terraform, Puppet, Ansible, Prometheus

Bonus skills:

  • Work experience as a software developer or architect
  • Practical and active knowledge of other programming languages, especially Java and Ruby
  • GCP, CCNA, RHCE, or equivalent certification
  • Active use of and contribution to open source projects
  • Experience with Linux HA solutions such as HAProxy, LVS, Corosync & Pacemaker, etc.
  • Experience with message brokers (RabbitMQ, Kafka or others) and databases (MySQL or others)
  • Experience with Big Data solutions
  • Track record of acting as Incident Commander for large-scale incidents
  • Operational knowledge of the ELK stack