Company Overview:
PandaDoc empowers more than 60,000 growing organizations to thrive by taking the work out of document workflow. PandaDoc provides an all-in-one document workflow automation platform that helps fast-scaling teams create, manage, and sign digital documents faster, including proposals, quotes, contracts, and more. For more information, please visit https://www.pandadoc.com.
Site Reliability Engineers are mission-critical to PandaDoc's success. We ensure that PandaDoc services are reliable and serve our customers with minimal downtime. Site Reliability Engineers are the driving force that keeps the system running smoothly and helps identify bottlenecks before they become issues. We focus on metrics, observability, and eliminating manual operations work through automation.
In this role, you will:
- Build software, frameworks, and tools required for reliable operations of PandaDoc services
- Manage the stability and operation of our most critical production applications through application reviews, capacity planning, and performance tuning
- Constantly develop automations/tooling for better platform reliability/availability
- Work with cutting-edge distributed systems including, but not limited to, Kubernetes, RabbitMQ, Kafka, Redis, and Elasticsearch
- Collaborate with other engineers on the team, as well as cross-functionally, to foster solid engineering principles and represent our engineering values
- Participate in various POCs on new projects and frameworks being evaluated for the product/platforms
- Improve our observability as both a developer and maintainer of systems and frameworks and a mentor to our product development teams
About you:
- 3+ years of experience using higher-level languages (e.g. Python or Go)
- Strong experience configuring and maintaining observability tools (Prometheus, Grafana, Kibana)
- Experience in supporting critical production services
- You enjoy hands-on troubleshooting in a distributed Linux systems environment and are comfortable tracing problems through applications, systems, and networks
- Strong experience with AWS and Kubernetes
- Experience using industry-standard DevOps tools such as GitLab, Jenkins, and Terraform
- You enjoy communication and knowledge sharing on all things reliability
- You act like an owner and strive to do work you're proud of, both technically and in your team interactions
Benefits:
- An honest, open culture that emphasizes feedback and promotes professional and personal development
- An opportunity to work from anywhere — our team is distributed worldwide, from Lisbon to Manila, from Florida to California
- 6 self-care days
- A competitive salary
- And much more!