We are a diverse, fully-distributed team from around the world, working toward a shared vision to raise the standard of trust online.
Join us as a Senior Site Reliability Engineer to help build and maintain world-class infrastructure to meet the needs of millions of users.
At DuckDuckGo, we currently serve 100+ million search queries a day (nearly doubling each year), anonymously leverage over 400 upstream sources for results, and serve more than 1 PB of proxied traffic per month.
As part of our growing team, you will be dedicated to improving and scaling the reliability of our end-to-end infrastructure. We dive deep into complex operational challenges, including software, systems, automation, and process analysis. We are looking for candidates who can read, write, troubleshoot, and deploy all types of software as we face unique challenges in privacy and scale.
We empower our team to be self-directed and self-motivated in their work. If you'd thrive in that environment, and our core values resonate with you -- build trust, question assumptions, and validate direction -- you'll fit right in!
What You Will Do:
- Lead projects from proposal through postmortem, assessing vague problems, proposing high-impact solutions, and executing them against a set of success criteria.
- Develop effective tools, services, alerts, and responses to identify and address reliability risks.
- Work closely with search engineers to triage production issues and determine appropriate remediation, including code changes and performance considerations.
- Participate in our on-call rotation; triage and address reliability issues that come up in production.
- Help determine the future technical direction of our deployment, with the aim of improving reliability and performance.
What We Are Looking For:
- At least 7 years of engineering experience, with 5+ years focused on tackling the reliability challenges of large-scale deployments and high-traffic, distributed systems
- Experience with production troubleshooting across distributed systems, code, storage, networking, and operating systems
- Moderate-to-advanced programming experience, preferably in a high-level language like Perl or Python
- Experience participating in a 24x7 on-call rotation for a large-scale deployment
- Experience configuring and troubleshooting Linux and NGINX
- Strong organizational skills; you have an eye for detail and are not afraid to use it!
- Effective project management skills; you have successfully launched projects from inception to production
- Strong communication skills; you clearly articulate your recommendations and decisions, both verbally and in writing
- Comfortable providing feedback to an array of stakeholders, both internal and external
DuckDuckGo Culture
For over a decade, we've built a unique culture that helps us continuously improve job satisfaction and productivity. Want to know more? Check out DuckDuckGo Culture: How We Work for an overview of how we collaborate worldwide.
Your Well-Being
Maintaining satisfaction at work is one of our company objectives, just like maintaining and improving our private search engine. Our Team Member Support Guide explains how we make you our top priority.
Hiring Process
Hiring works best when it's a two-way street. Learn how we help you get to know DuckDuckGo and envision your future role here. Find out more about how we hire.