    SRE (Site Reliability Engineer)

    Type of work: Full-time
    Experience: Mid
    Employment type: B2B
    Operating mode: Remote

    Tech stack

      English: B2
      Ubuntu: advanced
      Linux: advanced
      Troubleshooting: advanced
      Networking: advanced
      bare-metal: regular

    Job description

    Online interview

    Location: Remote (Poland, Ukraine, or Romania)


    Point Wild helps customers monitor, manage, and protect against the risks associated with their identities and personal information in a digital world. Backed by WndrCo, Warburg Pincus, and General Catalyst, Point Wild is dedicated to creating the world’s most comprehensive portfolio of industry-leading cybersecurity solutions. Our vision is to become the go-to resource for every cyber protection need individuals may face, today and in the future.


    Join us for the ride!


    About the Role:

    We are seeking a highly motivated and skilled Site Reliability Engineer (SRE) to join our dynamic engineering team. The SRE will play a crucial role in maintaining the reliability, availability, and performance of our systems and applications. You will work collaboratively with development and operations teams to implement best practices, automate processes, and ensure that our infrastructure can scale seamlessly to meet business demands.


    Day to Day:

    • System Monitoring & Incident Response: Develop and implement monitoring tools to ensure system health. Respond to incidents, troubleshoot issues, and provide timely resolutions.
    • Automation & Infrastructure as Code: Design and implement automation solutions to manage infrastructure and application deployment using tools like Terraform, Ansible, or similar technologies.
    • Performance Optimization: Analyze system performance and capacity; implement improvements to enhance system reliability and efficiency.
    • Collaboration: Work closely with development teams to improve system design and deployment practices. Advocate for reliability improvements in the software development lifecycle.
    • Documentation & Reporting: Maintain thorough documentation of system architecture, processes, and incident response procedures. Provide regular reports on system performance and reliability metrics.
    • Recovery & Backup: Design and implement disaster recovery plans and ensure effective data backup solutions are in place.
    • Security Best Practices: Collaborate with security teams to ensure best practices are followed to protect systems and data.


    What you bring to the table:

    • Proven experience in a Site Reliability Engineering, DevOps, or related role.
    • Knowledge of cloud services (AWS, Azure, Google Cloud), containers (Docker), and container orchestration (Kubernetes).
    • Proficiency in scripting languages (Python, Bash, etc.), experience with CI/CD tools (Jenkins, GitLab CI/CD, etc.), and infrastructure-as-code tools (Terraform, Ansible).
    • 3+ years of proven experience with production monitoring using Prometheus, ELK, Grafana, and OpsGenie/PagerDuty.
    • 3+ years of experience in Linux system administration (preferably Ubuntu).
    • Solid understanding of networking, security, system architecture, and data center operations in a fast-paced, 24/7 production environment.
    • Strong understanding of networking concepts, protocols (TCP/IP, BGP, OSPF), and technologies (LAN, WAN, VPN) with proficiency in network monitoring tools and software.

    Salary: Undisclosed (B2B)
