Site Reliability Engineer
DevOps
DataArt

Lublin
Type of work: Undetermined
Experience: Junior
Employment Type: B2B
Operating mode: Office

Tech stack

    DevOps: junior
    Java: junior
    SQL: junior
    Linux: junior

Job description

About the vacancy

Our client is one of the biggest online retailers worldwide, with an annual revenue of £1 billion. Over the years, we have helped the client develop web portals, mobile apps, delivery control systems, staff management tools, data storage and much more. The systems we've built together operate 24/7 and contribute to the client's success.

Site Reliability Engineering is a new role, first introduced by Google, that combines the skills of developers and operations engineers to deliver more reliable, scalable software. The goal is to analyze a diverse set of applications (primarily built using Java, Oracle, AWS, Google Cloud services and a number of other technologies) and bind them into a reliable, self-healing suite that meets defined reliability requirements. This requires proactive work to ensure observability, analyze potential bottlenecks and suggest fixes before they cause production incidents.

This position may be of interest to DevOps engineers who would like to get closer to the code or gain a valuable specialization focused on the JVM stack. It may also appeal to developers who are interested in how large-scale systems operate and what happens to the code after it goes live.

Responsibilities

  • Analyze and improve the availability, latency, performance, and efficiency of the applications
  • Proactively support production applications (both during and outside office hours) across a range of domains; these are mainly written in Java and use Oracle databases
  • Improve the monitoring and alerting of the applications
  • Capacity planning and provisioning
  • Improve and standardize build pipelines; identify areas of manual toil and reduce them through automation
  • Consult on reliability and scalability in the development of new applications
  • Work together with teams in other departments to find solutions
  • Conduct periodic on-call duties

Must have

  • Understanding of the software development life cycle and experience with project management tools like Jira
  • Good understanding of Linux and UNIX-based systems
  • Ability to build and run SQL queries
  • Good English communication and problem-solving skills

Would be a plus

  • Experience with real-time monitoring/alerting
  • Knowledge of Java and the Spring framework
  • Understanding of NoSQL databases
  • Understanding of virtualization and containers (Docker, Kubernetes)
  • Knowledge of scripting languages (Python, Ruby)