Senior Site Reliability Engineer

DevOps

Exadel Poland sp. z o.o.

Warszawa

Type of work: Undetermined
Experience: Senior
Employment Type: B2B, Permanent
Operating mode: Remote

Tech stack

Visualization tools: regular
Code Management Tools: regular
Remedyforce: regular
PagerDuty: regular
English: regular
Teamwork: regular
Problem Solving: regular
Communication Skills: regular

Job description

Online interview

Friendly offer

We're currently looking for a talented Site Reliability Engineer (SRE) who will be embedded within the product development team and be responsible for the overall reliability and availability of that team's applications. This person must have a passion for troubleshooting: getting to the root cause of any identified issue, resolving it, and owning the lifecycle of that feedback within the application teams.

Work at Exadel - Who We Are:

Since 1998, Exadel has been engineering its own software products and custom software for clients of all sizes. Headquartered in Walnut Creek, California, Exadel currently has 2700+ employees in development centers across the Americas, Europe, and Asia. Our people drive Exadel’s success, and they are at the core of our values.

About Our Customer:

The customer is an American company based in Chicago with more than 40 years of experience in the P&C insurance industry. The customer moved to the cloud in 2003, and it began using its first ML algorithms as early as 10 years ago. The company accelerates digital transformation for the insurance and automotive industries with AI, IoT, and workflow solutions.

The core product is a comprehensive SaaS platform that consolidates about 30,000 stakeholders, namely insurance companies, repair facilities, auto manufacturers, lenders, fleets, and everyone involved in resolving critical moments following an accident.

Client team culture and development approach:

People who are flexible and ready to learn new technologies
Most importantly, a good understanding of Java and a willingness to learn new things in order to keep pace with a rapidly changing development environment where new technologies must constantly be implemented
Cross-functional team where developers are expected to deliver both back-end and front-end code
Don’t worry, we’ve got you covered! The customer provides training courses for front-end technologies, the team adjusts your initial tasks so that you can pick up the front end gradually, and we provide experienced mentors on our side. You’ll have all the support you need for your professional development

About Our Project:

Workflow is an extensive platform that unifies many web and mobile applications.

Each SRE will be responsible for one or two applications within the Workflow platform and will work with the corresponding development team.

Key Areas of Focus:

Reducing Technical Debt
Reducing Toil
Observability/System Monitoring
Incident Response throughout SDLC
Problem Management

Product/Project Tech Stack:

Java, J2EE, RESTful services, JMS, Kafka, SQL, SOAP, ActiveMQ
JavaScript, Vue.js, jQuery, JSP, Struts
Oracle, MySQL, Postgres
Oracle WebLogic, Amazon, Kubernetes
Jenkins, Spinnaker, CI/CD Pipeline
SVN, Git/GitLab
Python
Ceph, S3

Requirements:

Minimum experience: 3–5 years

Experience with monitoring and data visualization tools: AppDynamics, AlertSite, Nagios, Grafana, Prometheus, Kibana, Datadog, or any cloud-native monitoring service such as CloudWatch
Experience with source code management tools: GitHub, GitLab, SVN, Bitbucket
Experience with incident management tools: RemedyForce, PagerDuty
Experience with collaboration tools: Teams, Confluence, Microsoft Office 365
Experience with project management tools: VersionOne, Jira
Solid understanding of microservices and APIs
Well versed in system management, monitoring, and analysis to identify opportunities for improving service health, manageability, and reliability
Proven ability to dig through metrics, logs, and available sources to triage and resolve an incident at any time
Eager to problem-solve and troubleshoot issues that may arise day-to-day
Ability to document solutions, SRE architectural patterns, and best practices to ensure that teams have guidance as needed
Experience and interest in working in an Agile environment
Effective communication and interpersonal skills

Nice to have:

Past enterprise-level experience in DevOps, software, infrastructure, or site reliability engineering, with the ability to demonstrate an understanding of high-level technical briefs, talks, and ideas
Experience leading teams in troubleshooting, issue resolution, or escalations

English level:

Intermediate+

Responsibilities:

First 6 months in the position:

Cleanup work, bug fixing, and laying the groundwork for future SRE work
Automate any tasks or parts of the system that are currently performed manually
Configuring and maintaining the monitoring tooling as it relates to the target application
Monitor application/infrastructure and take steps to improve overall system software performance, availability, and reliability by incorporating changes through defined feedback loops within the software delivery lifecycle
Document tribal knowledge as you acquire it over time by creating runbooks/playbooks and ensuring critical system information is readily available to those who need it through dashboards

After the first 6 months in the position:

Work closely with software developers and testers to ensure the product meets non-functional requirements such as security, performance, and availability
Resolve NOC escalations and help prevent the recurrence of incidents by creating processes and automation
Be a key part of our response to high-severity internal customer incidents, ensuring we meet all SLAs and SLOs
Help build an SRE culture by sharing best practices, approaches, documentation, and code with other engineering teams across the organization
Assist product development team with managing their error budget
Embrace failures and treat incidents as learning opportunities by conducting blameless postmortems
Participate in product engineering stand-ups and related design activities
Coach other team members to ensure systems are supported by following SRE best practices

Advantages of Working with Exadel:

You can build your expertise with our Client Engagement team, who provide assistance with existing and potential projects
You can join any Exadel Community or create your own to communicate with like-minded colleagues
You can participate in continuing education as a mentor or speaker. You will be rewarded for mentoring not only emotionally but also financially
You can take part in internal and external meetups as a speaker or listener. We support you in broadening your horizons and encourage knowledge sharing for all of our employees
You can learn English with the support of native speakers
You can take part in cultural, sporting, charity, and entertainment events
Working at Exadel means always upgrading your skills and proficiency, so we provide plenty of opportunities for professional development. If you’re looking for a challenge that will lead you to the next level of your career, you’ve found the right place
We work hard to ensure honest and open relations between employees and leadership, so our offices are friendly environments