We are looking for a Data Engineer to join, fully remotely, one of the exciting projects at a global, market-leading, NYSE-listed American trading platform.
- 100% Remote!
Working hours: starting from 12:00 CET, or ideally from 2 PM CET, to ensure overlap with the US team in the Eastern Time zone (especially during onboarding).
At RITS we work in a cooperation/partnership model in which we treat our consultants as clients too. We are currently the exclusive Polish vendor for this client, a very stable company that has been operating for over 25 years. It helps the world's leading asset managers, central banks, hedge funds, and other institutional investors access the liquidity they need through a range of electronic marketplaces, trading on average about 30 trillion dollars A MONTH.
Job Responsibilities:
- Build and run a data platform using technologies such as public cloud infrastructure (AWS and GCP), Kafka, Spark, databases, and containers.
- Develop the data platform based on open-source software and cloud services.
- Build and run ETL pipelines to onboard data into the platform, define schemas, build DAG processing pipelines, and monitor data quality.
- Help develop the machine learning development framework and pipelines.
- Manage and run mission-critical production services.
Interview Process:
- Introductory call with RITS Recruiter
- Interview with the hiring manager (45 minutes)
- Technical interview with the developers from the team (90 minutes)
Requirements:
- Strong eye for detail, data precision, and data quality.
- Strong experience maintaining system stability and responsibly managing releases.
- Considerable production operations and support experience.
- Clear and effective communicator who is able to liaise with team members and end-users on requirements and issues.
- Agile self-starter who is able to responsibly see things through to completion with minimal assistance and oversight.
- Expert-level grasp of SQL and database/persistence technologies such as MySQL, PostgreSQL, SQL Server, Snowflake, Redis, Presto, etc.
- Strong grasp of Python and its related ecosystem, such as conda or pip.
- Experience building ETL and stream processing pipelines using Kafka, Spark, Flink, Airflow/Prefect, etc.
- Experience using AWS/GCP (S3/GCS, EC2/GCE, IAM, etc.), Kubernetes, and Linux in production.
- Experience with parallel and distributed computing.
- Strong proclivity for automation and DevOps practices and tools such as GitLab, Terraform, and Prometheus.
- Experience managing increasing data volume, velocity, and variety.
- Ability to deal with ambiguity in a changing environment.
- At least 5-6 hours of overlap starting from 9 AM US Eastern.
Nice to have:
- Familiarity with the data science stack, e.g. Jupyter, pandas, scikit-learn, PyTorch, MLflow, Kubeflow, etc.
- Development skills in C++, Java, Go, or Rust
- Software builds and packaging on MS Windows
- Experience managing time series data
- Experience working with open-source communities
- Financial Services experience
What we offer:
- 🖥️ Budget for materials and hardware, such as standing desks, laptops, monitors, WeWork-type workspaces, etc.
- ⚕️ Free private medical insurance OR 💪 Medicover Sport Membership
- 🚀 Ready to have you on the team ASAP!
- ↗️ Long-term cooperation!
- 🥳 Integration trips to NY/London/Warsaw several times a year for 3-4 days (non-mandatory, expenses covered)