We are looking for a Data Engineer to join, fully remotely, one of the exciting projects at a global, market-leading, NYSE-listed American trading platform.
- 100% Remote!
Working hours: starting from 12:00 CET, or ideally from 2 PM CET, to ensure overlap with the US team in the Eastern Time zone (especially during onboarding).
At RITS we work in a cooperation/partnership model in which we treat our consultants as clients too. We are currently the exclusive Polish vendor for this client, a very stable company that has been operating for over 25 years. It helps the world's leading asset managers, central banks, hedge funds, and other institutional investors access the liquidity they need through a range of electronic marketplaces, trading on average about 30 trillion dollars A MONTH.
Job Responsibilities:
- Build and run a data platform using technologies such as public cloud infrastructure (AWS and GCP), Kafka, Spark, databases, and containers.
- Develop the data platform based on open-source software and cloud services.
- Build and run ETL pipelines to onboard data into the platform, define schemas, build DAG processing pipelines, and monitor data quality.
- Help develop the machine learning development framework and pipelines.
- Manage and run mission-critical production services.
Interview Process:
- Introductory call with RITS Recruiter
- Interview with the hiring manager (45 minutes)
- Technical interview with the developers from the team (90 minutes)
Requirements:
- Strong eye for detail, data precision, and data quality.
- Strong experience maintaining system stability and responsibly managing releases.
- Considerable production operations and support experience.
- Clear and effective communicator who is able to liaise with team members and end-users on requirements and issues.
- Agile self-starter who is able to responsibly see things through to completion with minimal assistance and oversight.
- Expert-level grasp of SQL and database/persistence technologies such as MySQL, PostgreSQL, SQL Server, Snowflake, Redis, Presto, etc.
- Strong grasp of Python and its related ecosystem, such as conda or pip.
- Experience building ETL and stream processing pipelines using Kafka, Spark, Flink, Airflow/Prefect, etc.
- Experience using AWS/GCP (S3/GCS, EC2/GCE, IAM, etc.), Kubernetes, and Linux in production.
- Experience with parallel and distributed computing.
- Strong proclivity for automation and DevOps practices and tools such as GitLab, Terraform, and Prometheus.
- Experience managing increasing data volume, velocity, and variety.
- Ability to deal with ambiguity in a changing environment.
- At least 5-6 hours of overlap starting from 9 AM US Eastern.
Nice to have:
- Familiarity with the data science stack, e.g. Jupyter, pandas, scikit-learn, PyTorch, MLflow, Kubeflow, etc.
- Development skills in C++, Java, Go, or Rust
- Software builds and packaging on MS Windows
- Experience managing time series data
- Experience working with open-source communities
- Financial Services experience
What we offer:
- 🖥️ Budget for materials and hardware, such as standing desks, laptops, monitors, WeWork-type workspaces, etc.
- ⚕️ Free private medical insurance OR 💪 Medicover Sport Membership
- 🚀 Ready to have you on the team ASAP!
- ↗️ Long-term cooperation!
- 🥳 Integration trips to NY/London/Warsaw several times a year for 3-4 days (non-mandatory, expenses covered)