Responsibilities:
- Work as an Individual Contributor developing Python services and Databricks notebooks with Apache Spark.
- Leverage prior industry knowledge to create detailed business requirements for functional capabilities that will need to exist on the platform
- Understand technical specifications to guide system architecture design and development, and review QA tasks to ensure requirements are fully met
- Lead the design and review of test cases to ensure tests adequately cover the requirements
- Develop, schedule, and deploy event-driven ETL jobs (see the illustrative sketch after this list)
- Productionize ML models trained on big data
- Deliver high-quality, self-documenting code
- Collaborate with the team through pair programming and active participation in code reviews
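For illustration only, a minimal sketch of the kind of event-driven ETL job described above, written in PySpark for Databricks; the paths, table names, and columns are assumptions for the example, not part of this posting:

```python
# Illustrative PySpark ETL sketch (hypothetical paths, table and column names).
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("daily-events-etl").getOrCreate()

# Ingest raw event data for a single processing date (assumed JSON landing zone).
raw = spark.read.json("/mnt/landing/events/2024-01-01/")

# Basic cleaning: drop incomplete rows, normalise timestamps, de-duplicate.
cleaned = (
    raw.dropna(subset=["event_id", "event_ts"])
       .withColumn("event_ts", F.to_timestamp("event_ts"))
       .dropDuplicates(["event_id"])
)

# Write to a curated Delta table partitioned by event date.
(
    cleaned.withColumn("event_date", F.to_date("event_ts"))
           .write.format("delta")
           .mode("append")
           .partitionBy("event_date")
           .saveAsTable("curated.events")
)
```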
Requirements:
- 5-7 years’ experience working on product development within an agile delivery framework, with DevOps practices and a microservices environment
- Deep experience using Azure DevOps
- Proficiency in developing high-quality Python code (a brief sketch follows this requirements list). Skills include:
  o Module creation
  o Linting (pylint, etc.)
  o Static typing (mypy)
  o Test-driven development (pytest, unittest, etc.; unit and integration testing)
- Expertise with Databricks, Apache Airflow, and Apache Spark (PySpark), including experience running Databricks and PySpark in production environments
- Sound working knowledge of data ingestion, data cleaning, and ETL
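As a rough illustration of the code-quality bar implied above (type hints checkable with mypy, small units covered by pytest-style tests), a hypothetical module and its tests; the function name and behaviour are invented for the example:

```python
# transform.py — hypothetical module illustrating static typing and small, testable units.
from __future__ import annotations


def normalise_amounts(amounts: list[float], scale: float = 1.0) -> list[float]:
    """Scale a list of amounts, raising on a non-positive scale factor."""
    if scale <= 0:
        raise ValueError("scale must be positive")
    return [round(a * scale, 2) for a in amounts]


# test_transform.py — pytest-style unit tests for the module above.
import pytest


def test_normalise_amounts_scales_values() -> None:
    assert normalise_amounts([1.0, 2.5], scale=2.0) == [2.0, 5.0]


def test_normalise_amounts_rejects_bad_scale() -> None:
    with pytest.raises(ValueError):
        normalise_amounts([1.0], scale=0.0)
```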
Nice to know:
- Experience with Tox and other Python-specific CI tools (familiarity with Azure Pipelines).
- Experience with commonly used ML libraries and frameworks such as scikit-learn, Spark MLlib, and TensorFlow.
- Previous experience with PDF document processing and MLOps on the Azure stack is a plus
Our offer:
- Workplace: 100% remote
- MultiSport Plus
- Group insurance
- Medicover Premium
- e-learning platform