AutoML Software Engineer
Scala
dotData, Inc.
Warszawa
Type of work: Undetermined
Experience: Senior
Employment type: B2B
Operating mode: Office

Tech stack

    Scala: advanced
    Akka: advanced
    Apache Spark: regular
    Apache Hadoop: regular
    Python: nice to have
    Machine Learning: nice to have

Job description

Job overview


dotData is hiring high-caliber engineers who are excited to democratize data science with automation. You will work on dotData’s proprietary core engine components, AI-powered feature engineering and AutoML:
  • AI-powered Feature Engineering discovers and evaluates millions of combinations of joins and aggregations of relational data. To explore them efficiently, we built our own distributed in-memory engine on top of Apache Spark. Skewed keys, wide tables, and record counts that explode after a join: we have to take care of all of that.
  • Our advanced AutoML component automatically explores and evaluates state-of-the-art ML models with automated hyperparameter optimization and model selection. The logical plan is divided into computationally intensive jobs that are executed in parallel under strict resiliency requirements. We take advantage of industry-standard machine learning libraries such as scikit-learn, XGBoost, LightGBM, TensorFlow, and PyTorch.

Job requirements


Non-technical
  • Startup experience is nearly required. Knowing what that means and seeking it out is required. This includes being able to seek out answers and fill in the blanks, proposing new ways of approaching problems, and being able to handle all sorts of different projects.
  • You take ownership, end-to-end, of the features that are your responsibility.
  • You work collaboratively, and can do so in a global multi-cultural environment.
  • You are willing to learn, and you are not afraid to jump into something with which you have no prior experience.

Technical
  • You have been a key implementer, coding, testing, and shipping multiple Scala- or Java-based enterprise-grade products.
  • You write clean, maintainable code using the best agile software engineering practices.
  • You do not compromise on quality, and you write the tests to guarantee it.
  • You have a strong foundation in distributed computing and/or machine learning.
  • You’re proficient in Scala, Akka, Spark, YARN, HDFS, etc.
    • A very strong Java programmer with experience in the other technologies might work out, too.
  • You have strong CS skills, including time/space complexity, data structures, functional programming, and an understanding of operating systems.
    • tl;dr: CS Master’s or equivalent.

Preferred
  • Working knowledge of DevOps tools such as Jenkins, GitHub, and Jira.
  • Expertise in Python is a plus.


About dotData


dotData is a Silicon Valley-based startup focused on full-cycle machine learning and data science automation. Our platform automates the entire process of building predictive models, starting from raw business data, through data and feature engineering, to machine learning, all the way to production. We have offices in the USA, Japan, and Poland. Fortune 500 organizations around the world use dotData to accelerate their ML and AI projects.

Unique to the dotData Platform is its AI-powered feature engineering, which eliminates the most time-consuming and labor- and skill-intensive aspects of the full data science process by discovering and evaluating millions of features derived from relational, transactional, temporal, geo-locational, or text data.

dotData stemmed from Dr. Ryohei Fujimaki’s experience in leading more than 100 data analysis projects at NEC, across a variety of industries. Prior to founding dotData, he was the youngest research fellow ever appointed in the 119-year history of NEC, an honor given to only six individuals worldwide among NEC’s 1000+ researchers.