Once the data science is done, and you know where your data comes from, what it looks like, and what it can predict, the next big step is to put your model into production and make it useful for the rest of the business. This is the start of the model operations life cycle. The key focus areas (detailed in the diagram below) are usually managed by machine learning engineers after the data scientists have done their work.
This was originally published on the Fast Forward Labs blog.
We are excited to release Learning with Limited Labeled Data, the latest report and prototype from Cloudera Fast Forward Labs.
Being able to learn with limited labeled data relaxes the stringent labeled data requirement for supervised machine learning. Our report focuses on active learning, a technique that relies on collaboration between machines and humans to label smartly.
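To make "label smartly" concrete, here is a minimal sketch of uncertainty sampling, one common active learning strategy. It is not taken from the report: the function names, the example IDs, and the predicted probabilities are all hypothetical, and entropy is only one of several possible uncertainty measures. The idea is that the model scores the unlabeled pool, and the examples it is least sure about are sent to a human for labeling first.

```python
import math


def entropy(probs):
    """Shannon entropy of a predicted class distribution (higher = less certain)."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_for_labeling(predictions, budget):
    """Return the IDs of the `budget` most uncertain examples.

    `predictions` is a list of (example_id, class_probabilities) pairs,
    assumed to come from whatever model is currently being trained.
    """
    ranked = sorted(predictions, key=lambda item: entropy(item[1]), reverse=True)
    return [example_id for example_id, _ in ranked[:budget]]


# Hypothetical model outputs on an unlabeled pool (made-up values).
predictions = [
    ("doc-1", [0.98, 0.02]),
    ("doc-2", [0.55, 0.45]),
    ("doc-3", [0.70, 0.30]),
    ("doc-4", [0.50, 0.50]),
]

# With a labeling budget of 2, the two most ambiguous examples are chosen.
to_label = select_for_labeling(predictions, budget=2)
print(to_label)
```

In a full loop, the human-provided labels would be added to the training set, the model retrained, and the selection repeated until the labeling budget runs out.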
Spark ML is one of the dominant frameworks for many major machine learning algorithms, such as the Alternating Least Squares (ALS) algorithm for recommendation systems, the Principal Component Analysis algorithm, and the Random Forest algorithm.
[Editor’s note: This article was originally published on the Hortonworks Community Connection, but is reproduced here because CDSW is now available on both Cloudera and Hortonworks platforms.]
Using Deployed Models as a Function as a Service
Using Cloudera Data Science Workbench (CDSW) with Apache NiFi, we can easily call functions in our deployed models as part of NiFi flows. I am working against CDSW on HDP (https://www.cloudera.com/documentation/data-science-workbench/latest/topics/cdsw_hdp.html),