Here at Cloudera, we’re committed to helping make the lives of data practitioners as painless as possible. For data scientists, we continue to provide new Applied Machine Learning Prototypes (AMPs), which are open source and available on GitHub. These pre-built reference examples are complete end-to-end data science projects. In Cloudera Machine Learning (CML), you can deploy them with the single click of a button, bringing data scientists that much closer to providing value.
The hardworking team at Cloudera’s Fast Forward Labs has hit it out of the park once again, and we are happy to announce the release of two new AMPs: Video Classification and Continuous Model Monitoring.
Video Classification
Video footage constitutes a significant portion of all data in the world. The 30,000 hours of video uploaded to YouTube every hour are part of that data; another portion is produced by the 770 million surveillance cameras in use globally. In addition to being plentiful, video data has tremendous capacity to store useful information. Its vastness, richness, and applicability make the understanding of video a key activity within the field of computer vision.
This AMP provides a Jupyter Notebook walk-through of video classification/action recognition with a pre-trained TensorFlow model and provides guidance for working with video data. Also included is a script that demonstrates how to perform larger-scale model inference.
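As a rough illustration of one kind of preprocessing that working with video data often involves, the sketch below picks a fixed number of evenly spaced frames from a clip, since pre-trained video models commonly expect a fixed-length input. This is an assumption for illustration only: the function name `sample_frame_indices` and its parameters are hypothetical and are not taken from the AMP.

```python
def sample_frame_indices(num_frames: int, clip_len: int) -> list[int]:
    """Return clip_len evenly spaced frame indices for a video with num_frames frames.

    Hypothetical helper for illustration; not part of the AMP.
    """
    if num_frames <= 0 or clip_len <= 0:
        raise ValueError("num_frames and clip_len must be positive")
    if clip_len == 1:
        # A single frame: take the middle of the video.
        return [num_frames // 2]
    # Spread indices from the first frame (0) to the last (num_frames - 1).
    # Integer floor division keeps the result deterministic; short videos
    # simply repeat some frames.
    return [(i * (num_frames - 1)) // (clip_len - 1) for i in range(clip_len)]


indices = sample_frame_indices(num_frames=100, clip_len=5)
print(indices)
```

The selected frames would then be decoded, resized, and stacked into the input shape the chosen model expects; those details depend on the model and are not shown here.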
To learn more about video classification, check out this blog from our team at Fast Forward Labs. It dives deeper into the various aspects of classifying videos, from action detection to dense captioning.
Continuous Model Monitoring
Machine learning models are almost commonplace in the modern business world. It seems like every company is working to leverage ML methodologies to gain an advantage. However, many companies have not yet experienced the growing pains of monitoring a production model over time. One of the main issues for those new to ML in production is that data changes over time. As the data changes, so do the underlying relationships between the independent and dependent variables. This phenomenon is referred to as concept drift.
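To make the idea concrete, here is a minimal sketch with made-up numbers: a one-variable linear model is fit on older data, then scored on newer data in which the inputs are unchanged but the relationship between input and target has shifted. The data, the function names, and the choice of a simple least-squares line are all assumptions for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (one feature)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mean_abs_error(model, xs, ys):
    """Average absolute gap between the model's predictions and the targets."""
    slope, intercept = model
    return sum(abs(slope * x + intercept - y) for x, y in zip(xs, ys)) / len(xs)


# Made-up data: house size (input) and price (target).
sizes = [50, 80, 100, 120, 150]
old_prices = [2.0 * s + 10 for s in sizes]  # relationship at training time
new_prices = [2.5 * s + 10 for s in sizes]  # same inputs, changed relationship

model = fit_line(sizes, old_prices)
mae_before = mean_abs_error(model, sizes, old_prices)
mae_after = mean_abs_error(model, sizes, new_prices)
print(f"MAE on old data: {mae_before:.2f}, MAE on new data: {mae_after:.2f}")
```

The model itself never changed, yet its error grows once the relationship it learned no longer holds. That is why a production model needs to be watched after deployment.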
To combat concept drift in production systems, it’s important to have robust monitoring capabilities that alert stakeholders when relationships in the incoming data or model have changed. In this AMP, we demonstrate how this can be achieved in CML. Specifically, we leverage CML’s Model Metrics feature in combination with Evidently.ai’s Data Drift, Numerical Target Drift, and Regression Performance reports to monitor a simulated production model that predicts housing prices over time.
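The general idea behind a data drift check can be sketched in a few lines. The example below is not Evidently.ai's API or implementation, and it does not use CML's Model Metrics feature. It is a standalone illustration that compares a reference sample with a newer sample of one feature using the two-sample Kolmogorov-Smirnov statistic, and raises a flag when the gap exceeds a threshold. The function names and the threshold of 0.3 are arbitrary assumptions.

```python
from bisect import bisect_right


def ks_statistic(reference, current):
    """Two-sample Kolmogorov-Smirnov statistic.

    This is the largest absolute gap between the empirical CDFs of the two
    samples: 0.0 means identical distributions, 1.0 means no overlap.
    """
    ref = sorted(reference)
    cur = sorted(current)
    max_gap = 0.0
    # The empirical CDFs only change at observed values, so check each one.
    for value in sorted(set(ref) | set(cur)):
        cdf_ref = bisect_right(ref, value) / len(ref)
        cdf_cur = bisect_right(cur, value) / len(cur)
        max_gap = max(max_gap, abs(cdf_ref - cdf_cur))
    return max_gap


def drift_alert(reference, current, threshold=0.3):
    """Return True when the KS statistic exceeds the (assumed) threshold."""
    return ks_statistic(reference, current) > threshold


# Made-up feature values: a training-time sample and two later batches.
training_sample = [1, 2, 3, 4]
similar_batch = [1, 2, 3, 4]
shifted_batch = [3, 4, 5, 6]

print(drift_alert(training_sample, similar_batch))
print(drift_alert(training_sample, shifted_batch))
```

In a real deployment the threshold would be tuned per feature, and a check like this would run on a schedule against each new batch of incoming data.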
Learn More
If you are not a Cloudera customer already and want to learn more about AMPs, go check out the catalog or read the docs. To see how easy it is to get started with AMPs in CDP, register for a test drive!
If you are already a Cloudera customer, go to the AMP tab in Cloudera Data Science Workbench or Cloudera Machine Learning and try launching an AMP for yourself!