Tag Archives: machine learning

Putting Machine Learning Models into Production

Categories: AI and Machine Learning Cloudera Data Science Workbench Spark

Once the data science is done (and you know where your data comes from, what it looks like, and what it can predict) comes the next big step: you now have to put your model into production and make it useful for the rest of the business. This is the start of the model operations life cycle. The key focus areas (detailed in the diagram below) are usually managed by machine learning engineers after the data scientists have done their work.

Read more

Visual Model Interpretability for Telco Churn in Cloudera Data Science Workbench

Categories: CDH Cloudera Data Science Workbench Fast Forward Labs Spark

Disclaimer: the scenario below is hypothetical. Any similarity to any specific telecommunications company is purely coincidental.

Although we use the example of a telecommunications company, the following applies to every organization with customers or voluntary stakeholders.

Introduction

Imagine that you are a Chief Data Officer at a major telecommunications provider and the CEO has asked you to overhaul the existing customer churn analytics. The current process relies on manual export of data from dozens of data sources including ERP,

Read more

Using Native Math Libraries to Accelerate Spark Machine Learning Applications

Categories: AI and Machine Learning CDH Performance Spark

[Editor’s note: The original version of this article was published as part of our Guru How-To series for Data Science. Be sure to also check out the series for Cloudera Data Warehouse.]

Spark ML is one of the dominant frameworks for many major machine learning algorithms, such as the Alternating Least Squares (ALS) algorithm for recommendation systems, the Principal Component Analysis algorithm, and the Random Forest algorithm.

Read more

Integrating Machine Learning Models into Your Big Data Pipelines in Real-Time With No Coding

Categories: AI and Machine Learning CDH Cloudera Data Science Workbench How-to

[Editor’s note: This article was originally published on the Hortonworks Community Connection, but is reproduced here because CDSW is now available on both Cloudera and Hortonworks platforms.]

Using Deployed Models as a Function as a Service

With Cloudera Data Science Workbench and Apache NiFi, we can easily call functions within our deployed models as part of NiFi flows. I am working against CDSW on HDP (https://www.cloudera.com/documentation/data-science-workbench/latest/topics/cdsw_hdp.html),

Read more
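
To make the idea of calling a deployed model as a function more concrete, here is a minimal Python sketch of the kind of HTTP request a flow would send. It is not taken from the article: the host name, the access key, the `build_model_request` and `call_model` helpers, and the sample input are all hypothetical, and the endpoint path and payload shape are assumptions that should be checked against your own CDSW deployment. In NiFi, an HTTP-capable processor such as InvokeHTTP would play the role of `call_model`.

```python
import json
import urllib.request

# Hypothetical values: replace with the host and access key of your own deployment.
# The endpoint path and payload shape are assumptions; verify them for your CDSW version.
MODEL_ENDPOINT = "http://modelservice.cdsw.example.com/api/altus-ds-1/models/call-model"
ACCESS_KEY = "your-model-access-key"


def build_model_request(features, endpoint=MODEL_ENDPOINT, access_key=ACCESS_KEY):
    """Build (but do not send) a JSON POST request for a deployed model."""
    payload = {"accessKey": access_key, "request": features}
    return urllib.request.Request(
        endpoint,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call_model(features, timeout=10):
    """Send the request and return the decoded JSON response."""
    request = build_model_request(features)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Keeping request construction separate from the network call makes the payload easy to inspect or log before anything is sent, for example `build_model_request({"sentence": "hello"}).data`.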

Altus SDK for Java

Categories: Altus

We are excited to announce the general availability of Cloudera Altus SDK for Java to programmatically leverage the Altus platform-as-a-service for ETL, batch machine learning, and cloud bursting. Altus empowers customers and partners alike to run data engineering workloads in the cloud, leveraging cloud infrastructures such as AWS. Cloudera Altus also provides the ability to create data engineering pipelines using both a web console and a CLI.

Cloudera Altus SDK for Java was developed to provide easier programmatic access with the popular Java programming language so that users can automate their data engineering workloads.

Read more