Cloudera Machine Learning for CDP: Purpose Built for the AI-First Enterprise

The AI Opportunity

Modern enterprises are collecting data at exponential rates, and it’s no mystery that making effective use of that data has become a top priority for many. According to a recent McKinsey & Company survey of 2,000 global enterprises, 47% of organizations have embedded at least one AI capability in their standard business processes, up from 20% in 2017. This growth has created a global race toward the next important evolution of business as we know it: the AI-first enterprise.

But what does this actually mean? With investment in AI technologies poised to reach $9.5 billion over the next three years, the imminent opportunity involves embedding data and machine learning intelligence across the business at scale: predicting the next best move for growth, making every product a data product, or creating entirely new data-driven revenue streams. Truly becoming AI-first requires transforming your brand and business around the possibility of automating, augmenting, or completely reinventing processes with machine learning. But the truth is, enabling your business to succeed with ML and AI is hard, and cutting through the hype isn’t easy.

The Challenge

Unlike traditional descriptive analytics, which focus on optimizing how we count things to help us understand what has already happened, each prediction problem is unique. These problems often require experimentation to reach the desired business outcome. For AI-minded organizations, this means empowering data science teams to try lots of things quickly and affordably in a secure, scalable, governed environment.

While this may sound simple, in practice there are many considerations in streamlining these workflows. How do we provide secure, governed access to all our disparate corporate data? How do we eliminate data science silos between individuals, teams, and departments? And most importantly, how do we lower the barriers to fast experimentation while optimizing production workloads at scale?

At Cloudera, we believe that data makes what’s impossible today, possible tomorrow. We created Cloudera Machine Learning (CML) for Cloudera Data Platform (CDP) to tackle these challenges head-on and empower our clients to incorporate machine learning and AI to augment, automate, or create entirely new business capabilities with prediction at enterprise scale.

Cloudera Data Platform: The Ideal Foundation for Enterprise ML

Cloudera Data Platform is the world’s first implementation of a new kind of data platform designed to meet the specific needs of large enterprises — a true Enterprise Data Cloud. From its inception, we set out to address the most important requirements of an Enterprise Data Cloud including multi-function capabilities that support many types of use cases on shared data; hybrid cloud deployments to offer maximum flexibility over where to store data and how to manage infrastructure; advanced security and governance controls to democratize data without creating regulatory or compliance risk; and a fully open architecture — not only open source to avoid lock-in but also interoperable with a broad ecosystem of enterprise vendors.

Cloudera Machine Learning provides the only end-to-end machine learning platform for enterprise businesses, largely because it leverages components of CDP to deliver a truly seamless experience from data exploration to modeling to putting ML models into production. CML is integrated with CDP by design to provide a consistent experience with secure, shared business data across hybrid and multi-cloud environments.

This combination of CML on CDP enables data science teams to quickly run repeatable experiments anywhere and deploy models into production from one pane of glass without leaving the platform. We envision a near future for CML where truly hybrid workflows are quick and easy, where the data storage and compute resources necessary for training and deploying models into production can be seamlessly synchronized, customized, and replicated on any cloud or datacenter, all without breaking IT requirements, creating data silos, or running out of compute resources.

What is Cloudera Machine Learning?

Cloudera Machine Learning is Cloudera’s enterprise machine learning service for Cloudera Data Platform. It unifies self-service data science and data engineering in a single, portable service as part of an enterprise data cloud for multi-function analytics on data anywhere. Because CML is built on CDP, data science teams can quickly experiment on secure shared data with elastic compute, within guardrails set by administrators — all while adhering to IT policies and without impacting other users or workloads.

For readers familiar with Cloudera Data Science Workbench (CDSW), Cloudera Machine Learning is the next evolution of our enterprise machine learning platform. CML delivers the benefits of integration with CDP as an enterprise data cloud while retaining the core functionality of CDSW, including support for third-party editors like Jupyter and RStudio, built-in scheduling of analytics Jobs, and the Experiments and Models features for end-to-end ML workflows, from research to production.

This means teams can take advantage of unified shared data catalogs, containerized multi-cloud portability, and rapid provisioning of workspaces with elastic autoscaling of CPU and GPU resources — all without configuring or managing Spark clusters or distributed dependencies as ML workloads vary. Because CML is licensed as a managed cloud service, data science teams can experiment with agility while leveraging consumption-based pricing, never paying for unused clusters or compute.

CML will be available first on the AWS public cloud, with support for Azure and, later, GCP coming soon. Moving forward, CML will deliver the same elastic, auto-scaling experience in public and private clouds, or a configurable architecture for your datacenter. As we continue to innovate, customers will see the latest capabilities in Cloudera Machine Learning deployed in the public cloud version first, followed by the same releases made available through either a traditional on-prem CDSW deployment or as a service for CDP private cloud.

Get To Know CML for CDP

Cloudera Machine Learning for CDP delivers the world’s first Enterprise Data Cloud experience for end-to-end machine learning workflows, enabling data science teams to quickly experiment and deliver production models in a secure, shared data environment with elastic compute and advanced governance functionality, anywhere.

If you missed it, be sure to catch the product demo webinar: Enable the AI-first enterprise with Cloudera Machine Learning for Cloudera Data Platform

Interested in learning more about Cloudera Machine Learning? Stay tuned for the latest announcements and events. If you’ll be at Strata Data Conference in New York this September, be on the lookout for presentations and announcements about CML, CDP, and other innovations from Cloudera. Need to know more now? Reach out to your account representative and ask how we can help your organization apply machine learning to solve your business challenges.

Santiago Giraldo
Bethann Noble

Director Product Marketing, Machine Learning
