Introducing Lightweight, Customizable ML Runtimes in Cloudera Machine Learning

As data grows more complex across the enterprise and new approaches to machine learning and AI use cases emerge, data scientists and machine learning engineers need more versatile and efficient ways to access data, process it faster, and manage resources across their machine learning projects. One size does not fit all: managing every aspect of enterprise machine learning requires a lightweight, versatile, and customizable approach. To address these challenges and offer a truly self-service experience for our data science users, we are releasing new Cloudera Machine Learning Runtimes. They enable fully customizable, lightweight machine learning on both CPUs and GPUs, with unfettered access to data, on-demand resources, and the ability to install and use any library or algorithm without IT assistance.

Until now, Cloudera Machine Learning (CML) and Cloudera Data Science Workbench (CDSW) have enabled data scientists to work with flexible “Engines”. Engines provide secure, containerized (isolated) working environments out of the box, giving data scientists true self-service access to data, compute resources, and the libraries and IDEs of their choice. While this works well for the large majority of data science and machine learning use cases, Engines have limitations in sizing and customization. Chiefly, each release of CML built and shipped a single Engine image with the platform, which led to large, bloated deployments with different kernels, libraries, and multiple editors packed into that one image. For data scientists who needed further customization of libraries or IDEs, this meant adding even more to an already overloaded Engine, which severely limited the depth of customization and the versatility of the overall platform.

New Machine Learning Runtimes For The Win

To tackle these challenges directly, we have rebuilt our ML Runtimes in Cloudera Machine Learning from the ground up, enabling lightweight deployments with maximum flexibility for customization without overburdening the runtime profile. The new profiles are designed to meet the diverse needs of Data Scientists by providing a variety of ML Runtimes natively:

  • Python 3.7, Python 3.8, Python 3.9, R 3.6, and R 4.0 variants, so that end users can use the latest kernels available while keeping the flexibility to stay on an older version and avoid upgrading all of their projects and workloads at once.
  • Workbench and JupyterLab variants so that they can use their preferred development environment without compromising security or governance.
  • Standard variants as the de facto environment that lets Data Scientists start quickly and easily, and then customize for their specific needs.
  • GPU variants that provide out-of-the-box access to GPU acceleration, with all the tools and libraries needed to make that fast and easy.
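
To make the variant dimensions above concrete, here is a minimal Python sketch that models a runtime as a combination of kernel, editor, and edition, and filters a list of them. It is purely illustrative: `RuntimeVariant`, `CATALOG`, and `find_runtimes` are hypothetical names, not part of any Cloudera API, and the sketch assumes that every combination of kernel, editor, and edition is available, which may not be the case in an actual release.

```python
from dataclasses import dataclass
from itertools import product
from typing import List, Optional


@dataclass(frozen=True)
class RuntimeVariant:
    """Hypothetical description of one ML Runtime variant (not a CML API type)."""
    kernel: str   # e.g. "Python 3.9" or "R 4.0"
    editor: str   # "Workbench" or "JupyterLab"
    edition: str  # "Standard" or "GPU"


# Dimensions taken from the list above.
KERNELS = ["Python 3.7", "Python 3.8", "Python 3.9", "R 3.6", "R 4.0"]
EDITORS = ["Workbench", "JupyterLab"]
EDITIONS = ["Standard", "GPU"]

# Assumption: every kernel/editor/edition combination exists.
# A real deployment may ship only a subset of these.
CATALOG: List[RuntimeVariant] = [
    RuntimeVariant(kernel, editor, edition)
    for kernel, editor, edition in product(KERNELS, EDITORS, EDITIONS)
]


def find_runtimes(
    catalog: List[RuntimeVariant],
    kernel: Optional[str] = None,
    editor: Optional[str] = None,
    edition: Optional[str] = None,
) -> List[RuntimeVariant]:
    """Return the variants matching every criterion that was given."""
    return [
        r for r in catalog
        if (kernel is None or r.kernel == kernel)
        and (editor is None or r.editor == editor)
        and (edition is None or r.edition == edition)
    ]


if __name__ == "__main__":
    # Example: list the GPU-enabled Python 3.9 variants in the sketch catalog.
    for runtime in find_runtimes(CATALOG, kernel="Python 3.9", edition="GPU"):
        print(runtime)
```

Treating each variant as a small immutable record keeps every image focused on one kernel, one editor, and one edition, which mirrors the lightweight design described above.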

Requirements for Data Science workloads differ from team to team and from use case to use case. Building one ML Runtime that serves them all would result in heavy, bloated, and inflexible environments in which resolving dependency versions would be highly challenging. To get around this, ML Runtimes support individual teams and use-case-specific workloads in isolation, enabling granular customization and providing the infrastructure needed to create, maintain, and manage them.

Authorized Data Scientists will be able to install the libraries and frameworks they need for their specific use-case in a way that doesn’t break security and compliance. 

ML Runtime Cataloging and Management

With the multitude of ML Runtimes shipped out of the box by Cloudera and customized by Data Scientists, our end users need an easy way to browse and select the right ML Runtimes for their Projects. In the near future, we will expand what is possible with ML Runtimes by releasing an intuitive Runtime Catalog that makes it easy to manage and reuse runtimes across your organization.

The new ML Runtime infrastructure will enable Multi-version Spark Support so Data Scientists can choose the version they want to use. 

ML Runtime Infrastructure

While ML Runtimes give essential flexibility and control to Data Scientists, they also bring infrastructure, scale, and security improvements. Thanks to the new architecture and the lightweight nature of the runtimes, startup performance is significantly improved for every project, especially in autoscaling environments. The small image sizes also reduce the surface area for security vulnerabilities, and the new infrastructure enables us to release and patch ML Runtimes with security fixes at a faster pace.

Learn more about Cloudera Machine Learning

Peter Ableda
Director of Product Management, Machine Learning
Santiago Giraldo

1 Comment

by Daniel Reichert on

Hi Peter,
Sounds great! Looking forward to testing it in CDP-PC.
Will the underlying Spark come prepackaged with HWC to support Hive ACID tables in the future, independent of the version used? From my perspective, this is a crucial integration nowadays, since most Data Platforms have to be GDPR compliant.

Well, we will see as soon as it's GA. 🙂
