Tag Archives: Cloudera Data Science Workbench

What’s New in Cloudera Director 2.8?

Categories: CDH, Cloudera Director

Cloudera Director 2.8 introduces a simpler way to create clusters in AWS or Microsoft Azure, one that requires less information to get started than the standard procedure. A new configuration export capability lets you retrieve a client configuration file for any cluster and use it as a starting point for creating new clusters.

Cloudera Director helps you deploy, scale, and manage Cloudera clusters in AWS, Microsoft Azure, or Google Cloud Platform.

What’s New in Cloudera Director 2.7?

Categories: Cloudera Director

Cloudera Director 2.7 introduces support for LDAP authentication, improved Java 8 support, and normalization configuration at the instance template level. The AWS plugin has also continued to improve.

Cloudera Director helps you deploy, scale, and manage Cloudera clusters in AWS, Azure, or Google Cloud Platform. Its enterprise-grade features deliver a mechanism for establishing production-ready clusters in the cloud for big-data workloads and applications in a simple, reliable, automated fashion.

Deploy Cloudera EDH Clusters Like a Boss Revamped – Part 2

Categories: CDH, Hadoop, HDFS

In Part 1: Infrastructure Considerations of this three-part revamped series on deploying clusters like a boss, we gave a general explanation of how nodes are classified, along with the disk layout configurations and network topologies to consider when deploying your clusters.

In Part 2: Service and Role Layouts, we move a step up the stack and look at the services and roles that make up your Cloudera Enterprise deployment.

New in Cloudera Data Science Workbench 1.2: Usage Monitoring for Administrators

Categories: CDH, Cloudera Data Science Workbench, Data Science, Performance

Cloudera Data Science Workbench (CDSW) provides data science teams with a self-service platform for quickly developing machine learning workloads in their preferred language, with secure access to enterprise data and simple provisioning of compute. Individuals can request schedulable resources (e.g. compute, memory, GPUs) on a shared cluster that is managed centrally.

While self-service provisioning of resources is critical to the rapid iteration cycle of data scientists, it can pose a challenge to administrators.
