What’s New in Cloudera Manager 4.5?

It has been a while since I have blogged, primarily because we have been heads-down working toward the Cloudera Manager 4.5 release that we announced yesterday!

Cloudera Manager has seen rapid adoption among enterprise customers, and as more clusters are deployed into production environments, we get more feature requests from them. We have heard our customers, and the Cloudera Manager 4.5 release aims to address many of these requests. Kudos to the engineering team for another feature-packed release.

Some key features of the CM4.5 release are as follows:

Rolling Upgrades/Restarts

Traditionally, upgrading a Hadoop/CDH cluster has been a cumbersome exercise for some customers. With Cloudera Manager 4.5, we are addressing the challenge of platform upgrades head-on: users can now download, distribute, and activate the latest release of CDH completely from within Cloudera Manager. Cloudera Manager handles all the orchestration needed to make sure the right bits are deployed on every node of the Hadoop cluster. This feature is made possible by a new packaging format that we are introducing, called “parcel”.

Coupled with the Rolling Restarts workflow, Cloudera Manager 4.5 enables the cluster to be upgraded with minimal downtime. In addition, the Rolling Restarts workflow is especially useful for propagating configuration updates for services (HDFS, HBase, MapReduce, etc.) with minimal or zero downtime, as it cycles through the entire cluster and updates the appropriate configurations on a node-by-node basis.
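To make the node-by-node idea concrete, here is a minimal sketch of a rolling restart loop. It is not Cloudera Manager's implementation or API; the function name `rolling_restart` and the `restart` and `is_healthy` callbacks are hypothetical stand-ins for whatever actually restarts a role and checks its health.

```python
from typing import Callable, Iterable, List


def rolling_restart(
    hosts: Iterable[str],
    restart: Callable[[str], None],      # hypothetical: restarts the roles on one host
    is_healthy: Callable[[str], bool],   # hypothetical: reports whether the host came back
    batch_size: int = 1,
) -> List[str]:
    """Restart hosts in small batches, halting if a batch does not recover."""
    pending = list(hosts)
    completed: List[str] = []
    for start in range(0, len(pending), batch_size):
        batch = pending[start:start + batch_size]
        for host in batch:
            restart(host)
        failed = [host for host in batch if not is_healthy(host)]
        if failed:
            # Stop here so the rest of the cluster keeps serving with its old state.
            raise RuntimeError(f"hosts unhealthy after restart: {failed}")
        completed.extend(batch)
    return completed
```

Keeping the batch size small is what bounds the downtime: only the hosts in the current batch are out of service at any moment, and a failed health check stops the rollout before it touches the remaining hosts.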

Advanced Charting

Monitoring has been one of the core value adds of Cloudera Manager. Hadoop has tons of metrics but making sense of them requires a fair degree of expertise. Cloudera Manager brings Hadoop intelligence right into the tool, and makes Cloudera’s collective experience of managing several Hadoop clusters directly available to the users.

One of the gaps we noticed in previous releases was the inability to effectively correlate the various metrics across the stack, from services to roles to hosts. Customers ended up viewing the different related metrics on separate views and then tried to make sense of all of them. With 4.5’s new charting capabilities, you can now review all the relevant metrics (services such as HDFS, HBase, and MapReduce; roles such as DataNodes, TaskTrackers, and RegionServers; and hosts) on the same page. So the next time you are experiencing a latency spike on your HBase cluster, the advanced charts will quickly help you pinpoint the problem from within a single view: a genuine example of “breaking the silos”.

The team has also delivered a fantastic set of additional capabilities that make the lives of Hadoop administrators much easier. For example, we have built a simple query language that enables (advanced) users to query the myriad metrics available within Cloudera Manager and create custom charts that are relevant to their clusters. These dashboards can be saved and shared, both within the organization and with Cloudera Support, to troubleshoot and diagnose problems more quickly.

Node Templating

This feature is a direct result of several customer requests (especially from those with several hundred Hadoop nodes). Typically, customers start off with a homogeneous set of hardware for their Hadoop clusters but over time end up acquiring newer generations of hardware with different specs (CPUs, memory, etc.). This mixed setup necessitates a more streamlined approach to managing groups of disparate hardware.

The Node Templating feature in 4.5 provides the ability to define specific role types and create custom templates based on these role types (for example: Template.Large, Template.Medium, Template.Small, etc.). These templates can then be applied to groups of hardware. This greatly simplifies the process of adding hosts and instantiating the roles that should run on those hosts.
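As a rough illustration of the idea, a template can be thought of as a named set of role types that is stamped onto every host in a group. The sketch below is only a mental model under that assumption; `HostTemplate`, `apply_template`, the role names chosen, and the host names are all hypothetical and are not Cloudera Manager's data model or API.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class HostTemplate:
    """Hypothetical template: a name plus the role types to run on each host."""
    name: str
    roles: Tuple[str, ...]


def apply_template(template: HostTemplate, hosts: Iterable[str]) -> Dict[str, List[str]]:
    """Return the roles that would be instantiated on each host in the group."""
    return {host: list(template.roles) for host in hosts}


# Example: newer, larger machines also run an HBase RegionServer.
large = HostTemplate("Template.Large", ("DATANODE", "TASKTRACKER", "REGIONSERVER"))
small = HostTemplate("Template.Small", ("DATANODE", "TASKTRACKER"))

plan = {}
plan.update(apply_template(large, ["node101", "node102"]))
plan.update(apply_template(small, ["node001"]))
```

The point of the design is that adding a new rack of identical machines becomes a single step (apply the matching template to the new hosts) rather than a per-host role assignment.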

SNMP Integration

From the beginning, we have emphasized the need for Cloudera Manager to integrate with the broader ecosystem of IT management tools in the data center. In CM3.7, we had alerts available via SMTP. In CM4.0, we added a rich set of APIs to the application to enable integration with popular tools like Zenoss and Nagios. With CM4.5, we are making it even easier to integrate with enterprise-management tools like IBM Tivoli, HP OpenView, and others via SNMP (v2 or v3).

In addition to the above core features, we have added a whole set of new capabilities around Hive management, resource management, support for auto-provisioning of clusters on Amazon EC2, automated cluster stats, and a ton of usability improvements. More details are available here.

For current Cloudera Manager users, I am sure you are looking to upgrade to 4.5 right away! For others, Cloudera Manager is the easiest and most effective way to build your Hadoop clusters. So get started!

Bala Venkatrao is a product director at Cloudera.

3 Responses
  • ptecza / March 01, 2013 / 2:17 PM

    Hello!

    Could you please explain to me how your parcels are related to the packaging system on the cluster hosts? If I understand the new feature properly, parcels are simply an independent, alternative way of distributing your software. Am I right?

    So suppose that I use Debian Squeeze on my cluster and I’ve just activated IMPALA 0.6-1.p0.109 and CDH 4.2.0-1.cdh4.2.0.p0.10. What should I do with the installed CDH .deb packages now? Should I remove them? I don’t want to both upgrade the Debian packages and download, distribute, and activate the parcels.

  • Bala / March 05, 2013 / 10:28 AM

    Ptecza,

    You are correct. Parcels are a new packaging format for CDH, essentially tarballs plus metadata, that makes it easier to have multiple versions co-exist and to perform rolling upgrades. It also provides greater flexibility with respect to where these parcels can be deployed. More information is available here:

    https://ccp.cloudera.com/display/ENT45DOC/Managing+Parcels

    While you can have the parcels deployed along with the .deb pkgs, we recommend that you remove the installed packages and restart the agents to create symlinks for binaries into the parcel.

    Bala

  • ptecza / March 07, 2013 / 2:26 AM

    Hi Bala,

    Thanks a lot for your reply! Yes, I’ve removed the .deb packages. But I had to use the “Re-run Host Upgrade Wizard” button in CM, because restarting the agents did nothing for me. After a cluster restart the parcels work fine.
