Today is an exciting day for Cloudera customers and users. With an update to our 100% open source platform and a number of new add-on products, every software component we ship is getting either a minor or major update. There’s a lot to cover and this blog post is only a summary. In the coming weeks we’ll do follow-on blog posts that go deeper into each of these releases.
We’re now supporting several hundred production Hadoop clusters. In doing so we’ve had to make a lot of advances in the functionality, reliability and manageability of the Hadoop platform. Even with these improvements, customers have traditionally been reluctant to run certain data and applications on the Apache Hadoop platform. The new products we are announcing today were designed to remove these obstacles to adoption.
Storing sensitive data in Hadoop: Cloudera Navigator 1.0. Many of our customers need to work with data that is considered sensitive, either because of internal company policies or because they operate in regulated industries. We’ve made some great strides in improving security in the Hadoop stack, but auditability and data governance have been largely lacking. No longer. Today we’re announcing the general availability of Cloudera Navigator 1.0, a data governance suite for the Hadoop stack. This first release focuses on audit and access management for all of the key objects in the Hadoop stack, from HDFS files and directories to Hive tables to HBase tables. Cloudera Navigator is also designed to work with your existing compliance systems & processes.
Entrusting business critical workloads to Hadoop: Cloudera Enterprise BDR 1.0. We’ve seen many customers who have been hesitant to trust business-critical workloads to Apache Hadoop because of the lack of a complete and consistent disaster recovery capability. Problem solved. Today we are announcing the general availability of Cloudera Enterprise BDR 1.0, a disaster recovery automation solution for the Hadoop stack. In its first version BDR provides the ability to create & maintain a remote DR cluster. BDR builds on the existing data replication primitives of the Hadoop stack but adds automation, synchronization and transparency. (Editor’s note: As of Feb. 3, 2014, BDR is no longer a separate product but rather an included feature in all Cloudera Enterprise editions.)
We’re releasing a major new version of Cloudera Manager, 4.5. It contains many enhancements including custom charting, templating, SNMP support and scheduling. One big enhancement we’ve been working on for some time is rolling upgrades. With Cloudera Manager it is now possible to patch, update, or upgrade your Hadoop cluster without downtime. The complete process is automated in Cloudera Manager, including the ability to restore old versions. Because Cloudera Manager also supports multiple clusters and multiple versions, you can automate the entire dev-test-prod lifecycle from within the system. This has big implications for improving the uptime of a Hadoop cluster. It also has big implications for the security and quality of production Hadoop clusters. Before rolling upgrades it simply wasn’t realistic to keep a production cluster up to date with the latest bug fixes and security patches.
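For readers new to the idea, here is a minimal sketch of what a rolling upgrade with rollback looks like in general. It is not how Cloudera Manager implements it: the `Node` class, the `rolling_upgrade` function, the `is_healthy` callback, and the version labels are all hypothetical names made up for illustration. The point is only that nodes are upgraded a few at a time, so the rest of the cluster keeps serving, and that old versions are restored if a health check fails.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Node:
    """Hypothetical stand-in for one cluster host."""
    name: str
    version: str
    in_service: bool = True


def rolling_upgrade(
    nodes: List[Node],
    new_version: str,
    is_healthy: Callable[[Node], bool],
    batch_size: int = 1,
) -> bool:
    """Upgrade nodes batch by batch; restore old versions on any failure.

    Returns True if every node ends up on new_version, False if the
    upgrade was rolled back.
    """
    previous = {node.name: node.version for node in nodes}
    touched: List[Node] = []

    for start in range(0, len(nodes), batch_size):
        batch = nodes[start:start + batch_size]

        # Only this batch leaves service; all other nodes keep working.
        for node in batch:
            node.in_service = False
            node.version = new_version
            touched.append(node)

        if not all(is_healthy(node) for node in batch):
            # Roll back every node changed so far, then stop.
            for node in touched:
                node.version = previous[node.name]
                node.in_service = True
            return False

        for node in batch:
            node.in_service = True

    return True


# Illustrative run: four workers, upgraded two at a time.
cluster = [Node(f"worker{i}", "old") for i in range(1, 5)]
succeeded = rolling_upgrade(cluster, "new", is_healthy=lambda n: True, batch_size=2)
```

With a batch size of two on a four-node cluster, at most half of the nodes are out of service at any moment. A smaller batch keeps more capacity online at the cost of a longer upgrade.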
Cloudera Manager Free Edition also gets an update today. Free Edition can now be used on an unlimited number of nodes. We will also add a number of new monitoring capabilities to Free Edition in the next few weeks, because users shouldn’t have to stitch together a bunch of disparate tools to run their clusters.
We are also releasing an update to our open source platform, CDH. This update, CDH 4.2, contains a number of enhancements and more than a hundred bug fixes. Key enhancements include HBase snapshots (multiple point-in-time recovery) and the ability to run a highly available JobTracker (failover will require a restart of jobs in flight). Apache Hive gets a number of enhancements, including support for decimal data types, support for user impersonation, and the addition of the HCatalog features that are now folded in as part of Hive. Impala users will benefit from all of these same enhancements. There are many smaller improvements to Flume, Sqoop and Hue, too numerous to mention here. As usual, CDH updates are designed to maintain stability and application compatibility. Users and customers may also skip CDH updates as they see fit.
Lastly, we are releasing an update to Cloudera Impala, version 0.6. Impala is still in beta but we hope to be able to remove that label shortly. In the meantime, 0.6 adds support for Avro and RCFile as well as JDBC. Impala 0.6 will require CDH 4.2.
As always we appreciate your feedback. You can download the latest updates to CDH, Impala, and Cloudera Manager here. Thanks so much for your support of Apache Hadoop and of Cloudera.