Cloudera is happy to announce the fourth beta release of Cloudera’s Distribution for Apache Hadoop version 3 — CDH3b4. As usual, we’d like to share a few highlights from this release.
Since this will be the last beta before we designate CDH3 stable, our focus for this release has been on stability, security, and scalability.
Stability and ease of use
Since we released CDH3 Beta 3 in October, we’ve deployed the distribution for production use cases at a number of customers, and have also received considerable feedback from the open source user community. In beta 4, we’ve addressed a number of important bugs and also improved usability. For example, we heard consistently that beta 3 was too finicky about permissions. In beta 4, we’ve improved the error messages shown when permissions are incorrect, and made the software automatically correct permissions errors where possible.
Security
CDH3 Beta 4 is our most secure release yet. We’ve worked over the past several months to identify and fix several vulnerabilities in Apache Hadoop, and those fixes are included in this new beta release. We encourage any users running in a secured environment to upgrade as soon as possible.
Scalability
CDH3 Beta 4 merges in many of the scalability improvements contributed by Yahoo! in their 0.20.100 branch of Apache Hadoop. This includes a reduction in the amount of memory required by the NameNode, improvements to MapReduce scheduling throughput, and more scalable RPC servers.
We’re confident that this release of Hadoop will scale to even the largest clusters and the most demanding workloads.
New Component Versions
CDH3 Beta 4 also includes new versions of many components. Highlights include:
- HBase 0.90.1, including much improved stability and operability.
- Hive 0.7.0rc0, including the beginnings of authorization support, support for multiple databases, and many other new features.
- Pig 0.8.0, including many new features like scalar types, custom partitioners, and improved UDF language support.
- Flume 0.9.3, including support for Windows and improved monitoring capabilities.
- Sqoop 1.2, including improvements to usability and Oracle integration.
- Whirr 0.3, including support for starting HBase clusters on popular cloud platforms.
For a full list of new component versions and changes, please check the CDH3 Beta 4 release notes.
Additional Platform Support
CDH3b4 also includes support for two new operating system versions: Red Hat Enterprise Linux 6 and SUSE Linux Enterprise Server 11. This is in addition to our preexisting support for Ubuntu (Lucid and Maverick), RHEL 5, and CentOS 5.
Also new with CDH3b4 is support for Apache Maven. All of the packages in CDH can now be built via Maven. By using Maven to manage dependencies we’ve greatly simplified the external requirements to deploy CDH.
CDH3 Beta 4 is available now from the CDH download page.
CDH3 Beta 4 will be the last beta release before we designate CDH3 a stable release. Thus, we don’t plan on adding any new features or upstream component versions. Instead, we’re focusing only on important bug fixes and low-risk backports.
If you have been waiting for CDH3 stable, that day is coming soon. If you are looking to get a jump on upgrading, we encourage you to install or upgrade to CDH3 Beta 4 in your test environments in anticipation of this release going stable.
As we continue our quality assurance processes at Cloudera, we also hope to hear feedback from the user community. Please join the cdh-user mailing list and let us know what you think!