CDH3 Update 1 Released

Continuing the practice we established with Cloudera’s Distribution Including Apache Hadoop v2 (CDH2), we aim to provide regular (quarterly), predictable updates to the generally available release of our open source distribution. For CDH3, the first such update is available today, approximately three months after CDH3 went GA.

For those of you who are recent Cloudera users, here is a refresher on our update policy:

  • We will only include patches in updates that do not break compatibility.
  • We will only include patches in updates that are non-disruptive.
  • You can skip updates without penalty – i.e., if you don’t find the contents of an update compelling, you can skip it and wait for a future update without having to do a delta upgrade.

There is one new addition to our update policy going forward: when it’s possible to pull features from our CDH4 roadmap into CDH3 updates in a non-disruptive way, we’ll take advantage of that opportunity.

With all that said, there are a number of improvements coming to CDH3 with update 1.  Among them are:

  1. New features – fast compression under an Apache-compatible license integrated throughout CDH, web shell for Hue, Flume / HBase integration, Fair Scheduler ACLs, improved DataNode handling of hard drive failures, and email actions and date formatting for Oozie.
  2. Improvements (stability and performance) – HBase bulk loading, NameNode stability, Fuse-DFS (mountable HDFS).
  3. New component versions – Hive 0.7.1, Pig 0.8.1, HBase 0.90.3, Flume 0.9.4 and Sqoop 1.3.
  4. Bug fixes – 80+ bug fixes. Per our standard practice, the enumerated fixes and their corresponding Apache project JIRAs are provided in the release notes.

Update 1 is available in all the usual formats (RHEL, SLES, Ubuntu, Debian packages, tarballs, and SCM Express). Check out the installation docs for instructions. If you’re running components from the Cloudera Management Suite, they will not be affected by moving to update 1. The next update (update 2) for CDH3 is planned for mid-October.

Thank you for supporting Apache Hadoop and thank you for supporting Cloudera.

6 Responses
  • Dam / July 25, 2011 / 8:08 AM

    What do you mean by Flume / HBase integration?
    Could we write into HBase tables using a Flume stream?

    Thank you.

  • tasasaki / July 26, 2011 / 5:16 PM

    I updated CDH3 to CDH3u1 on my Ubuntu 11.04 environment.
    When I ran Whirr to launch a cluster on AWS, the following error message was displayed:

    Runurl http://whirr.s3.amazonaws.com/0.3.0-cdh3u1/util/configure-hostnames not found.

    I found a workaround: I can launch a cluster with the --run-url-base option, as follows:

    $ whirr launch-cluster --config /usr/lib/whirr/recipes/hadoop-ec2.properties --run-url-base http://whirr.s3.amazonaws.com/0.3.0-cdh3u0/util

    But this is just a workaround; please put the configure-hostnames script at http://whirr.s3.amazonaws.com/0.3.0-cdh3u1/util/configure-hostnames

    Thank you in advance.

  • Ben / August 01, 2011 / 10:36 AM

    Are there any detailed change lists that highlight the differences between CDH3 update 0 and CDH3 update 1? I am comparing the hadoop-0.20.2+923.97 and hadoop-0.20.2+923.21 release notes but this is a cumbersome method of gathering this information.

  • Charles Zedlewski / August 02, 2011 / 6:09 PM

    @Dam – correct, Flume can write directly to HBase tables.

    @Tasasaki – I’m sorry you ran into this snag. The best place to get help with this is on CDH-users. You need to join this group: https://groups.google.com/a/cloudera.org/group/cdh-user/topics.

    @Ben – I agree this is not as convenient as it could be. All the patches that were added after 0.20.2 are listed here: http://archive.cloudera.com/cdh/3/hadoop-0.20.2+923.97.CHANGES.txt. They are in chronological order, so you can essentially start in March 2011 and you’ll see the scope of changes for U1. Alternatively, you can find this information in the CDH git repo.

  • Otis Gospodnetic / August 11, 2011 / 3:26 PM

    @Dam – re Flume writing to HBase – have a look at http://blog.sematext.com/2011/07/28/flume-and-hbase-integration/ for the current Flume/HBase status, the 2 available sinks, etc.

  • Yeonki / November 22, 2011 / 3:48 AM

    I downloaded a document which contained a list of Cloudera’s software stacks (http://www.cloudera.com/wp-content/uploads/2011/06/Clouderas-Distribution-including-Apache-Hadoop.pdf).

    It listed CDH’s components and their versions. The list was as follows:
    Apache Hadoop v0.20.2 + 923
    Apache Hive v0.7.0 +27
    Apache Pig v0.8.0 +20
    Apache HBase v0.90.1 +15
    Apache Zookeeper v3.3.2 +12
    Apache Whirr v0.3.0 +5
    Apache Flume v0.9.3 +15
    Apache Sqoop v1.2 + 24
    Hue v1.2.0 +54
    Oozie v2.3.0 +31

    My question is: will I get the same versions of the above components in the enterprise product if I purchase an enterprise license?
