Apache Bigtop: The "Fedora of Hadoop" is Now Built on Hadoop 2.x

Just in time for Hadoop Summit 2013, the Apache Bigtop team is very pleased to announce the release of Bigtop 0.6.0: the very first release of a fully integrated Big Data management distribution built on Hadoop 2.0.5-alpha, currently the most advanced release in the Hadoop 2.x line.

Bigtop, as many of you might already know, is a project aimed at creating a 100% open source and community-driven Big Data management distribution based on Apache Hadoop. (You can learn more about it by reading one of our previous blog posts on Apache Blogs.) Bigtop also plays an important role in CDH, which draws its packaging code from Bigtop. Cloudera takes pride in developing open source packaging code and contributing it back to the community.

The very astute readers of this blog will notice that given our quarterly release schedule, Bigtop 0.6.0 should have been called Bigtop 0.7.0. It is true that we skipped a quarter. Our excuse is that we spent all this extra time helping the Hadoop community stabilize the Hadoop 2.x code line and making it a robust kernel for all the applications that are now part of the Bigtop distribution.

And speaking of applications, we haven’t forgotten to grow the Bigtop family: Bigtop 0.6.0 adds Apache HCatalog and Apache Giraph to the mix. The full list of Hadoop applications available as part of the Bigtop 0.6.0 release is:

  • Apache Zookeeper 3.4.5
  • Apache Flume 1.3.1
  • Apache HBase 0.94.5
  • Apache Pig 0.11.1
  • Apache Hive 0.10.0
  • Apache Sqoop 2 (AKA 1.99.2)
  • Apache Oozie 3.3.2
  • Apache Whirr 0.8.2
  • Apache Mahout 0.7
  • Apache Solr (SolrCloud) 4.2.1
  • Apache Crunch (incubating) 0.5.0
  • Apache HCatalog 0.5.0
  • Apache Giraph 1.0.0
  • LinkedIn DataFu 0.0.6
  • Cloudera Hue 2.3.0

The list of supported Linux platforms has expanded to include:

  • CentOS/RHEL 5 and 6
  • Fedora 17 and 18
  • SuSE Linux Enterprise 11
  • OpenSUSE 12.2
  • Ubuntu LTS Lucid (10.04) and Precise (12.04)
  • Ubuntu Quantal (12.10)

We would like to invite everybody to give the Bigtop 0.6.0 binary distribution a try. All you have to do is to pick your favorite Linux distribution, follow our wiki instructions, and you will have your first pseudo-distributed cluster computing pi in no time.
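
For the curious, "computing pi" refers to the pi estimator that ships with the stock Hadoop examples, which, as far as we know, uses a quasi-Monte Carlo method: it scatters points over a unit square and counts how many land inside an inscribed circle. The Python sketch below only illustrates that idea on a single machine. It is not Bigtop or Hadoop code, and the function names (`halton`, `estimate_pi`) are made up for this example.

```python
def halton(index: int, base: int) -> float:
    """Return element `index` (1-based) of the Halton sequence in `base`."""
    result = 0.0
    fraction = 1.0 / base
    while index > 0:
        # Mirror the digits of `index` (written in `base`) around the radix point.
        result += (index % base) * fraction
        index //= base
        fraction /= base
    return result


def estimate_pi(num_points: int) -> float:
    """Estimate pi from the share of points falling inside a circle of radius 0.5."""
    inside = 0
    for i in range(1, num_points + 1):
        # Bases 2 and 3 give a 2-D low-discrepancy point set in the unit square,
        # shifted here so the square is centered on the origin.
        x = halton(i, 2) - 0.5
        y = halton(i, 3) - 0.5
        if x * x + y * y <= 0.25:
            inside += 1
    # circle area / square area = pi / 4
    return 4.0 * inside / num_points


if __name__ == "__main__":
    print(estimate_pi(100_000))
```

On a real cluster the point generation is split across map tasks and the counts are summed in a reducer; the arithmetic is the same.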

If you’re thinking about deploying Bigtop to a fully distributed cluster, you might find our Puppet code useful. After all, we use it all the time ourselves to test Bigtop. There is brief documentation on how to run our Puppet recipes in a masterless Puppet configuration, but a typical Puppet master setup should work as well.

Finally, Bigtop would not have been possible without the tireless work of all the volunteer developers. This is an amazing community to be part of, and if you would like to join us, now is the time. In fact, we decided to take advantage of Hadoop Summit drawing a lot of Hadoop developers to the San Francisco Bay Area and have our first meeting of the Apache Bigtop Working Group on Thursday, June 27 at Elance’s offices in Mountain View. Come join us! It is a lot of fun to build the future of Big Data management together!

Happy Big Data discoveries,

Your faithful and tireless Bigtop development team.

Roman Shaposhnik is a Software Engineer on the Infrastructure team and VP/PMC chair of the Bigtop project.