I was positively blown away by the enthusiasm, creativity, and productivity exhibited by the participants in the CDH3b2 Hackathon. We had over twenty participants from established companies like Oracle and Akamai, stealth-mode startups, and one-man consulting shops. At one point we had nine simultaneous hacking projects going, with groups of one to five people. At the end of the day, participants voted on the most interesting project, which won a prize – an iPod Nano for each participant on that project.
CDH3 beta 2 includes Apache Hive 0.5.0, the latest version of the popular open source Apache Hadoop data warehouse platform. Hive allows you to express data analysis tasks in a dialect of SQL called HiveQL, and then compiles these tasks into MapReduce jobs and executes the jobs on your Hadoop cluster. Hive is a natural entry point to Hadoop for people who have prior experience with relational databases,
This post was contributed by John Sichi, a committer on the Apache Hive project and a member of the Data Infrastructure team at Facebook.
As many readers may already know, Hive was initially developed at Facebook for dealing with explosive growth in our multi-petabyte data warehouse. Since its release as an Apache project, it has been put into use at a number of other companies for solving big data problems.
It’s official – Cloudera’s Distribution for Hadoop Version 2, which we often shorten to CDH2, has been released. CDH2 is the product we recommend to our current production customers. It’s a stable version that has spent a long time in the field with a variety of customers, in addition to going through Cloudera’s internal QA process.
And with the CDH2 release, the Cloudera engineering team is excited to start the feedback and development process for the next version of Cloudera’s Distribution for Hadoop –
In the process of working on a few things here, I wanted to add some links to launch Apache Hive and the Hadoop JobTracker. At first I considered just adding the links, but I found myself wanting a button of some sort: an icon for them. I didn’t want to just use the (awesomely cute) Apache Hadoop logo elephant because these things are related to and part of Hadoop, but they aren’t Hadoop itself…