Apache Hadoop Lab at JavaOne

Guest post by Daniel Templeton, Product Manager at Oracle.

Aside from JavaOne ’10 having a new home as part of the greater Oracle OpenWorld conference, it was business as usual this year: lots of great sessions, lots of interesting labs, and lots and lots of excited developers. (I think there may even have been more attendees than the conference planners expected.) This year, Apache Hadoop joined the ranks of the JavaOne hands-on labs with a lab co-produced by Oracle and Cloudera.

JavaOne Hands-on Lab S314413, “Extracting Real Value from Your Data With Apache Hadoop,” was a two-hour interactive lab designed to introduce attendees to the Hadoop environment. It covered writing a MapReduce program; writing a custom input reader; running, monitoring, and managing Hadoop jobs; and working with the Hive data warehousing platform. The lab was designed for participants with at least some Java programming experience, but not necessarily any prior exposure to Hadoop.

In case you missed the lab at JavaOne, Oracle and Cloudera are both making the lab materials available online. Oracle will post the materials along with the rest of the JavaOne presentations. Cloudera has already posted them in the training section of its website.

The zip file you download contains the lab workbook as a PDF in its root directory. An appendix at the back of the workbook describes how to set up your own lab environment. I highly recommend that you grab the Cloudera Distribution for Hadoop (v2) to use as the environment for the lab. Cloudera even makes a prebuilt Linux/Hadoop environment available as a virtual machine. The lab was written for Solaris 11 Express and NetBeans, but you should still be able to complete it on another OS with another IDE.

At JavaOne, the lab was very successful. Turnout was good and the comments were great! I’ve already incorporated lots of feedback from that session into the lab materials that Cloudera is now hosting, but I’m always happy to hear more. Happy coding!