Cloudera Live: The Instant Apache Hadoop Experience
Get started with Apache Hadoop and use-case examples online in just seconds.
Today, we announced Cloudera Live, a new online service for developers and analysts (currently in public beta) that makes it easy to learn, explore, and try out CDH, Cloudera’s open source software distribution containing Apache Hadoop and related projects. No downloads, no installations, no waiting — just point-and-play!
Cloudera Live is just that: a complete, live, CDH 5 cluster with a Hue interface (based on Hue 3.5.0, the latest and greatest). It includes pre-packaged examples/patterns for using Impala, Search, Apache HBase, and many other Hadoop ecosystem components. (Note: Cloudera Live is currently read-only, so loading data via the Apache Sqoop app isn’t possible. To explore CDH with ingested data, download our QuickStart VM.)
After spending some time with Cloudera Live (within a three-hour session), you may be wondering: How did we do it? As you’ll find from the answer below, the combination of Amazon Web Services (AWS) and Cloudera Manager made it easy.
Inside Cloudera Live
Cloudera Live is hosted on four AWS m3.large instances running Ubuntu 12.04 with 100GB of storage. (If you build your own cluster on AWS for personal use and need less performance, one xlarge instance will be enough; alternatively, you could install fewer services on an even smaller instance.)
We configured the security group to allow all traffic between the instances (don’t forget that on multi-machine clusters!) and opened the Cloudera Manager and Hue ports to the outside.
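The rules described above could be scripted with the AWS CLI roughly as follows. This is a minimal sketch, not the setup actually used for Cloudera Live: the group ID `sg-00000000` is a hypothetical placeholder, 7180 and 8888 are the default Cloudera Manager and Hue ports, and the helper names `run` and `open_cluster_ports` are made up for illustration. By default the script only prints the commands (`DRY_RUN=1`), so it can be reviewed before anything is changed.

```shell
#!/bin/sh
# Sketch: security group rules for a small CDH cluster.
# SG_ID is a hypothetical placeholder; replace it with your own group ID.
SG_ID="${SG_ID:-sg-00000000}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = actually run them

# Print the command in dry-run mode, execute it otherwise.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

open_cluster_ports() {
  # Allow all traffic between instances that belong to the same group.
  run aws ec2 authorize-security-group-ingress --group-id "$SG_ID" \
    --ip-permissions "IpProtocol=-1,UserIdGroupPairs=[{GroupId=$SG_ID}]"
  # Open the Cloudera Manager (7180) and Hue (8888) default ports to the outside.
  for port in 7180 8888; do
    run aws ec2 authorize-security-group-ingress --group-id "$SG_ID" \
      --protocol tcp --port "$port" --cidr 0.0.0.0/0
  done
}

open_cluster_ports
```

Running it with `DRY_RUN=0 SG_ID=<your group>` would apply the rules. Opening ports to `0.0.0.0/0` suits a public demo; for a private cluster a narrower CIDR range is the safer choice.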
We used Cloudera Manager to auto-install everything for us based on this guide. It also greatly simplified post-install monitoring and configuration.
The first step was to connect to one of the machines:
ssh -i ~/demo.pem ubuntu@ec2-11-222-333-444.compute-1.amazonaws.com
Next, we retrieved and started Cloudera Manager:
wget http://archive.cloudera.com/cm5/installer/latest/cloudera-manager-installer.bin
chmod +x cloudera-manager-installer.bin
sudo ./cloudera-manager-installer.bin
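Once the installer finishes, the Cloudera Manager web UI listens on port 7180 by default, though it can take a little while to come up. A small polling helper like the one below can tell you when it is ready. This is only a sketch: `wait_for_url` is a hypothetical helper name, and the attempt count and delay are arbitrary defaults.

```shell
#!/bin/sh
# wait_for_url URL [ATTEMPTS] [DELAY_SECONDS]
# Polls URL until it answers over HTTP, or gives up after ATTEMPTS tries.
wait_for_url() {
  url="$1"
  attempts="${2:-60}"
  delay="${3:-10}"
  i=1
  while [ "$i" -le "$attempts" ]; do
    # --silent hides progress output, --fail treats HTTP errors (>= 400)
    # as failures, and --max-time bounds each individual try.
    if curl --silent --fail --max-time 5 --output /dev/null "$url"; then
      echo "up: $url"
      return 0
    fi
    i=$((i + 1))
    # Sleep between tries, but not after the last one.
    if [ "$i" -le "$attempts" ]; then
      sleep "$delay"
    fi
  done
  echo "gave up waiting for $url" >&2
  return 1
}
```

For example, `wait_for_url "http://ec2-11-222-333-444.compute-1.amazonaws.com:7180/" 60 10` would poll for up to about ten minutes before giving up.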
After logging in with the default credentials (admin/admin), we entered the public DNS names of all our machines (such as ec2-11-222-333-444.compute-1.amazonaws.com) in the Install Wizard and clicked Go. Et voilà, Cloudera Manager set up the entire cluster automatically! And so Cloudera Live was born.
We hope you enjoy Cloudera Live, and we want your feedback whether you do or not! You can send it via the upstream Hue list, the Hue forum at cloudera.com/community, or by clicking on the “Feedback” tab in the demo itself.