Cloudera Developer Blog · Cloudera Manager Posts
I am pleased to announce the release of Cloudera Impala Beta (version 0.3) and Cloudera Manager 4.1.2. Key enhancements in each release are:
Cloudera Impala Beta (version 0.3)
I am pleased to announce the release of Cloudera Impala Beta (version 0.2) and Cloudera Manager 4.1.1. These are both enhancement releases to make bug fixes available quickly. Key enhancements in each release are:
Cloudera Impala Beta (version 0.2)
I am very pleased to announce the availability of Cloudera Manager 4.1. This release adds support for the Cloudera Impala beta release, and management and monitoring of key CDH features.
Here are the highlights of Cloudera Manager 4.1:
Axemblr, purveyors of a cloud-agnostic MapReduce Web Service, have recently announced the availability of an Apache-licensed Java Client for the Cloudera Manager API.
The task at hand, according to Axemblr, is to ”deploy Hadoop on Cloud with as little user interaction as possible. We have the code to provision the hosts but we still need to install and configure Hadoop on all nodes and make it so the user has a nice experience doing it.” And voila, the answer is Cloudera Manager, with the process made easy via the REST API introduced in Release 4.0.
Thus, says Axemblr: “In the pursuit of our greatest desire (second only to coffee early in the morning), we ended up writing a Java client for Cloudera Manager’s API. Thus we achieved to automate a CDH3 Hadoop installation on Amazon EC2 and Rackspace Cloud. We also decided to open source the client so other people can play along.”
Note (added July 8, 2013): The information below is deprecated; we suggest that you refer to this post for current instructions.
Today we bring you one user’s experience using Apache Whirr to spin up a CDH cluster in the cloud. This post was originally published here by George London (@rogueleaderr) based on his personal experiences; he has graciously allowed us to bring it to you here as well in a condensed form. (Note: the configuration described here is intended for learning/testing purposes only.)
I’m going to walk you through a (relatively) simple set of steps that will get you up and running MapReduce programs on a cloud-based, six-node distributed Apache Hadoop/Apache HBase cluster as fast as possible. This is all based on what I’ve picked up on my own, so if you know of better/faster methods, please let me know in comments!
Our video animation factory has been busy lately. The embedded player below contains our two latest ones stitched together:
Get Started with Hadoop Using Cloudera Enterprise, Part 1
What do you do at Cloudera, and in which Apache project are you involved?
For the last year and a half, I’ve been an engineer on the Enterprise team. We’re the guys who build Cloudera Manager, and all the goodies that make it easy to manage and administer Apache Hadoop clusters. Specifically, I’ve worked on a number of things across the product, like scale and performance for the databases underlying the various monitoring tools available in the Enterprise edition of Cloudera Manager. I’ve also worked extensively on our operational reporting and HDFS file search capabilities. While I don’t work full-time on any of the Apache projects, I have been known to contribute to Apache Hive and Hadoop on rainy days.
Strata Conference + Hadoop World (Oct. 23-25 in New York City) is a bonanza for Hadoop and big data enthusiasts – but not only because of the technical sessions and tutorials. It’s also an important gathering place for the developer community, most of whom are eager to share info from their experiences in the “trenches”.
Just to make that process easier, Cloudera is teaming up with local meetups during that week to organize a series of meetings on a variety of topics. (If for no other reason, stop into one of these meetups for a chance to grab a coveted Cloudera t-shirt.)
As you can see, these meetups are highly parallel, so you will either have to make careful choices or have very quick feet. The good news is: there’s something for everybody.
What’s to love about Cloudera Enterprise? A lot! But rather than bury you in documentation today, we’d rather bring you a less-than-two-minute-long video:
API access was a new feature introduced in Cloudera Manager 4.0 (download free edition here.). Although not visible in the UI, this feature is very powerful, providing programmatic access to cluster operations (such as configuration and restart) and monitoring information (such as health and metrics). This article walks through an example of setting up a 4-node HDFS and MapReduce cluster via the Cloudera Manager (CM) API.
Cloudera Manager API Basics
The CM API is an HTTP REST API, using JSON serialization. The API is served on the same host and port as the CM web UI, and does not require an extra process or extra configuration. The API supports HTTP Basic Authentication, accepting the same users and credentials as the Web UI. API users have the same privileges as they do in the web UI world.
You can read the full API documentation here.