Cloudera Blog · Cloudera Manager Posts

Axemblr’s Java Client for the Cloudera Manager API

Axemblr, purveyors of a cloud-agnostic MapReduce Web Service, have recently announced the availability of an Apache-licensed Java Client for the Cloudera Manager API.

The task at hand, according to Axemblr, is to “deploy Hadoop on Cloud with as little user interaction as possible. We have the code to provision the hosts but we still need to install and configure Hadoop on all nodes and make it so the user has a nice experience doing it.” And voila, the answer is Cloudera Manager, with the process made easy via the REST API introduced in Release 4.0.

Thus, says Axemblr: “In the pursuit of our greatest desire (second only to coffee early in the morning), we ended up writing a Java client for Cloudera Manager’s API. Thus we achieved to automate a CDH3 Hadoop installation on Amazon EC2 and Rackspace Cloud. We also decided to open source the client so other people can play along.”

How-to: Set Up an Apache Hadoop/Apache HBase Cluster on EC2 in (About) an Hour

Today we bring you one user’s experience using Apache Whirr to spin up a CDH cluster in the cloud. This post was originally published here by George London (@rogueleaderr) based on his personal experiences; he has graciously allowed us to bring it to you here as well in a condensed form. (Note: the configuration described here is intended for learning/testing purposes only.)

I’m going to walk you through a (relatively) simple set of steps that will get you up and running MapReduce programs on a cloud-based, six-node distributed Apache Hadoop/Apache HBase cluster as fast as possible. This is all based on what I’ve picked up on my own, so if you know of better/faster methods, please let me know in comments!

We’re going to run our cluster on Amazon EC2, launch it using Apache Whirr, and configure it using Cloudera Manager Free Edition. Then we’ll run some basic programs I’ve posted on GitHub that will parse data and load it into Apache HBase.

Videos: Get Started with Hadoop Using Cloudera Enterprise

Our video animation factory has been busy lately. The embedded player below contains our two latest ones stitched together:

Get Started with Hadoop Using Cloudera Enterprise, Part 1 

Meet the Engineer: Jon Natkins

In this installment of “Meet the Engineer”, meet Jonathan Natkins, also known as “Natty” to his friends and colleagues.

What do you do at Cloudera, and in which Apache project are you involved?

For the last year and a half, I’ve been an engineer on the Enterprise team. We’re the guys who build Cloudera Manager, and all the goodies that make it easy to manage and administer Apache Hadoop clusters. Specifically, I’ve worked on a number of things across the product, like scale and performance for the databases underlying the various monitoring tools available in the Enterprise edition of Cloudera Manager. I’ve also worked extensively on our operational reporting and HDFS file search capabilities. While I don’t work full-time on any of the Apache projects, I have been known to contribute to Apache Hive and Hadoop on rainy days.

Community Meetups at Strata + Hadoop World 2012

Strata Conference + Hadoop World (Oct. 23-25 in New York City) is a bonanza for Hadoop and big data enthusiasts – but not only because of the technical sessions and tutorials. It’s also an important gathering place for the developer community, most of whom are eager to share info from their experiences in the “trenches”.

Just to make that process easier, Cloudera is teaming up with local meetups during that week to organize a series of meetings on a variety of topics. (If for no other reason, stop into one of these meetups for a chance to grab a coveted Cloudera t-shirt.)

As you can see, these meetups are highly parallel, so you will either have to make careful choices or have very quick feet. The good news is: there’s something for everybody.

Cloudera Enterprise in Less Than Two Minutes

What’s to love about Cloudera Enterprise? A lot! But rather than bury you in documentation today, we’d rather bring you a less-than-two-minute-long video:

For a quick on-ramp, download Cloudera Manager Free Edition right now – free to use for up to 50 nodes with no term limit. See also the Installation Guide.

How-to: Automate Your Cluster with Cloudera Manager API

API access is a new feature introduced in Cloudera Manager 4.0 (download the free edition here). Although not visible in the UI, this feature is very powerful, providing programmatic access to cluster operations (such as configuration and restart) and monitoring information (such as health and metrics). This article walks through an example of setting up a 4-node HDFS and MapReduce cluster via the Cloudera Manager (CM) API.

Cloudera Manager API Basics

The CM API is an HTTP REST API that uses JSON serialization. It is served on the same host and port as the CM web UI, and does not require an extra process or extra configuration. The API supports HTTP Basic Authentication, accepting the same users and credentials as the web UI, and API users have the same privileges as they do in the web UI.

You can read the full API documentation here.
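
To make the basics above concrete, here is a minimal sketch in Python of how a client might prepare an authenticated request. It is not taken from the original post. The host name, the port 7180, the `/api/v1/clusters` path, and the `admin`/`admin` credentials are assumptions for illustration, and the helper names `build_cm_request` and `fetch_json` are hypothetical; check the API documentation for the endpoints your release actually exposes.

```python
import base64
import json
import urllib.request


def build_cm_request(host, port, path, username, password):
    """Build a GET request carrying an HTTP Basic Authentication header.

    Hypothetical helper: the host, port, and path are supplied by the caller
    and are not verified against any real Cloudera Manager server.
    """
    url = f"http://{host}:{port}{path}"
    # Basic Authentication is base64("username:password") in the header.
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(url)
    request.add_header("Authorization", f"Basic {token}")
    # The API uses JSON serialization, so ask for JSON back.
    request.add_header("Accept", "application/json")
    return request


def fetch_json(request, timeout=10):
    """Send the request and decode the JSON body of the response."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Assumed values: same host and port as the web UI, same credentials.
    req = build_cm_request("cm-host.example.com", 7180, "/api/v1/clusters",
                           "admin", "admin")
    print(req.full_url)
    # Uncomment when a Cloudera Manager server is reachable at that address:
    # print(fetch_json(req))
```

Because the API shares the web UI's host, port, and user accounts, the only inputs this sketch needs are the ones you already use to log in through the browser.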

Interacting with the API

Cloudera Manager 4.0: Customer Feedback and Adoption

It’s been roughly three months since we announced GA of Cloudera Manager 4.0 (CM4) and I wanted to provide an update on its adoption and feedback from customers.

For those new to it, Cloudera Manager is the first and market-leading management platform for CDH (Cloudera’s Distribution Including Apache Hadoop). Enterprise customers are coming to expect an end-to-end tool that manages the entire lifecycle of their Hadoop operations. In fact, in a recent Cloudera customer survey, an overwhelming 95% emphasized the need for this approach.

Cloudera Manager sets the standard for enterprise deployment by delivering granular visibility into and control over every part of CDH – empowering operators to improve cluster performance, enhance quality of service, increase compliance and reduce administrative costs. We also have a free edition to get started, so try it out today! (BTW, for more information on this subject, you can attend a free Webinar on Wednesday, Sept. 19, on the topic “How CBS Interactive Uses Cloudera Manager to Effectively Manage Their Hadoop Cluster”.)

Meet the Engineer: Eric Sammer

In this installment of “Meet the Engineer”, we meet with Eric Sammer (invariably known as just plain “Sammer”), Apache committer and author of the upcoming O’Reilly book, Hadoop Operations.

What do you do at Cloudera, and in which Apache project are you involved?

I’ve been lucky enough to be part of a few different teams at Cloudera since I joined. Almost three years ago, I joined Cloudera as a Solution Architect, a member of the professional services team. Most of my time was spent working with customers to build out Apache Hadoop and Apache HBase clusters, and designing data integration and processing pipelines. I also occasionally had the opportunity to fill in with the training team, teaching Cloudera’s Hadoop Developer and Administration courses to both public and private groups. There’s nothing more exciting than getting to hang out with a group of smart people and talk about Hadoop all day. I moved into a Principal Solution Architect role, spending more time in the office, working on architectural patterns and problems that repeat across customers, and working with internal teams on ways to improve CDH and Cloudera Manager.

Cloudera Manager 4.0.4 & Cloudera Manager 3.7.8 Released!

Cloudera Manager 4.0.4 and Cloudera Manager 3.7.8 are now available! These are enhancement releases for Cloudera Manager 4.x and Cloudera Manager 3.7.x respectively. Key enhancements include:

Cloudera Manager 4.0.4
