Cloudera Developer Blog · Cloudera Manager Posts

How-to: Easily Configure and Manage Clusters in Cloudera Manager 4.5

Helping users manage hundreds of configurations for the growing family of Apache Hadoop services has always been one of Cloudera Manager’s main goals. Prior to version 4.5, it was possible to set configurations at the service level (e.g. hdfs), the role-type level (e.g. all datanodes), or the individual role level (e.g. the datanode on machine17). An individual role would inherit the configurations set at the service and role-type levels, and configurations made at the role level would override those from the role-type level. While this approach offered flexibility when configuring clusters, it made configuring subsets of roles in the same way tedious.

In Cloudera Manager 4.5, this issue is addressed with the introduction of role groups. For each role type, you can create role groups and assign configurations to them. The members of those groups then inherit those configurations. For example, in a cluster with heterogeneous hardware, a datanode role group can be created for each host type and the datanodes running on those hosts can be assigned to their corresponding role group. That makes it possible to tweak the configurations for all the datanodes running on the same hardware by modifying the configurations of one role group.
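To make the inheritance idea concrete, here is a minimal Python sketch of how an effective configuration could be resolved. It is not Cloudera Manager code: the configuration keys, group names, and role names are hypothetical, and the precedence order (role override, then role group, then service) is an assumption drawn from the description above.

```python
# Hypothetical configuration layers; keys, values, and names are illustrative
# only and are not real Cloudera Manager settings.
service_config = {"block_size_mb": 128, "handler_count": 3}

# One role group per host type, as in the heterogeneous-hardware example.
role_groups = {
    "datanodes-large-hosts": {"handler_count": 10},
    "datanodes-small-hosts": {"handler_count": 4},
}

# Each datanode role belongs to one role group and may carry its own overrides.
roles = {
    "datanode-machine17": {"group": "datanodes-large-hosts", "overrides": {}},
    "datanode-machine18": {"group": "datanodes-large-hosts",
                           "overrides": {"handler_count": 12}},
    "datanode-machine42": {"group": "datanodes-small-hosts", "overrides": {}},
}


def effective_config(role_name):
    """Merge the layers so that later (more specific) layers win.

    Assumed precedence: service < role group < individual role override.
    """
    role = roles[role_name]
    merged = dict(service_config)
    merged.update(role_groups[role["group"]])
    merged.update(role["overrides"])
    return merged


if __name__ == "__main__":
    for name in sorted(roles):
        print(name, effective_config(name))
```

In this sketch, changing `role_groups["datanodes-large-hosts"]` affects every member of that group that does not carry its own override, which is the convenience that role groups are meant to provide.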

FAQ: Understanding the Parcel Binary Distribution Format

Have you ever wished you could upgrade to the latest CDH minor release with just a few mouse clicks, and even without taking any downtime on your cluster? Well, with Cloudera Manager 4.5 and its new “Parcel” feature, you can!

That release introduced many new features and capabilities related to parcels, and in this FAQ-oriented post, you will learn about most of them.

What are parcels?

How-to: Automate Your Hadoop Cluster from Java

One of the complexities of Apache Hadoop is the need to deploy clusters of servers, potentially on a regular basis. At Cloudera, which at any time maintains hundreds of test and development clusters in different configurations, this process presents a lot of operational headaches if not done in an automated fashion. In this post, I’ll describe an approach to cluster automation that works for us, as well as many of our customers and partners.

Taming Complexity

At Cloudera engineering, we have a big support matrix: We work on many versions of CDH (multiple release trains, plus things like rolling upgrade testing), CDH works across a wide variety of OS distros (RHEL 5 & 6, Ubuntu Precise & Lucid, Debian Squeeze, and SLES 11), and it supports complex configuration combinations: highly available HDFS or simple HDFS, Kerberized or non-secure, using YARN or MR1 as the execution framework, etc. Clearly, we need an easy way to spin up a new cluster that has the desired setup, which we can subsequently use for integration, testing, customer support, demos, and so on.
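To get a feel for the size of that matrix, the following Python sketch enumerates only the dimensions named above. It is an illustration, not the tooling described in the post: the variable and function names are made up, and CDH versions are left out because the text does not list them.

```python
from itertools import product

# Dimensions taken from the paragraph above; CDH versions are omitted
# because the text does not enumerate them.
os_distros = ["RHEL 5", "RHEL 6", "Ubuntu Precise", "Ubuntu Lucid",
              "Debian Squeeze", "SLES 11"]
hdfs_modes = ["highly available HDFS", "simple HDFS"]
security_modes = ["Kerberized", "non-secure"]
frameworks = ["YARN", "MR1"]


def support_matrix():
    """Return every combination of the listed dimensions as a list of tuples."""
    return list(product(os_distros, hdfs_modes, security_modes, frameworks))


if __name__ == "__main__":
    combos = support_matrix()
    print(len(combos), "combinations per CDH version")
    print(combos[0])
```

Even without the CDH version dimension, the listed options multiply to 6 × 2 × 2 × 2 = 48 distinct cluster setups, which is why provisioning them by hand does not scale.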

Customer Spotlight: Nokia’s Big Data Ecosystem Connects Cloudera, Teradata, Oracle, and Others

As Cloudera’s keeper of customer stories, it’s dawned on me that others might benefit from the information I’ve spent the past year collecting: the many use cases and deployment patterns for Hadoop amongst our customer base.

This week I’d like to highlight Nokia, a global company that we’re all familiar with as a large mobile phone provider, and whose Senior Director of Analytics – Amy O’Connor – will be speaking at tomorrow’s Cloudera Sessions event in Boston.

Cloudera Academic Partnership Program: Creating Hadoop Lovers in Universities Worldwide

Today Cloudera announced a new Cloudera Academic Partnership program, in which participating universities worldwide get access to curriculum, training, certification, and software. 

As noted in the press release, the global demand for people with Apache Hadoop and data science skills is dwarfing all supply. We consider it an important mission to help accredited universities meet that demand, by equipping them with the content and training they need to educate students in the Hadoop arts.

How-to: Use Vagrant to Set Up a Virtual Hadoop Cluster

This guest post comes to us from David Greco, CTO of Eligotech.

Vagrant is a very nice tool for programmatically managing many virtual machines (VMs) on a single physical machine. It natively supports VirtualBox and also provides plugins for VMware Fusion and Amazon EC2, supporting the management of VMs in those environments as well.

How-to: Create a CDH Cluster on Amazon EC2 via Cloudera Manager

Editor’s Note (added Feb. 28, 2014): The instructions below are deprecated for Cloudera Manager releases beyond 4.5. Please refer to this doc for instructions pertaining to releases 4.6 and later.

Cloudera Manager includes a new express installation wizard for Amazon Web Services (AWS) EC2. Its goal is to enable Cloudera Manager users to provision CDH clusters and Cloudera Impala (the open source distributed query engine for Apache Hadoop) on EC2 as easily as possible. The wizard is intended for testing and development purposes only and is not supported for production workloads. It is currently the fastest way to provision a Cloudera Manager-managed cluster in EC2.

How-to: Set Up Cloudera Manager 4.5 for Apache Hive

Last week Cloudera released version 4.5 of Cloudera Manager, the leading framework for end-to-end management of Apache Hadoop clusters. (Download Cloudera Manager here, and see install instructions here.) Among many other features, Cloudera Manager 4.5 adds support for Apache Hive. In this post, I’ll explain how to set up a Hive server for use with Cloudera Manager 4.5 (and later).

For details about other new features in this release, please see the full release notes:

What’s New in Cloudera Manager 4.5?

It has been a while since I have blogged, primarily because we have been heads-down working toward the Cloudera Manager 4.5 release that we announced yesterday!

Cloudera Manager has seen rapid adoption among enterprise customers, and the more clusters are deployed into production environments, the more feature requests we get. We have heard our customers, and the Cloudera Manager 4.5 release aims to address many of these requests. Kudos to the engineering team for another feature-packed release.

New Products and Releases: Cloudera Navigator, Cloudera Enterprise BDR, and More

Today is an exciting day for Cloudera customers and users. With an update to our 100% open source platform and a number of new add-on products, every software component we ship is getting either a minor or major update. There’s a lot to cover and this blog post is only a summary. In the coming weeks we’ll do follow-on blog posts that go deeper into each of these releases.

New Products
