Tag Archives: HDFS NameNode

How-to: Install CDH on Mac OSX 10.9 Mavericks

Categories: CDH General

This overview will cover the basic tarball setup for your Mac.

If you’re an engineer building applications on CDH and becoming familiar with all the rich features for designing the next big solution, it becomes essential to have a native Mac OSX install. Sure, you may argue that your MBP with its four-core, hyper-threaded i7, SSD, 16GB of DDR3 memory are sufficient for spinning up a VM, and in most instances —

Read more

How Apache Hadoop YARN HA Works

Categories: Hadoop YARN

Thanks to recent work upstream, YARN is now a highly available service. This post explains its architecture and configuration details.

YARN, the next-generation compute and resource management framework in Apache Hadoop, until recently had a single point of failure: the ResourceManager, which coordinates work in a YARN cluster. With planned (upgrades) or unplanned (node crashes) events, this central service, and YARN itself, could become unavailable.

This post details Cloudera’s recent work in the Hadoop community (YARN-149) to make the ResourceManager (and thus YARN) highly available.

Read more

Apache Hadoop 2 is Here and Will Transform the Ecosystem

Categories: Community Hadoop HDFS YARN

The release of Apache Hadoop 2, as announced today by the Apache Software Foundation, is an exciting one for the entire Hadoop ecosystem.

Cloudera engineers have been working hard for many months with the rest of the vast Hadoop community to ensure that Hadoop 2 is the best it can possibly be, for the users of Cloudera’s platform as well as all Hadoop users generally. Hadoop 2 contains many major advances,

Read more

Guide to Using Apache HBase Ports

Categories: Hadoop HBase

For those people new to Apache HBase (version 0.90 and later), the configuration of network ports used by the system can be a little overwhelming.

In this blog post, you will learn all the TCP ports used by the different HBase processes and how and why they are used (all in one place) — to help administrators troubleshoot and set up firewall settings, and help new developers how to debug.

A typical HBase cluster has one active master,

Read more

Introduction to Apache HBase Snapshots, Part 2: Deeper Dive

Categories: HBase

In Part 1 of this series about Apache HBase snapshots, you learned how to use the new Snapshots feature and a bit of theory behind the implementation. Now, it’s time to dive into the technical details a bit more deeply.

What is a Table?

An HBase table comprises a set of metadata information and a set of key/value pairs:

  • Table Info: A manifest file that describes the table “settings”,

Read more