Category Archives: CDH

Apache Hadoop Availability

Categories: CDH General Hadoop HDFS MapReduce

A common question on the Apache Hadoop mailing lists is what’s going on with availability? This post takes a look at availability in the context of Hadoop, gives an overview of the work in progress and where things are headed.

Background

When discussing Hadoop availability people often start with the NameNode since it is a single point of failure (SPOF) in HDFS, and most components in the Hadoop ecosystem (MapReduce,

Read More