Category Archives: ZooKeeper

Migrating to CDH

Categories: General, Hadoop, HBase, HDFS, Hive, MapReduce, Pig, ZooKeeper

With the recent release of CDH3b2, many users are more interested than ever in trying out Cloudera’s Distribution for Hadoop (CDH). One of the questions we often hear is, “What does it take to migrate?”

Why Migrate?

If you’re not familiar with CDH3b2, here’s what you need to know.

All versions of CDH provide:

  • RPM and Debian packages for simple installation and management.
  • Clean integration with the host operating system.

Read more

Building a distributed concurrent queue with Apache ZooKeeper

Categories: ZooKeeper

In my first few weeks here at Cloudera, I’ve been tasked with helping out with the Apache ZooKeeper system, part of the umbrella Hadoop project. ZooKeeper is a system for coordinating distributed processes. In a distributed environment, getting processes to act in any kind of synchrony is an extremely hard problem. For example, simply having a set of processes wait until they’ve all reached the same point in their execution –

Read more