As part of our series of announcements at the recent Hadoop Summit, Cloudera released two of its previously internal projects as open source. One of those was the HUE user interface environment, which we’ll be saying a bit more about later this week. The other was our data movement platform, Flume. We’ve been working on Flume for many months, and it’s really exciting to be able to share the details of what we’ve been doing.
In my first few weeks here at Cloudera, I’ve been tasked with helping out with the Apache ZooKeeper system, part of the umbrella Hadoop project. ZooKeeper is a system for coordinating distributed processes. In a distributed environment, getting processes to act in any kind of synchrony is an extremely hard problem. For example, simply having a set of processes wait until they’ve all reached the same point in their execution –