Our thanks to Jordan Zimmerman, software engineer at Netflix, for the guest post below about the recently announced Apache Curator (incubating) project.
Apache ZooKeeper (zookeeper.apache.org) is a client/server system for distributed coordination. On the client side, you use the client library (from Java, C/C++, etc.) to connect to the server. The client library exposes APIs that resemble a simple filesystem.
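To make the filesystem analogy concrete, here is a minimal in-memory sketch of a hierarchical namespace in Python. It is not the ZooKeeper client API: the class `ToyZnodeTree` and its method names are hypothetical and chosen only for illustration. It assumes two rules that ZooKeeper's data model also follows: a node can be created only under an existing parent, and a node with children cannot be deleted.

```python
class ToyZnodeTree:
    """Hypothetical in-memory stand-in for a znode namespace (not ZooKeeper)."""

    def __init__(self):
        # Paths map to byte payloads; the root always exists.
        self._nodes = {"/": b""}

    @staticmethod
    def _parent(path):
        # "/app/config" -> "/app", "/app" -> "/"
        head = path.rsplit("/", 1)[0]
        return head or "/"

    def create(self, path, data=b""):
        if path in self._nodes:
            raise KeyError(f"node already exists: {path}")
        if self._parent(path) not in self._nodes:
            raise KeyError(f"parent does not exist for: {path}")
        self._nodes[path] = data

    def get(self, path):
        return self._nodes[path]

    def set(self, path, data):
        if path not in self._nodes:
            raise KeyError(f"no such node: {path}")
        self._nodes[path] = data

    def get_children(self, path):
        # Direct children only: names with no further "/" after the prefix.
        prefix = path.rstrip("/") + "/"
        return sorted(
            p[len(prefix):]
            for p in self._nodes
            if p != "/" and p.startswith(prefix) and "/" not in p[len(prefix):]
        )

    def delete(self, path):
        if path == "/":
            raise ValueError("cannot delete the root")
        if self.get_children(path):
            raise ValueError(f"node has children: {path}")
        del self._nodes[path]


# Usage: nodes behave like directories that can also hold data.
tree = ToyZnodeTree()
tree.create("/app")
tree.create("/app/config", b"v1")
tree.set("/app/config", b"v2")
```

A real client additionally deals with sessions, connection loss, and retries, none of which this toy models.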
It’s widely accepted that you should never design or implement your own cryptographic algorithms but should instead use well-tested, peer-reviewed libraries. The same can be said of distributed systems: Making up your own protocols for coordinating a cluster will almost certainly result in frustration and failure.
Architecting a distributed system is not a trivial problem; it is very prone to race conditions, deadlocks, and inconsistency. Making cluster coordination fast and scalable is just as hard as making it reliable.
In this installment of “Meet the Engineer”, get to know Customer Operations Engineering Manager/Apache Sqoop committer Kathleen Ting (@kate_ting).
What do you do at Cloudera, and in what open-source projects are you involved?
I’m a support manager at Cloudera, and an Apache Sqoop committer and PMC member. I also contribute to the Apache Flume and Apache ZooKeeper mailing lists and organize and present at meetups, as well as speak at conferences,
For several good reasons, 2013 is a Happy New Year for Apache Hadoop enthusiasts.
In 2012, we saw continued progress on developing the next generation of the MapReduce processing framework (MRv2), work that will bear fruit this year. HDFS experienced major progress toward becoming a lights-out, fully enterprise-ready distributed filesystem with the addition of high availability features and increased performance. And a hint of the future of the Hadoop platform was provided with the Beta release of Cloudera Impala,
AssignmentManager is a module in the Apache HBase Master that manages the assignment of regions to RegionServers. (See HBase architecture for more information.) It ensures that all regions are assigned and that each region is assigned to exactly one RegionServer.
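The invariant described above (every region assigned, and to exactly one RegionServer) can be sketched as a simple check in Python. This is an illustration only, not HBase code: the function `check_assignments`, the region names, and the server names are all hypothetical.

```python
from collections import defaultdict


def check_assignments(regions, assignments):
    """Return (unassigned, multiply_assigned) for the given state.

    regions: iterable of region names that should be online.
    assignments: iterable of (region, server) pairs, e.g. as reported
    by each server. Both inputs are hypothetical illustration data.
    """
    hosts = defaultdict(set)
    for region, server in assignments:
        hosts[region].add(server)

    # Regions that no server claims to host.
    unassigned = sorted(r for r in regions if not hosts.get(r))
    # Regions claimed by two or more servers.
    multiply_assigned = {
        r: sorted(servers) for r, servers in hosts.items() if len(servers) > 1
    }
    return unassigned, multiply_assigned


# Usage: r2 is claimed by two servers and r3 by none.
regions = ["r1", "r2", "r3"]
assignments = [("r1", "rs-a"), ("r2", "rs-a"), ("r2", "rs-b")]
unassigned, multiply_assigned = check_assignments(regions, assignments)
```

The invariant holds only when both returned collections are empty.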
Although the AssignmentManager generally does a good job, the existing implementation does not handle assignments as well as it could. For example, if a region was assigned to two or more RegionServers,