Flavio Junqueira (PMC Chair of the Apache ZooKeeper project and a member of the Systems and Networking Group at Microsoft Research) and Benjamin Reed (PMC Member and Software Engineer at Facebook) are the co-authors of the new O’Reilly Media book ZooKeeper: Distributed Process Coordination. We had a chat with Flavio and Ben recently about the rationale for writing the book, and what it will add to the distributed systems conversation.
Apache ZooKeeper is a client/server system for distributed coordination that exposes an interface similar to a filesystem, where each node (called a znode) may contain data and a set of children. Each znode has a name and can be identified using a filesystem-like path (for example, /root-znode/sub-znode/my-znode).
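The znode data model described above can be illustrated with a small in-memory sketch. This is not the ZooKeeper client API: the `ZNode` and `ZNodeTree` classes and their method names are hypothetical, made up here only to show how a filesystem-like path resolves to a node that holds both data and children.

```python
class ZNode:
    """Hypothetical in-memory node: a name, a data payload, and named children."""

    def __init__(self, name, data=b""):
        self.name = name
        self.data = data
        self.children = {}  # child name -> ZNode


class ZNodeTree:
    """Toy model of a znode hierarchy (illustration only, no networking)."""

    def __init__(self):
        self.root = ZNode("")

    @staticmethod
    def _split(path):
        # Paths are absolute and slash-separated, e.g. /root-znode/sub-znode
        if not path.startswith("/"):
            raise ValueError("path must start with '/'")
        parts = [] if path == "/" else path[1:].split("/")
        if any(not part for part in parts):
            raise ValueError("path contains an empty component")
        return parts

    def _walk(self, parts):
        # Follow each path component down from the root.
        node = self.root
        for part in parts:
            if part not in node.children:
                raise KeyError("no such znode: " + part)
            node = node.children[part]
        return node

    def create(self, path, data=b""):
        parts = self._split(path)
        if not parts:
            raise ValueError("the root znode already exists")
        parent = self._walk(parts[:-1])  # the parent must already exist
        if parts[-1] in parent.children:
            raise FileExistsError(path)
        parent.children[parts[-1]] = ZNode(parts[-1], data)

    def get_data(self, path):
        return self._walk(self._split(path)).data

    def get_children(self, path):
        return sorted(self._walk(self._split(path)).children)


tree = ZNodeTree()
tree.create("/root-znode", b"config")
tree.create("/root-znode/sub-znode")
tree.create("/root-znode/sub-znode/my-znode", b"hello")
print(tree.get_data("/root-znode/sub-znode/my-znode"))
print(tree.get_children("/root-znode"))
```

Unlike a typical filesystem, where a path names either a file or a directory, every node in this sketch can carry data and have children at the same time, which is the property the description above attributes to znodes.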
In Apache HBase, ZooKeeper coordinates, communicates, and shares state between the Masters and RegionServers. HBase has a design policy of using ZooKeeper only for transient data (that is,
Our thanks to Jordan Zimmerman, software engineer at Netflix, for the guest post below about the recently announced Apache Curator (incubating) project.
Apache ZooKeeper (zookeeper.apache.org) is a client/server system for distributed coordination. On the client side, you use the client library (from Java, C/C++, etc.) to connect to the server. The client library exposes APIs that resemble a simple filesystem.
It’s widely accepted that you should never design or implement your own cryptographic algorithms but rather use well-tested, peer-reviewed libraries instead. The same can be said of distributed systems: Making up your own protocols for coordinating a cluster will almost certainly result in frustration and failure.
Architecting a distributed system is far from trivial: such systems are prone to race conditions, deadlocks, and inconsistency. Making cluster coordination fast and scalable is just as hard as making it reliable.
In this installment of “Meet the Engineer”, get to know Customer Operations Engineering Manager/Apache Sqoop committer Kathleen Ting (@kate_ting).
What do you do at Cloudera, and in what open-source projects are you involved?
I’m a support manager at Cloudera, and an Apache Sqoop committer and PMC member. I also contribute to the Apache Flume and Apache ZooKeeper mailing lists and organize and present at meetups, as well as speak at conferences,