Apache ZooKeeper is a core infrastructure component in the Apache Hadoop stack and is also widely used by many companies for service discovery, configuration management, and so on. Previously, ZooKeeper did not support authentication and authorization of the servers that participate in leader election and the quorum forming process; it assumed that every server listed in the ZooKeeper configuration file (zoo.cfg) was authenticated. As a result, a server listed in zoo.cfg could join the ensemble even if it was compromised, which is a serious security issue because such a server could compromise the quorum, for example by leaking sensitive data.
To address this concern, ZooKeeper has added a new feature called “server to server mutual authentication and authorization scheme based on SASL” in ZOOKEEPER-1045. The feature has been committed upstream and will be available in the upcoming ZooKeeper 3.4.10 release. It will also be included in a future stable 3.5 upstream release, as well as a future release of Cloudera’s Distribution of Hadoop (CDH).
In this blog post, we will first review how ZooKeeper servers work together and clarify concepts such as quorum, leader election, and SASL, which will help us understand how mutual authentication and authorization work between ZooKeeper servers. We will then go through the design and implementation of this new security-related feature.
Quorum and Leader Election
A ZooKeeper ensemble consists of a set of ZooKeeper servers. Every server has a role, which is one of these types: Leader, Follower, or Observer. Leader and Followers are also called Participants. The Participants will form a quorum through leader election, and the ZooKeeper ensemble will only start serving client requests after the quorum is formed.
A Participant can be in one of three possible states: Looking, Following, or Leading. A ZooKeeper server is in the Looking state when it is first started. If a quorum already exists, the server transitions to the Following state after it synchronizes its state with the elected Leader; if there is no quorum yet and the server is elected Leader, it transitions to the Leading state. If there are not enough servers to form a quorum (which can occur in cases such as a network partition), the server remains in the Looking state until a quorum can be formed, for example when the partition heals.
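To make the state transitions concrete, here is a minimal sketch of how a Looking server could settle into its next state. It assumes the default majority quorum (more than half of the Participants), and the names `State`, `has_quorum`, and `next_state` are ours for illustration; they are not ZooKeeper classes or APIs.

```python
from enum import Enum


class State(Enum):
    LOOKING = "Looking"
    FOLLOWING = "Following"
    LEADING = "Leading"


def has_quorum(reachable: int, participants: int) -> bool:
    """Assumption: a quorum is a strict majority of the Participants.

    `reachable` counts the servers that can talk to each other,
    including the server doing the counting.
    """
    return reachable > participants // 2


def next_state(reachable: int, participants: int, elected_leader: bool) -> State:
    """Pick the state a Looking server settles into after an election round."""
    if not has_quorum(reachable, participants):
        # Not enough peers (e.g. a network partition): keep looking.
        return State.LOOKING
    return State.LEADING if elected_leader else State.FOLLOWING
```

With five Participants, for example, three reachable servers are enough to form a quorum, while two servers on the minority side of a partition stay in the Looking state.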
ZooKeeper implements leader election and the quorum forming process on top of TCP sockets, which provide reliable bi-directional communication between ZooKeeper servers. Servers that join a quorum are known as quorum peers. There are two types of network sockets in use by a ZooKeeper server: one for incoming connections initiated by other ZooKeeper servers, and one for outgoing connections initiated by the server itself. Each ZooKeeper server therefore has two roles: it serves requests initiated by other servers (QuorumServer) and initiates requests to other servers (QuorumLearner).
The problem we are about to solve here is to secure the communication over these TCP sockets by authenticating and authorizing QuorumServer and QuorumLearner. More formally:
- Each QuorumServer should authenticate and authorize the requests coming from peer QuorumLearners.
- Each QuorumLearner should authenticate and authorize the requests coming from peer QuorumServers.
Now let’s see what options we have for authentication and authorization.
Authentication and Authorization: SASL and SSL
The community decided to adopt a SASL-based approach in ZOOKEEPER-1045 because ZooKeeper already supports a SASL-based authentication and authorization scheme between the ZooKeeper client and the ZooKeeper server. The SASL-based solution is battle tested in the field, well supported by major Hadoop distributions such as CDH, and integrated with software that depends on ZooKeeper (e.g. Kafka, HBase). By contrast, ZooKeeper’s support for SSL-based communication between client and server is only available starting with the 3.5.1 alpha release. The SSL solution is also less mature, because it depends on the Netty network stack, which is less battle tested in real production environments than the traditional NIO-based networking stack that ZooKeeper uses by default. Because the ZooKeeper community takes stability and compatibility very seriously, the mature, battle-tested solution was the better choice.
That said, SSL and SASL are not mutually exclusive, and having both supported would be nice. Adding SSL support between ZooKeeper servers is on the ZooKeeper upstream roadmap and is tracked by ZOOKEEPER-1000. Ultimately, it is up to ZooKeeper users to choose the best solution for their deployments based on their use cases.
SASL is a framework that defines a challenge-response protocol for exchanging authentication data between a SASL client and a SASL server. In ZooKeeper, the SASL client and SASL server are represented by QuorumAuthLearner and QuorumAuthServer respectively, which is where the challenge-response process is implemented. SASL supports various schemes including Kerberos, which we will use in the rest of this post. Please refer to this document for details on Kerberos concepts.
A basic workflow for authenticating an incoming connection request at one Quorum Peer would look like this:
Similarly, on the other Quorum Peer, there is a corresponding QuorumAuthLearner that authenticates the response coming from the previous Quorum Peer.
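To give a feel for what a challenge-response exchange looks like, here is a toy sketch. It is deliberately not SASL or Kerberos: we swap in a simple shared-secret HMAC exchange, and every name in it (`make_challenge`, `respond`, `verify`, `mutual_auth`) is hypothetical. It only illustrates the shape of the protocol, in which each side proves it holds a credential without sending the credential itself.

```python
import hashlib
import hmac
import secrets


def make_challenge() -> bytes:
    """The challenger sends a fresh random value."""
    return secrets.token_bytes(16)


def respond(secret: bytes, challenge: bytes) -> bytes:
    """The responder proves knowledge of the secret without revealing it."""
    return hmac.new(secret, challenge, hashlib.sha256).digest()


def verify(secret: bytes, challenge: bytes, response: bytes) -> bool:
    """The challenger recomputes the expected response and compares."""
    return hmac.compare_digest(respond(secret, challenge), response)


def mutual_auth(server_secret: bytes, learner_secret: bytes) -> bool:
    """Toy mutual authentication: each side challenges the other once."""
    # The server side challenges the learner side.
    c1 = make_challenge()
    if not verify(server_secret, c1, respond(learner_secret, c1)):
        return False
    # The learner side challenges the server side.
    c2 = make_challenge()
    return verify(learner_secret, c2, respond(server_secret, c2))
```

In this toy model, two peers that hold the same secret authenticate each other, and a peer with a different secret is rejected. A real deployment relies on the SASL mechanism (Kerberos in this post) rather than on a hand-rolled exchange like this one.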
Now that we can mutually authenticate Quorum Peers, let’s take a look at how authorization is implemented. As we know, every ZooKeeper server has to be listed in zoo.cfg before it can be considered part of the ensemble. So zoo.cfg essentially stores the list of servers that are authorized to join the ensemble, and it serves as the white list that ZooKeeper checks against once a server is authenticated: the server FQDN (fully.qualified.domain.name) extracted from the server principal name is compared against the server FQDNs listed in zoo.cfg.
Because ZooKeeper requires the FQDN to be encoded in principal names, there are certain cases where authorization is not supported. For example, authorization will not work if a user chooses Digest-based SASL, or deploys every ZooKeeper server with a shared Kerberos credential that does not explicitly encode a host name in the principal name.
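The white list check described above can be sketched as follows. This is a simplified illustration rather than ZooKeeper’s actual code: it assumes principals of the form `service/host@REALM`, and the function names, host names, and realm are made up for the example.

```python
from typing import Optional, Set


def host_from_principal(principal: str) -> Optional[str]:
    """Return the host part of 'service/host@REALM', or None if there is none."""
    name = principal.split("@", 1)[0]  # drop the realm
    _service, sep, host = name.partition("/")
    return host.lower() if sep and host else None


def authorize(principal: str, ensemble_hosts: Set[str]) -> Optional[bool]:
    """True or False when a host can be checked against the white list.

    Returns None when the principal carries no host name (for example a
    shared credential), in which case this check cannot be applied.
    """
    host = host_from_principal(principal)
    if host is None:
        return None
    return host in {h.lower() for h in ensemble_hosts}


# Hypothetical hosts, standing in for the servers listed in zoo.cfg.
ENSEMBLE = {"zk1.example.com", "zk2.example.com", "zk3.example.com"}
```

Here `authorize("zkquorum/zk1.example.com@EXAMPLE.COM", ENSEMBLE)` passes, a principal for a host outside the list is rejected, and a host-less principal such as `zkquorum@EXAMPLE.COM` yields `None` because there is nothing to compare.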
Previously, ZooKeeper handled connection requests between servers in a single, dedicated thread. The SASL evaluation rounds that occur during connection establishment cost non-trivial computing cycles, so that connection handling thread would be blocked until the current connection request had been authenticated through SASL. To improve performance, ZooKeeper introduces a thread pool that handles connection requests and SASL evaluation rounds concurrently.
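The difference between the two approaches can be sketched in a few lines. This is only an illustration of the idea, not ZooKeeper’s implementation: `authenticate` is a hypothetical stand-in that sleeps to mimic the cost of the SASL rounds, and the pool size of 4 is an arbitrary choice.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Iterable, List


def authenticate(conn_id: int) -> str:
    """Stand-in for the SASL evaluation rounds on one incoming connection."""
    time.sleep(0.01)  # pretend the challenge-response exchange takes a while
    return f"conn-{conn_id}: authenticated"


def handle_serially(conn_ids: Iterable[int]) -> List[str]:
    """Single-thread model: each connection waits for the previous one."""
    return [authenticate(c) for c in conn_ids]


def handle_with_pool(conn_ids: Iterable[int], workers: int = 4) -> List[str]:
    """Thread-pool model: several connections authenticate concurrently."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in the order of the input connections.
        return list(pool.map(authenticate, conn_ids))
```

Both functions produce the same results, but with the pool a slow authentication on one connection no longer holds up the others.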
ZooKeeper is a core component in many distributed systems so we care a lot about stability and compatibility. We are aware that adding such a big feature to a very stable ZooKeeper branch (3.4) might disrupt the compatibility and stability. So we provide a feature flag that can turn the feature on or off on demand. By default, the feature flag is turned off so there is no impact on users upgrading to new ZooKeeper releases that include this feature. Users who want this feature should turn on the feature flag explicitly.
We carefully designed the system such that all new code paths will only be executed if the feature flag is turned on. If the feature flag is turned off (which is the default), the new code paths will not be activated and there is reduced risk of regression.
Deployment and Administration
Non-Shared Kerberos Keytab
The non-shared Kerberos deployment strategy enables both authentication and authorization by deploying a unique keytab file on each server; each keytab encodes the FQDN of its own server. This is the recommended deployment for enabling server to server mutual authentication.
Shared Kerberos Keytab
To enable easy deployment of Kerberos credentials on a cluster of servers, we support sharing the same Kerberos credential across the entire ZooKeeper ensemble. In the shared Kerberos deployment case, every ZooKeeper server has the same Kerberos keytab file deployed. This greatly simplifies the administration and operation of Kerberos credentials for a ZooKeeper cluster. This deployment does not support authorization, because shared Kerberos keytab files do not encode the unique FQDN of each host.
Please refer to the ZooKeeper wiki page for complete examples of both shared and non-shared Kerberos deployments.
There are a few important configuration parameters introduced as part of this new feature:
- Feature flag: users can use this flag to turn the feature on or off.
- Rolling upgrade: users can use these parameters to perform a rolling upgrade.
- Service principal: configure service principal names.
For a complete list of configuration parameters and a reference deployment guide, please refer to the ZooKeeper wiki page.
Because ZooKeeper is such a key component in many software stacks, we would like to minimize the disruption during ZooKeeper upgrades. This is achieved by upgrading the ZooKeeper binaries on each server one by one, such that at any given time during the rolling upgrade there is a quorum available to serve client requests. Note that at a certain point in the process we have to upgrade the software on the leader, and restarting the current leader will trigger a new leader election round. During leader election and immediately after it (before the servers have finished synchronizing state with each other), the quorum will not be available. Thus, the goal of a rolling upgrade here is not to completely avoid the unavailability of a quorum, but rather to minimize it.
The rolling upgrade functionality in ZOOKEEPER-1045 is based on the fact that a ZooKeeper ensemble can be configured to allow a mix of servers, running both the old and the new ZooKeeper versions, to join the quorum. This allows a server running the new version of ZooKeeper to send out authentication packets without enforcing authentication of incoming requests from servers running an older version of ZooKeeper (which are unable to send out authentication packets). Please refer to the ZooKeeper wiki for the exact upgrade steps.
Note that this process requires three restarts of the leader, rather than the single restart needed if we were to upgrade the software on every server at the same time. The update-all-at-once approach makes the quorum unavailable during the entire upgrade, but that might be acceptable depending on the use cases and the trade-offs administrators would like to make. In the end, it is up to users to decide which upgrade path is best for them.
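One way to see why the rolling upgrade takes several restarts is to model which servers can still talk to each other while the ensemble is in a mixed state. The sketch below is a simplified model under our own assumptions, not ZooKeeper’s code. The three property names and the per-restart values are given as we recall them from the wiki and should be verified there; the compatibility rule in `can_connect` is an illustration only and assumes that credentials are valid whenever authentication is attempted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QuorumAuthFlags:
    enable_sasl: bool           # quorum.auth.enableSasl
    learner_require_sasl: bool  # quorum.auth.learnerRequireSasl
    server_require_sasl: bool   # quorum.auth.serverRequireSasl


# Flag values per rolling restart, as we recall them from the wiki (verify there).
OLD = QuorumAuthFlags(False, False, False)  # feature off, or an old binary
PHASES = [
    QuorumAuthFlags(True, False, False),  # restart 1: feature on, nothing sent or enforced
    QuorumAuthFlags(True, True, False),   # restart 2: send auth, do not enforce it
    QuorumAuthFlags(True, True, True),    # restart 3: enforce auth
]


def can_connect(learner: QuorumAuthFlags, server: QuorumAuthFlags) -> bool:
    """Simplified model (an assumption): can `learner` connect to `server`?"""
    sends_auth = learner.enable_sasl and learner.learner_require_sasl
    if sends_auth:
        # The receiving side must understand authentication packets.
        return server.enable_sasl
    # No auth packet is sent: fine unless the server enforces authentication.
    return not (server.enable_sasl and server.server_require_sasl)


def compatible(a: QuorumAuthFlags, b: QuorumAuthFlags) -> bool:
    """Both directions must work, since quorum peers connect to each other."""
    return can_connect(a, b) and can_connect(b, a)
```

In this model, each configuration is compatible with the one immediately before it, so servers can be restarted one at a time. Skipping a step breaks the mixed ensemble: `compatible(OLD, PHASES[1])` and `compatible(PHASES[0], PHASES[2])` are both false.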
Conclusion and Future Work
This blog post provides insights into why we want SASL mutual authentication and authorization between ZooKeeper servers as well as how the feature is designed and implemented. This feature is a great addition to ZooKeeper and hardens the security of ZooKeeper.
But we are not quite done yet! With the great Dynamic Reconfiguration feature introduced as part of the ZooKeeper 3.5 release, we need to make sure that SASL-based server to server mutual authentication works seamlessly with Dynamic Reconfiguration. The community will start working on this once we finish forward porting this feature to the 3.5 branch.
Thanks to everyone who actively participated in the discussions and code reviews of the upstream ZOOKEEPER-1045 issue. Special thanks to Patrick Hunt, who helped review and commit the patch.
Rakesh Radhakrishnan is a Software Engineer at Intel and a committer of the Apache ZooKeeper project.
Michael Han is a Software Engineer at Cloudera and a committer of the Apache ZooKeeper project.