Inside Apache Oozie HA
Oozie’s new HA qualities help cluster operators sleep well at night. Here’s how it works.
One of the big new features in CDH 5 for Apache Oozie is High Availability (HA). In designing this feature, the Oozie team at Cloudera had two main goals: 1) Don’t change the API or usage patterns, and 2) the user shouldn’t even have to know that HA is enabled. In other words, we wanted Oozie HA to be as easy and transparent as possible.
In this blog post, I’ll explain how Oozie HA works and how it achieves those goals.
What is Active-Active HA?
HA can be defined as a system without non-planned downtime, even when partial failures occur. This goal is usually achieved via redundancies and by removing single points of failure. For example, we’ve had HDFS HA for quite a while now, and, in a nutshell, it works by having two NameNodes; if the first goes down, the system “fails-over” to the second one automatically, which takes over. This setup is often called “active standby” or “hot-warm” because you have one active server and one server ready to take over if something bad happens.
For Oozie, we implemented “active-active” or “hot-hot” HA, which means that both Oozie servers are active at the same time — there is no failover. In fact, you can actually have as many active Oozie servers as you want (within reason, of course!). A nice bonus of this architecture is that you get horizontal scalability for free: the Oozie service can now have more computing power.
Oozie stores almost all its state (submitted jobs, workflow definitions, and so on) in a database. In fact, Oozie was designed to be stateless: if the Oozie server goes down, you don’t actually lose anything. Already-running jobs will actually continue running while the Oozie server is down; once the server comes back up, it will start any pending jobs and transition any workflows with finished actions. This is convenient because you don’t have to worry about losing anything when an Oozie server goes down.
As a summary, Oozie maintains in-memory locks for each job to prevent multiple threads (intra-process) from trying to process the same job at the same time. With multiple Oozie servers, we have to extend these locks to distributed locks that all the Oozie servers can share (inter-process). To that end, we used Apache ZooKeeper for the distributed locks, and to interact with ZooKeeper, we used Apache Curator, which is a set of libraries that make interacting with ZooKeeper much easier (and something you’ve read about here before).
Now that we have the distributed locks, we simply have to point all the Oozie servers at the same database. The database must support multiple concurrent connections (Postgres, MySQL, or Oracle — not Apache Derby); it should also ideally be an HA database to prevent it from now becoming the single point of failure. Below is a diagram of what this would look like:
Usually, when you use the Oozie client, REST API, or Web UI, there’s a single address to use (http://myhost:11000/oozie, for example). But now that you have multiple Oozie servers, you have multiple addresses to which users can connect — so what happens if the one they pick goes down? There are also many clients or tools that only support a single entry point for Oozie, such as the JobTracker. To fix this issue, you need to provide a single address that will round-robin between the Oozie servers. You can use a load balancer, a virtual IP address, or DNS round-robin for this purpose. As with the database, this setup technically needs to be HA as well. Below is a diagram of what it would look like with a load balancer:
Architecture: Log Streaming
Oozie has a feature where you can stream logs for a particular job from the Oozie server to the Oozie client, REST API, or Web UI. But each Oozie server has its own log file(s) on the local filesystem; unlike the information in the database discussed earlier, these log messages are not available to every Oozie server directly.
This could be a problem. Suppose you have two servers, A and B. B processed a workflow in which you’re interested, but when you went through the load balancer, it directed your request to A. If A were to look into its own logs, it wouldn’t find any messages for the workflow you asked about.
The solution is that A will ask B for any logs it has about the workflow. This is especially important because Oozie jobs are not assigned to a specific Oozie server. And not only can a workflow be processed by any Oozie server, but even different actions within the same workflow can be processed by any server! So, whenever you ask an Oozie server for logs, it has to go and ask each of the other servers for their relevant logs, and collate the messages before streaming them back to the user. The diagram below will make this more clear:
ZooKeeper is coordinating this effort so that the Oozie servers know about each other dynamically. The only downside with this solution for log streaming is that if an Oozie server goes down, its logs will obviously be unavailable until it comes back up. (We intend to address this issue in the future but it will likely require a major refactoring of how logs are stored and streamed.)
Security is very important to Cloudera’s customers, so we made sure that Oozie HA works with Kerberos. The only time that the Oozie servers actually talk to each other is for the log streaming; any other “extra” communication for HA is with ZooKeeper. The log streaming works with Kerberos and/or HTTPS if configured. Furthemore, you can enable Kerberos-backed ACLs in ZooKeeper so that only the Oozie principal can read/write/edit/etc the znodes for HA. This is especially important in preventing a malicious user or program from acquiring one of the distributed locks, which would block Oozie from ever processing the job associated with that lock, forever!
Because Oozie servers can come and go dynamically, it can be difficult to tell which Oozie servers are actually currently working together. So, we’ve added a new admin command to list the currently connected Oozie servers and their addresses:
$ oozie admin -oozie http://loadbalancer-hostname:11000/oozie -servers hostA : http://hostA:11000/oozie hostB : http://hostB:11000/oozie hostC : http://hostC:11000/oozie
The loadbalancer-hostname in the above example would be the address of either the load balancer, virtual IP address, or DNS round-robin previously mentioned.
It also turned out that Oozie HA was perhaps a little too transparent; it’s hard to tell which Oozie server processed which job at any given time. To help make it easier to tell, log messages now indicate from which server they came; for example:
2013-09-29 16:46:20,182 WARN org.apache.oozie.command.wf.ActionStartXCommand: SERVER[hostA] USER[root] GROUP[-] TOKEN APP[demo-wf] JOB[0000000-130925230553293-oozie-oozi-W] ACTION[0000000-130925230553293-oozie-oozi-W@streaming-node] [***0000000-130925230553293-oozie-oozi-W@streaming-node***]Action status=RUNNING
In the above log message, you can see that there is now a “SERVER” field.
I haven’t gone into any of the setup or configuration in this blog post because our documentation already does that a lot of detail, which you can find here. If you plan to use Kerberos, you’ll also want to take a look at this page.
While setting up Oozie HA manually isn’t too difficult (there aren’t that many configuration properties to think about), using Cloudera Manager makes this much easier. In a few clicks, it will configure and deploy the additional Oozie servers for you, plus the ZooKeeper ensemble.
I believe we’ve accomplished our goals of making Oozie HA as easy and as transparent as possible. From the user’s perspective, all that’s changed is the
host:port to use in the Oozie client (it’s now the
host:port for the load balancer); everything else is exactly the same. For cluster admins, there aresome extra setup prerequisites but nothing extraordinary, and there’s very little to configure on the Oozie end of things, especially when using Cloudera Manager.
We added the core of Oozie HA in CDH 5 Beta 1 and added Cloudera Manager support in CDH 5 Beta 2. We’re continuing to make further improvements, bug fixes, and add new features for Oozie HA that will go into CDH 5 and later versions. For those interested in keeping up with development, or even participating in the development yourself, the list of Oozie HA JIRAs can be found on this page.
Robert Kanter is a Software Engineer at Cloudera and an Oozie Committer/PMC Member.