What is Kerberos & SPNEGO?
Kerberos is an authentication protocol that provides mutual authentication and single sign-on capabilities.
SPNEGO (Simple and Protected GSSAPI Negotiation Mechanism) is a mechanism for negotiating authentication protocols between peers; one notable application is Kerberos authentication over HTTP.
What is Alfredo?
Alfredo is an Open Source Java library providing support for Kerberos HTTP SPNEGO authentication. By using Alfredo:
- Client applications can easily access HTTP resources protected with Kerberos HTTP SPNEGO
- Web applications can easily protect HTTP resources with Kerberos HTTP SPNEGO
One of Cloudera’s goals is to enable end-to-end security for anyone using Hadoop and other projects that work on top of Hadoop. Because these are Open Source Projects that are available under the Apache License, only software that has compatible license terms can be used with them.
We created Alfredo because we needed to add user authentication to Oozie. Since Hadoop already supports Kerberos authentication, adding Kerberos support to Oozie was an obvious choice. Oozie’s API is HTTP based, which made Kerberos HTTP SPNEGO an obvious choice as well. A further benefit of supporting Kerberos HTTP SPNEGO is that tools like curl and popular browsers (Firefox and Internet Explorer) work with HTTP resources protected by Alfredo.
We couldn’t find an Apache-licensed Java library providing Kerberos HTTP SPNEGO support that we could integrate with the Oozie client/server code, and Apache licensing is a requirement for the entire CDH platform. The solution, therefore, was to write the code ourselves.
Because Alfredo is a reusable component, other projects that have HTTP endpoints can also use it, such as the Hadoop web-console, HBase, and other Hadoop-based projects. Projects not related to Hadoop can also use Alfredo.
From an integration perspective, I wanted something as simple as a URL/HttpURLConnection helper for the client side and a Java Servlet Filter for the server side. This would allow existing Java client and server applications to support Kerberos HTTP SPNEGO with minimal changes.
To protect HTTP resources of a Java web application, Alfredo’s AuthenticationFilter must be deployed in front of the HTTP resources. This filter requires some minimal configuration: specifically, the name of the Kerberos principal for the service, and the keytab file where the credentials for that principal are stored.
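To make those two settings concrete, here is a minimal sketch of how an application might load and sanity-check them before handing them to a filter. It uses only the JDK and does not touch Alfredo's own API: the class name KerberosFilterSettings and the property keys kerberos.principal and kerberos.keytab are assumptions made for illustration, so check the Alfredo documentation for the real parameter names.

```java
import java.util.Properties;

/** Hypothetical holder for the two settings the filter is described as needing. */
final class KerberosFilterSettings {
    // Illustrative property names (assumptions, not Alfredo's documented keys).
    static final String PRINCIPAL_KEY = "kerberos.principal";
    static final String KEYTAB_KEY = "kerberos.keytab";

    final String principal;
    final String keytab;

    private KerberosFilterSettings(String principal, String keytab) {
        this.principal = principal;
        this.keytab = keytab;
    }

    /** Reads both settings and fails fast if either is missing or malformed. */
    static KerberosFilterSettings from(Properties props) {
        String principal = required(props, PRINCIPAL_KEY);
        String keytab = required(props, KEYTAB_KEY);
        // By convention, HTTP SPNEGO clients request a ticket for HTTP/<host>,
        // so the service principal is expected to start with "HTTP/".
        if (!principal.startsWith("HTTP/")) {
            throw new IllegalArgumentException(
                "Service principal should look like HTTP/<host>@<REALM>: " + principal);
        }
        return new KerberosFilterSettings(principal, keytab);
    }

    // Returns the trimmed value for key, or throws if it is absent or blank.
    private static String required(Properties props, String key) {
        String value = props.getProperty(key);
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalArgumentException("Missing required setting: " + key);
        }
        return value.trim();
    }
}
```

Failing fast at startup is a deliberate choice here: a wrong principal or a missing keytab path otherwise tends to surface only later, as an opaque authentication failure on the first request.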
To access the protected HTTP resources, you can use tools and applications like curl and Firefox with support for Kerberos HTTP SPNEGO.
To enable you to write a Java client application that accesses HTTP resources protected with Kerberos HTTP SPNEGO, Alfredo provides the AuthenticatedURL class. This class is a simple helper class that authenticates the user using the credentials from the OS Kerberos cache.
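For readers curious about what such a helper has to handle on the wire, the sketch below shows only the HTTP header side of the exchange: the server challenges with a 401 response carrying "WWW-Authenticate: Negotiate", and the client retries with "Authorization: Negotiate" followed by a Base64-encoded token. This is not Alfredo's API; the class name SpnegoHeaders and its methods are hypothetical, and the token is treated as opaque bytes rather than being obtained from a real Kerberos ticket cache.

```java
import java.util.Base64;

/** Hypothetical helper for the "Negotiate" headers used by HTTP SPNEGO. */
final class SpnegoHeaders {
    static final String SCHEME = "Negotiate";

    /** True if the response is a 401 asking the client to authenticate via SPNEGO. */
    static boolean isNegotiateChallenge(int status, String wwwAuthenticate) {
        if (status != 401 || wwwAuthenticate == null) {
            return false;
        }
        // Case-sensitive match for brevity; a production client should be more lenient.
        String value = wwwAuthenticate.trim();
        return value.equals(SCHEME) || value.startsWith(SCHEME + " ");
    }

    /** Builds the Authorization header value that carries the client's token. */
    static String authorizationValue(byte[] token) {
        return SCHEME + " " + Base64.getEncoder().encodeToString(token);
    }

    /** Extracts the raw token bytes from an Authorization header value. */
    static byte[] tokenFrom(String authorization) {
        String value = authorization.trim();
        if (!value.startsWith(SCHEME + " ")) {
            throw new IllegalArgumentException("Not a Negotiate header: " + authorization);
        }
        return Base64.getDecoder().decode(value.substring(SCHEME.length() + 1).trim());
    }
}
```

In a real client the token bytes would come from the Kerberos machinery (for example the JDK's GSS-API classes); hiding that step is precisely what a helper like AuthenticatedURL is for.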
In addition, Alfredo can be extended to support other authentication mechanisms via a client interface. An implementation equivalent to Hadoop’s pseudo/simple authentication is also provided with Alfredo.
We intentionally implemented Alfredo so that it does not depend on Hadoop or Oozie projects because we wanted to avoid unwanted transitive dependencies for other projects. Alfredo’s AuthenticationFilter can easily be subclassed to obtain its configuration using the configuration mechanism of whatever project is using it.
Alfredo is distributed under the Apache License 2.0.
The source code (including examples) is available at http://github.com/cloudera/alfredo
Documentation can be found at http://cloudera.github.com/alfredo
Alfredo is already available in Cloudera’s Maven repository (https://repository.cloudera.com/content/repositories/releases) with the following coordinates:
- groupId: com.cloudera.alfredo
- artifactId: alfredo
- version: 0.1.3
- type: jar
Now that Alfredo is done, I will work on my original problem, adding support for Kerberos HTTP SPNEGO to Oozie. It should only take me about a couple of hours of coding.