How-to: Make Hadoop Accessible via LDAP
Integrating Hue with LDAP can help make your secure Hadoop apps as widely consumed as possible.
Hue, the open source Web UI that makes Apache Hadoop easier to use, easily integrates with your corporation’s existing identity management systems and provides authentication mechanisms for SSO providers. So, by changing a few configuration parameters, your employees can start analyzing Big Data in their own browsers under an existing security policy.
In this blog post, you’ll learn details about the various features and capabilities available in Hue for integrating with likely the most popular authentication mechanism, LDAP. (It is also possible to authenticate Hue users via PAM, SPNEGO, OpenID, OAuth, and SAML, but those topics are for another post.)
The typical authentication scheme for Hue takes the following form:
In the above diagram, credentials are validated against the Hue database. Often it’s easier to manage identities from a central location; with the Hue LDAP integration, users can use their LDAP credentials to authenticate and inherit their existing groups transparently. There is no need to save or duplicate any employee password in Hue:
When authenticating via LDAP, Hue validates login credentials against a directory service if configured with this authentication backend:
[desktop]
  [[auth]]
    backend=desktop.auth.backend.LdapBackend
The LDAP authentication backend will automatically create users that don’t exist in Hue by default. Hue needs to import users in order to properly perform the authentication. (The password is never imported when importing users.) However, you may want to disable automatic import at times to allow logins only by a predefined list of manually imported users. For those cases, you can use the following configuration to disable automatic import:
[desktop]
  [[ldap]]
    create_users_on_login=false
The case sensitivity of the authentication process is defined in the “Case Sensitivity” section below.
There are two different ways to authenticate with a directory service through Hue:
The search-bind mechanism for authenticating will perform an ldapsearch against the directory service and bind using the found distinguished name (DN) and password provided. This is, by default, used when authenticating with LDAP. The configurations that affect this mechanism are outlined in the “LDAP Search” section below.
The direct-bind mechanism for authenticating will bind to the LDAP server using the username and password provided at login. You can choose between two options for how Hue binds:
nt_domain – Domain component for User Principal Names (UPN) in Active Directory. This Active Directory-specific idiom allows Hue to authenticate with Active Directory without having to follow LDAP references to other partitions. This typically maps to the email address of the user or the user’s ID in conjunction with the domain.
ldap_username_pattern – Provides a template for the DN that will ultimately be sent to the directory service when authenticating.
If nt_domain is provided, Hue will use a UPN to bind to the LDAP service:
[desktop]
  [[ldap]]
    nt_domain=example.com
Otherwise, the ldap_username_pattern configuration is used. (The <username> parameter will be replaced with the username provided at login):
[desktop]
  [[ldap]]
    ldap_username_pattern="uid=<username>,ou=People,DC=hue-search,DC=ent,DC=cloudera,DC=com"
Typical attributes to search for include:
To enable direct bind authentication, the search_bind_authentication configuration must be set to false:
[desktop]
  [[ldap]]
    search_bind_authentication=false
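To make the two direct-bind options concrete, here is a minimal Python sketch of how the bind identity could be derived from the login name. It is an illustration only, not Hue's actual code: the function name direct_bind_identity is hypothetical, and the example DN is made up.

```python
def direct_bind_identity(username, nt_domain=None, ldap_username_pattern=None):
    """Return the identity string a direct bind would present to the server.

    Hypothetical helper for illustration; not part of Hue.
    """
    if nt_domain:
        # Active Directory style: bind with a User Principal Name (user@domain).
        return f"{username}@{nt_domain}"
    if ldap_username_pattern:
        # Otherwise substitute the login name into the DN template.
        return ldap_username_pattern.replace("<username>", username)
    raise ValueError("set nt_domain or ldap_username_pattern for direct bind")


print(direct_bind_identity("alice", nt_domain="example.com"))
# alice@example.com
print(direct_bind_identity(
    "alice", ldap_username_pattern="uid=<username>,ou=People,dc=example,dc=com"))
# uid=alice,ou=People,dc=example,dc=com
```

Note that this sketch does no escaping of special characters in the username; a real implementation would need to handle that before building a DN.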
If an LDAP user must belong to a certain group and have a particular set of permissions, you can import this user via the Useradmin interface:
As you can see above, there are two options available when importing:
- Distinguished name – If this option is checked, the username provided must be a full distinguished name (for example: uid=hue,ou=People,dc=gethue,dc=com). Otherwise, the username provided should be a fragment of a Relative Distinguished Name (rDN). (For example, the username “hue” maps to the rDN “uid=hue”.) Hue will perform an LDAP search using the same methods and configurations as defined in the “LDAP Search” section; essentially, Hue will take the provided username and create a search filter using the user_filter and user_name_attr configurations.
- Create home directory – If this option is checked, the user’s home directory in HDFS will automatically be created when the user is imported, if it doesn’t already exist.
The case sensitivity of the search and import processes is defined in the “Case Sensitivity” section.
Groups are importable via the Useradmin interface. Then, you can add users to this group, which would provide a set of permissions (such as accessing the Impala application). This function works similarly to user importing, but has a couple of extra features.
As the above image portrays, not only can groups be discovered via DN and rDN search, but users that are members of the group and members of the group’s subordinate groups can be imported as well. Posix groups and members are automatically imported if the group found has the object class posixGroup.
Synchronizing Users and Groups
Users and groups can be synchronized with the directory service via the Useradmin interface or via a command-line utility. The images from the previous sections use the word “Sync” to indicate that when you add the name of a user or group that already exists in Hue, it will be synchronized rather than imported again. In the case of importing users for a particular group, new users will be imported and existing users will be synchronized. (Note: Users who have been deleted from the directory service will not be deleted from Hue. You can manually deactivate those users from Hue via the Useradmin interface.)
Currently, only the first name, last name, and email address are synchronized. Hue looks for the LDAP attributes givenName, sn, and mail when synchronizing. Also, the user_name_attr config is used to appropriately choose the username in Hue. For example, if user_name_attr is set to “uid”, then the uid returned by the directory service will be used as the username of the user in Hue.
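As a rough illustration of that attribute mapping, the following Python sketch turns a directory entry into the fields that get synchronized. The function name ldap_entry_to_user, the output field names, and the shape of the entry (a dict of attribute name to a list of string values) are assumptions made for this example, not Hue's internal API.

```python
def ldap_entry_to_user(entry, user_name_attr="uid"):
    """Map an LDAP entry to the user fields that are synchronized.

    Hypothetical helper; `entry` is assumed to map attribute names to
    lists of string values.
    """
    def first(attr):
        # Use the first value of the attribute, or "" if it is absent.
        values = entry.get(attr) or [""]
        return values[0]

    return {
        "username": first(user_name_attr),
        "first_name": first("givenName"),
        "last_name": first("sn"),
        "email": first("mail"),
    }


entry = {
    "uid": ["hue"],
    "givenName": ["Hue"],
    "sn": ["User"],
    "mail": ["hue@example.com"],
    "telephoneNumber": ["555-0100"],  # not synchronized, so it is ignored
}
print(ldap_entry_to_user(entry, user_name_attr="uid"))
```

Any attribute outside givenName, sn, mail, and the configured username attribute is simply dropped in this sketch, mirroring the limited set of fields described above.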
The “Sync LDAP users/groups” button in the Useradmin interface will automatically synchronize all users and groups.
Here’s a quick example of how to use the command line interface to synchronize users and groups:
<hue root>/build/env/bin/hue sync_ldap_users_and_groups
There are two configurations for restricting the search process:
user_filter – General LDAP filter to restrict the search
user_name_attr – The attribute that will be considered the username against which to search
Here is an example configuration:
[desktop]
  [[ldap]]
    [[[users]]]
      user_filter="objectClass=*"
      user_name_attr=uid
With the above configuration, the LDAP search filter will take the form:
(&(objectClass=*)(uid=<user entered username>))
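The way the two settings combine into a single filter can be sketched in a few lines of Python. This is an assumption-laden illustration rather than Hue's implementation: the function names are hypothetical, and the escaping step (per RFC 4515) is included as good practice, not because the post states that Hue does it.

```python
def escape_filter_value(value):
    """Escape characters that are special in LDAP search filters (RFC 4515)."""
    replacements = {
        "\\": r"\5c",
        "*": r"\2a",
        "(": r"\28",
        ")": r"\29",
        "\0": r"\00",
    }
    return "".join(replacements.get(ch, ch) for ch in value)


def build_search_filter(user_filter, user_name_attr, username):
    """Combine the general filter with a username match (hypothetical helper)."""
    # Wrap the general filter in parentheses if the config omitted them.
    if not user_filter.startswith("("):
        user_filter = f"({user_filter})"
    return f"(&{user_filter}({user_name_attr}={escape_filter_value(username)}))"


print(build_search_filter("objectClass=*", "uid", "hue"))
# (&(objectClass=*)(uid=hue))
```

Escaping the user-entered value keeps characters such as * or ( from changing the meaning of the filter.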
You can configure Hue to ignore the case of usernames as well as force usernames to lower case via the ignore_username_case and force_username_lowercase configurations. These two configurations should be used in conjunction with each other. This is useful when integrating with a directory service containing usernames in capital letters and UNIX usernames in lowercase letters (which is a Hadoop requirement). Here is an example of configuring them:
[desktop]
  [[ldap]]
    ignore_username_case=true
    force_username_lowercase=true
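One way to picture what these two settings do is the Python sketch below. It is only a mental model under stated assumptions: both function names are hypothetical, and Hue's real lookup logic may differ in its details.

```python
def normalize_username(username, force_username_lowercase=False):
    """Return the name under which the user would be stored (hypothetical)."""
    return username.lower() if force_username_lowercase else username


def find_existing_user(username, existing_usernames, ignore_username_case=False):
    """Look up a username, optionally ignoring case (hypothetical)."""
    # Prefer an exact match when one exists.
    if username in existing_usernames:
        return username
    if ignore_username_case:
        for existing in existing_usernames:
            if existing.lower() == username.lower():
                return existing
    return None


# A directory returns "JDOE", while the UNIX account is "jdoe".
stored = normalize_username("JDOE", force_username_lowercase=True)
print(stored)  # jdoe
print(find_existing_user("JDOE", ["jdoe"], ignore_username_case=True))  # jdoe
print(find_existing_user("JDOE", ["jdoe"]))  # None
```

With only one of the two settings enabled, a login could match a user under one spelling while storing it under another, which is why the pair is best enabled together.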
Secure communication with LDAP is provided via the SSL/TLS and StartTLS protocols. This allows Hue to validate the directory service with which it communicates. Practically speaking, if a Certificate Authority Certificate file is provided, Hue will communicate via LDAPS:
[desktop]
  [[ldap]]
    ldap_cert=/etc/hue/ca.crt
The StartTLS protocol can be used as well (step up to SSL/TLS):
[desktop]
  [[ldap]]
    use_start_tls=true
The Hue team is working hard to improve security. Upcoming LDAP features include importing nested LDAP groups and multi-domain support for Active Directory. We hope this brief overview of LDAP in Hue will help you make your system more secure and more compliant with current security standards, and open up big data analysis to many more users!
Abe Elmahrek is a Software Engineer at Cloudera.