How-to: Achieve Higher Availability for Hue

Few projects within the Apache Hadoop umbrella have as much end-user visibility as Hue, the open source Web UI that makes Hadoop easier to use. Due to the great number of potential end users, it is useful to add a degree of fault tolerance to your deployment. This how-to describes how to achieve higher availability by placing several Hue instances behind a load balancer.

Tutorial

This tutorial demonstrates how to set up high availability by:

  • Installing Hue 2.3 on two nodes in a three-node RHEL 5 cluster
  • Managing all Hue instances via Cloudera Manager
  • Load balancing using HAProxy 1.4. (In fact, any load balancer with sticky sessions should work.)

Before we begin, we suggest that you view this quick video demonstrating how to achieve HA in Hue:

Installing Hue

Hue should be installed on two of the three nodes. To have Cloudera Manager automatically install Hue, follow the “Parcel Install via Cloudera Manager” section. To install manually, follow the “Package Install” section.

Parcel Install via Cloudera Manager

For more information on Parcels, see Managing Parcels.

  1. From Cloudera Manager, click on Hosts in the menu. Then, go to the Parcels section.
  2. Find the latest CDH parcel and click Download.
  3. Once the parcel has finished downloading, click Distribute.
  4. Once the parcel has finished distributing, click Activate.

Package Install

  1. Download the yum repository RPM.
  2. Install the yum repository with sudo yum --nogpgcheck localinstall cloudera-cdh-4-0.x86_64.rpm. For more information, see Installing CDH4.
  3. Install Hue on each node with sudo yum install hue. For more information on installing Hue, see the CDH documentation.

Managing Hue through Cloudera Manager

Cloudera Manager provides management of the Hue servers on each node. Add two Hue services using the directions below. For more information on managing services, see the Cloudera Manager documentation.

  1. Go to Services -> All Services in the menu.
  2. Click Actions -> Add a Service.
  3. Select “Hue” and follow the steps on the screen. NOTE: Choose a different host for each Hue service.
  4. Ensure that the “Jobsub Examples and Templates Directory” configuration points to different directories in HDFS for each Hue service. It can be changed by going to Services -> . In the menu, go to Configuration -> View and Edit. Then, click on Hue Server. “Jobsub Examples and Templates Directory” should be at the bottom of the page.

Cloudera Manager handling two Hue services

HAProxy Installation/Configuration

  1. Download and unzip the binary distribution of HAProxy 1.4 on the node that doesn’t have Hue installed (called serverc.cloudera.com in the example).
  2. Add the following HAProxy configuration to /tmp/hahue.conf:

     

  3. Start HAProxy with haproxy -f /tmp/hahue.conf
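
A minimal sketch of what such a configuration could contain, built only from the settings this section describes: a listen section bound to 0.0.0.0:80, balance source, and two server lines with a health check every 2000 ms that mark a server down after three failures. The Python helper below renders the file text. The function name, the listen section name hue, and the global and defaults values (maxconn, timeouts) are assumptions for illustration, not values taken from the original configuration.

```python
# Hue instances from this example: (server name, "host:port").
HUE_SERVERS = [
    ("hue1", "servera.cloudera.com:8888"),
    ("hue2", "serverb.cloudera.com:8888"),
]


def render_haproxy_config(servers, bind="0.0.0.0:80", inter_ms=2000, fall=3):
    """Return HAProxy 1.4 style configuration text for the given Hue servers.

    inter_ms is the health-check interval in milliseconds; fall is the number
    of failed checks after which a server is declared down.
    """
    lines = [
        "global",
        "    daemon",
        "    # maxconn is an assumed value",
        "    maxconn 256",
        "",
        "defaults",
        "    mode http",
        "    # timeouts are assumed values",
        "    timeout connect 5000ms",
        "    timeout client 50000ms",
        "    timeout server 50000ms",
        "",
        "listen hue",
        "    bind " + bind,
        "    balance source",
    ]
    for name, address in servers:
        lines.append(
            "    server %s %s check inter %d fall %d" % (name, address, inter_ms, fall)
        )
    return "\n".join(lines) + "\n"


# Print the rendered configuration for inspection.
print(render_haproxy_config(HUE_SERVERS), end="")
```

Writing the returned string to /tmp/hahue.conf gives a file that can be passed to haproxy -f as in step 3.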

The key configuration options are balance and server in the listen section. When balance is set to source, HAProxy hashes the client's IP address to choose a server, so a given client reaches the same server on every request for as long as that server stays up. If that server goes down, the client's requests are automatically sent to another active server. This stickiness is necessary because Hue stores session information in process memory. Each server line defines one server to be used for load balancing and takes the form:

     server <name> <address>[:<port>] [<parameters>]
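
To illustrate the idea behind source-based balancing, here is a small Python simulation. It is a sketch under stated assumptions, not HAProxy's implementation: zlib.crc32 stands in for HAProxy's own hash function, and pick_server and the status dictionary are hypothetical names. It shows that a client address maps to the same server while the set of active servers is unchanged, and that the client is moved to another server when its server is marked down.

```python
import zlib

SERVERS = ["hue1", "hue2"]


def pick_server(client_ip, servers, is_up):
    """Choose a server for client_ip among the servers currently up.

    is_up maps each server name to True (healthy) or False (down).
    """
    active = [name for name in servers if is_up[name]]
    if not active:
        raise RuntimeError("no Hue server is up")
    # crc32 is a stand-in hash; HAProxy uses its own hash of the source IP.
    index = zlib.crc32(client_ip.encode("ascii")) % len(active)
    return active[index]


status = {"hue1": True, "hue2": True}
client = "10.0.0.42"

# While both servers are up, the same client always gets the same server.
first = pick_server(client, SERVERS, status)
repeat = pick_server(client, SERVERS, status)

# Mark the chosen server down; the client is sent to the remaining one.
status[first] = False
fallback = pick_server(client, SERVERS, status)

print("sticky:", first == repeat)
print("failover:", first, "->", fallback)
```

With only two servers, every client of a failed server ends up on the one remaining server. With more servers, a modulo-style mapping like this one can also move clients whose server did not fail, because the number of active servers changes.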

 

In the configuration above, the server hue1 is available at servera.cloudera.com:8888 and hue2 is available at serverb.cloudera.com:8888. Both servers have health checks every two seconds and are declared down after three failed health checks. In this example, HAProxy is configured to bind to 0.0.0.0:80. Thus, Hue should now be available at http://serverc.cloudera.com.
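
One way to confirm the setup is to request each address and see whether anything answers. The Python sketch below does that with the standard library only. The URLs come from this example; the function names is_reachable and report are hypothetical, and treating any HTTP response (including an error status) as "reachable" is an assumption of this sketch, which is looser than a real health check.

```python
import http.client
import urllib.error
import urllib.request

# Addresses from this example: the load balancer and the two Hue servers.
ENDPOINTS = {
    "load balancer": "http://serverc.cloudera.com/",
    "hue1": "http://servera.cloudera.com:8888/",
    "hue2": "http://serverb.cloudera.com:8888/",
}


def is_reachable(url, timeout=2.0):
    """Return True if an HTTP request to url receives any HTTP response."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered, even though the status was an error.
        return True
    except (urllib.error.URLError, http.client.HTTPException, OSError, ValueError):
        # DNS failure, refused connection, timeout, or malformed URL.
        return False


def report(endpoints=ENDPOINTS):
    """Return a dict mapping each endpoint name to True or False."""
    return {name: is_reachable(url) for name, url in endpoints.items()}
```

Calling report() from a machine that can reach the cluster returns one True or False per endpoint. If the load balancer entry is False while both Hue entries are True, the problem is more likely in the HAProxy configuration than in Hue.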

Conclusion

Hue can be load-balanced easily as long as each client is always directed to the same server (that is, there are “sticky” sessions). Load balancing can improve performance, but its primary goal here is high availability. (Note that for true high availability, Hue must also be configured to use a highly available database: MySQL, PostgreSQL, or Oracle Database.) In addition, multiple Hue instances can be managed easily through Cloudera Manager.

Have any suggestions? Feel free to tell us what you think through hue-user or via our new community discussion forum.

 
