Custom Hostname for Cloud Instances

Cloudera Altus Director provides the simplest way to deploy and manage Cloudera Enterprise in the cloud. It enables customers to unlock the benefits of enterprise-grade Hadoop while leveraging the flexibility, scalability, and affordability of the cloud. It integrates seamlessly with Amazon Web Services (AWS), Google Cloud Platform (GCP), and Microsoft Azure, and provides support to build custom plugins for other public or private cloud environments.

Motivation

While automating the provisioning of a cluster on the cloud using Altus Director, customers often ask how to standardize hostnames in line with internal naming conventions. Each cloud provider has a different approach to naming instances. For example, a typical Amazon EC2 private DNS name looks something like this: ip-12-34-56-78.us-west-2.compute.internal, where the name consists of a form of the private IPv4 address, the region, the service, and the internal domain. These names cannot be registered with an enterprise Active Directory due to the limitations described in Microsoft’s documentation. Furthermore, organizations often standardize hostnames in their Active Directory setup to help them classify servers based on type, purpose, location, or other factors.

In this blog post, we discuss an approach that addresses this challenge by using cloud instance metadata and an external utility to dynamically generate unique custom hostnames that comply with organizational standards. The approach also supports dynamic sizing of the cluster, which removes a common obstacle to adopting cloud-based deployments.

Approach

The solution described here exposes a REST endpoint that is invoked from Altus Director’s bootstrap script during the bootstrapping phase of each instance, with a standardized hostname prefix (for example, edh-master). The endpoint returns the next value from a monotonically increasing counter, which is appended to the hostname prefix (for example, edh-master-01) to create a unique custom hostname that is registered with the DNS. The utility that serves the REST endpoint persists the counter in a backend data store and updates it on every call.

This solution includes three significant steps:

  1. Using Altus Director’s cluster configuration file to define instance metadata
  2. Developing a utility to generate unique hostnames
  3. Using the instance bootstrap script functionality of Altus Director to invoke the utility service and configure the hostname

[Figure: custom hostname solution diagram]

Define Instance Metadata

One of the best practices when deploying an EDH cluster is to assign hostnames based on the roles deployed on the servers in the environment (for example, MASTER, GATEWAY, WORKER, and so on). Consequently, the first step is to classify and tag the instances appropriately using a “NodeType” tag in the Altus Director cluster configuration.

MASTER {
        type: r4.8xlarge
        image: "${IMAGE}"
        bootstrapScriptsPaths: ["${BOOTSTRAP_SCRIPT}"]
        tags {
            NodeType: "MASTER"
        }
}
HBASE-MASTER {
        type: r4.8xlarge
        image: "${IMAGE}"
        bootstrapScriptsPaths: ["${BOOTSTRAP_SCRIPT}"]
        tags {
            NodeType: "HBASE-MASTER"
        }
}
GATEWAY {
        type: c4.2xlarge
        image: "${IMAGE}"
        bootstrapScriptsPaths: ["${BOOTSTRAP_SCRIPT}"]
        tags {
            NodeType: "GATEWAY"
        }
}
WORKER {
        type: r4.8xlarge
        image: "${IMAGE}"
        bootstrapScriptsPaths: ["${BOOTSTRAP_SCRIPT}"]
        tags {
            NodeType: "WORKER"
        }
}

 

REST Utility Design

The utility, implemented in Python with Flask, stores and increments the number of invocations for each hostname prefix. It persists this information in a data store and serves it via a REST endpoint. The persisted count determines the starting value for subsequent calls for a given hostname prefix.

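import json
import os
from flask import Flask

app = Flask(__name__)
file_name = "counters.json"  # assumed location of the JSON data store

def save_obj(obj, path):
    # Assumed helper: persist the counters as JSON
    with open(path, "w") as f:
        json.dump(obj, f)

urlMap = {}
if os.path.exists(file_name):  # reload previously persisted counters
    with open(file_name) as f:
        urlMap = json.load(f)

@app.route("/api/<prefix>")
def main(prefix):
    # A new prefix starts from 0; every call increments and persists the counter
    urlMap[prefix] = urlMap.get(prefix, 0) + 1
    save_obj(urlMap, file_name)
    # Two-digit, zero-padded value, so that edh-master becomes edh-master-01
    return '{0:02d}'.format(urlMap[prefix])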

For example, if Altus Director is bootstrapping three master instances, each of them will invoke the endpoint as http://<rest-server>/api/edh-master. In response, each instance receives a unique value from 1 to 3. The returned value is then appended to the hostname prefix to form edh-master-01, edh-master-02, and edh-master-03.

If the endpoint is invoked by another instance with a different prefix based on its role in the cluster configuration, edh-worker for example, the utility will maintain a separate counter and return the corresponding value to form edh-worker-01, edh-worker-02, and so on.

This implementation uses a JSON file for storing the counters. For the prefixes stored in the example below, the next call for edh-master would return 4, edh-hmaster would return 3, edh-worker would return 5, and edh-edge would return 2. During the initial cluster build, each counter starts at 0 and is incremented on every call for its prefix.

{
  "edh-master": 3,
  "edh-hmaster": 2, 
  "edh-worker": 4, 
  "edh-edge": 1
}
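Because several instances may call the endpoint at nearly the same time during bootstrapping, the read-increment-write cycle on the JSON file needs some protection. The following is a minimal sketch of one way to do that, not the utility's actual code: the CounterStore class, its method names, and the counters.json file name are hypothetical, and the lock only protects callers inside a single Python process.

```python
import json
import os
import tempfile
import threading


class CounterStore:
    """Per-prefix counters persisted to a JSON file (hypothetical helper)."""

    def __init__(self, path):
        self.path = path
        self._lock = threading.Lock()
        self._counters = {}
        if os.path.exists(path):
            with open(path) as f:
                self._counters = json.load(f)

    def next_value(self, prefix):
        # Hold the lock across read-increment-write so that two concurrent
        # requests in the same process never receive the same number.
        with self._lock:
            value = self._counters.get(prefix, 0) + 1
            self._counters[prefix] = value
            self._persist()
            return value

    def _persist(self):
        # Write to a temporary file and rename it over the old one, so a
        # crash mid-write does not leave a truncated JSON file behind.
        directory = os.path.dirname(os.path.abspath(self.path))
        fd, tmp_path = tempfile.mkstemp(dir=directory)
        with os.fdopen(fd, "w") as f:
            json.dump(self._counters, f)
        os.replace(tmp_path, self.path)


def demo():
    # Seed a store with the counters from the example above and ask for
    # the next value of each prefix.
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "counters.json")
        with open(path, "w") as f:
            json.dump({"edh-master": 3, "edh-hmaster": 2,
                       "edh-worker": 4, "edh-edge": 1}, f)
        store = CounterStore(path)
        prefixes = ("edh-master", "edh-hmaster", "edh-worker", "edh-edge")
        return {p: store.next_value(p) for p in prefixes}
```

If the service runs as several worker processes, an in-process lock is not enough; a file lock or a database with atomic increments would be needed instead.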

Bootstrap Script

In Altus Director’s bootstrap script, the cloud-specific CLI tools (the AWS CLI, gcloud CLI, Azure CLI, or another) are installed and used to query the instance metadata. The instance metadata is used to identify the value associated with the “NodeType” tag. Based on the NodeType value, the respective REST endpoint is invoked with the hostname prefix. The hostname prefix and the value returned from the REST API call are then combined to form a unique hostname that complies with the organization’s naming standards.

A sample mapping of the “NodeType” instance tag and its corresponding Hostname prefix/endpoint is provided below.

Node Type       Hostname Prefix   REST Endpoint
MASTER          edh-master        /api/edh-master
HBASE-MASTER    edh-hmaster       /api/edh-hmaster
GATEWAY         edh-gateway       /api/edh-gateway
WORKER          edh-worker        /api/edh-worker
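The mapping above can also be kept as data rather than as a chain of conditionals. The sketch below is illustrative only: the names PREFIX_BY_NODE_TYPE, DEFAULT_PREFIX, and endpoint_for are hypothetical, and the edh-other fallback mirrors the final branch of the bootstrap script shown later in this post.

```python
# NodeType tag value -> hostname prefix, taken from the sample mapping above.
PREFIX_BY_NODE_TYPE = {
    "MASTER": "edh-master",
    "HBASE-MASTER": "edh-hmaster",
    "GATEWAY": "edh-gateway",
    "WORKER": "edh-worker",
}

# Unknown or missing node types fall back to a generic prefix.
DEFAULT_PREFIX = "edh-other"


def endpoint_for(node_type):
    """Return (hostname prefix, REST endpoint path) for a NodeType tag value."""
    prefix = PREFIX_BY_NODE_TYPE.get(node_type, DEFAULT_PREFIX)
    return prefix, "/api/" + prefix
```

Adding a new role then only requires a new dictionary entry, for example a Kafka node type mapped to its own prefix.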

The code snippet below is part of a bootstrap script that was used for a deployment on AWS. The instance ID and availability zone are read from the EC2 instance metadata service, and the AWS CLI is then used to retrieve the instance tags that were defined in the cluster configuration. Refer to the AWS documentation for more information on the metadata retrieval steps used.

## Gather Instance information
INSTANCE_ID=$(curl -s http://169.254.169.254/latest/meta-data/instance-id)
AVAILABILITY_ZONE=$(curl -s http://169.254.169.254/latest/meta-data/placement/availability-zone)
AWS_DEFAULT_REGION=${AVAILABILITY_ZONE::-1}
AWS_REGION=${AVAILABILITY_ZONE::-1}
DOMAIN=cloudera.com


## Install AWS CLI
curl -s -O https://bootstrap.pypa.io/get-pip.py
sudo python get-pip.py
sudo pip install --quiet awscli
## Retrieving NodeType tag that was associated in the Director cluster conf
NODE_TYPE=$(aws ec2 describe-instances --instance-ids ${INSTANCE_ID} --region ${AWS_REGION} --output text| grep NodeType | awk -F' ' '{print $3}')

## Identify Hostname prefix based on NodeType
if [[ "${NODE_TYPE}" == "MASTER" ]]; then
    PREFIX=edh-master
elif [[ "${NODE_TYPE}" == "HBASE-MASTER" ]]; then
    PREFIX=edh-hmaster
elif [[ "${NODE_TYPE}" == "GATEWAY" ]]; then
    PREFIX=edh-gateway
elif [[ "${NODE_TYPE}" == "WORKER"  ]]; then
    PREFIX=edh-worker
else
    PREFIX=edh-other
fi


## Base URL of the REST utility; the /api/<prefix> path is appended below
REST_URI=http://kv-server
## Invoking REST Endpoint to obtain a unique number for the hostname
s_number=$(curl -s "${REST_URI}/api/${PREFIX}")
custom_hostname=${PREFIX}-${s_number}.${DOMAIN}

echo "${custom_hostname}" > /etc/hostname
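The string handling in the script can be sketched in Python to make each step explicit. This is an illustration under stated assumptions, not part of the original deployment: the function names are hypothetical, region_from_az assumes standard zone names such as us-west-2a, and node_type_from_tags assumes the JSON printed by the aws ec2 describe-tags command with --output json (a top-level "Tags" list of Key/Value entries) rather than the grep and awk pipeline used above.

```python
import json


def region_from_az(availability_zone):
    # "us-west-2a" -> "us-west-2": drop the trailing zone letter,
    # like ${AVAILABILITY_ZONE::-1} in the script.
    return availability_zone[:-1]


def node_type_from_tags(describe_tags_json, key="NodeType"):
    # Look up one tag by key instead of relying on column positions
    # in text output.
    for tag in json.loads(describe_tags_json).get("Tags", []):
        if tag.get("Key") == key:
            return tag.get("Value")
    return None  # the instance carries no such tag


def build_hostname(prefix, number, domain):
    # Same shape as ${PREFIX}-${s_number}.${DOMAIN} in the script.
    return "{0}-{1}.{2}".format(prefix, number, domain)
```

Returning None for a missing tag lets the caller decide on a fallback prefix explicitly, which is what the final else branch of the script does with edh-other.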

 

Fine Print

The code and examples provided above are for AWS, but the solution could also be implemented for Azure and Google Cloud clusters with necessary modifications. For dynamic cluster sizing, the right starting value for each hostname prefix should be updated in the data store. With unique Node Type/Hostname mappings, the solution can be extended to handle additional roles (Kafka, CDSW, and so on) as well as multiple groupings of similar roles in any high availability/multi-cluster deployment scenario.
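One possible way to work out the right starting values, sketched here as an assumption rather than as part of the original utility, is to derive them from the hostnames already in use. The function name counters_from_hostnames is hypothetical, and the sketch assumes names of the form <prefix>-<number>, optionally followed by a domain.

```python
import re

# Matches "<prefix>-<number>", for example "edh-worker-04".
_HOSTNAME_RE = re.compile(r"^(?P<prefix>.+)-(?P<number>\d+)$")


def counters_from_hostnames(hostnames):
    """Return {prefix: highest numeric suffix in use} for existing hosts."""
    counters = {}
    for name in hostnames:
        short_name = name.split(".", 1)[0]  # ignore any domain part
        match = _HOSTNAME_RE.match(short_name)
        if not match:
            continue  # not in <prefix>-<number> form; skip it
        prefix = match.group("prefix")
        number = int(match.group("number"))
        counters[prefix] = max(counters.get(prefix, 0), number)
    return counters
```

The resulting dictionary has the same shape as the JSON counter file, so it could be written to the data store before the cluster is grown.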

The REST utility is not a built-in feature of Altus Director. The same approach can be adapted to run from cloud-init scripts or within any multi-threaded application.

Avinash Desireddy is a Senior Solutions Consultant at Cloudera.
Arvind Rajagopal is a Solutions Architect at Cloudera.
