This new feature gives Hadoop admins a capability that has long been commonplace elsewhere: replacing failed DataNode drives without unscheduled downtime.

Hot swapping—the process of replacing system components without shutting down the system—is a common and important operation in modern, production-ready systems. Because disk failures are common in data centers, the ability to hot-swap hard drives is a supported feature in hardware and in server operating systems such as Linux and Windows Server, and sysadmins routinely upgrade servers or replace faulty components without interrupting business-critical services.

Even so, historically, decommissioning an individual drive on an HDFS DataNode has not been possible in Apache Hadoop. Instead, to replace a bad drive or upgrade it with a larger one, the user had to decommission the entire DataNode. That process could take hours to complete, significantly affecting system availability and operational effectiveness.

Happily, in Apache Hadoop 2.6, HDFS has added support for hot-swapping disks on a DataNode; this feature was developed by Cloudera committers under the umbrella JIRA HDFS-1362. And starting in release 5.4.0, DataNode hot-swapping is supported in Cloudera Enterprise.

To ensure that disk changes persist across DataNode restarts, the hot-swap drive feature is built on the reconfigurable framework introduced in HADOOP-7001, which enables HDFS daemons to change their configurations without a restart. As a result, adding and/or removing disks on a DataNode involves two steps:

  1. Modifying the dfs.datanode.data.dir property in the DataNode configuration to reflect the disk changes
  2. Asking the DataNode to reload its configuration and apply the configuration changes
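As a sketch of the first step, the configuration edit can be scripted. The `update_data_dirs` helper below is hypothetical, and its sed expression assumes the `<name>` and `<value>` elements sit on adjacent lines, as in the examples later in this post:

```shell
# Hypothetical helper: rewrite the dfs.datanode.data.dir value in hdfs-site.xml.
# Assumes the <name> and <value> elements appear on adjacent lines.
update_data_dirs() {
  local file=$1 new_value=$2
  # On the line after the matching <name>, replace the <value> contents.
  sed -i "/<name>dfs.datanode.data.dir<\/name>/{n;s|<value>.*</value>|<value>${new_value}</value>|;}" "$file"
}

# Demonstration on a scratch copy (the real file would be hdfs-site.xml):
cat > /tmp/hdfs-site.xml <<'EOF'
<property>
  <name>dfs.datanode.data.dir</name>
  <value>/dfs/dn1,/dfs/dn2</value>
</property>
EOF
update_data_dirs /tmp/hdfs-site.xml "/dfs/dn1"
grep '<value>' /tmp/hdfs-site.xml   # now lists only /dfs/dn1
```

The second step is triggered with hdfs dfsadmin, as shown in the walkthrough below.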

It’s worth mentioning that when adding new disks, the DataNode must have write access to create any data directory that does not already exist. Moreover, because we anticipate that the most common use case for the hot-swap disk feature is replacing bad disks, the hot-swapping procedure does not attempt to move blocks to other DataNodes, as is done when decommissioning a node. That being said, it is advisable to remove no more than N-1 disks simultaneously, where N is the system-wide replication factor. We also recommend running hdfs fsck after swapping a disk out.
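A post-swap health check might look like the following sketch; the path and options are one reasonable choice, and it must be run against a live cluster:

```shell
# Sketch: after the swap, scan the namespace and surface any block problems.
hdfs fsck / -blocks | grep -E 'Under-replicated|Missing|CORRUPT'
```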

In the future, adding a -safe flag to the reconfig command will advise the DataNode to relocate the blocks before swapping out the volume.

How to Hot Swap

Assume that you have a DataNode (dn1.example.com) that currently manages two data volumes (/dfs/dn1 and /dfs/dn2). If the disk backing /dfs/dn2 fails, the sysadmin will want to replace the bad drive with a new one, mounted on the same directory, /dfs/dn2.
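Before swapping, it can help to confirm which device backs the failing mount point; this is a generic sketch, and the device names will vary per server:

```shell
# Sketch: identify the device behind each data directory and look for
# kernel-level I/O errors that point at the failing drive.
df -h /dfs/dn1 /dfs/dn2
dmesg | grep -i 'i/o error'
```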

From a user’s perspective, there are two ways to achieve this task:

From the Command Line
  1. Remove /dfs/dn2 from the DataNode configuration file (i.e., hdfs-site.xml).
    <property>
      <name>dfs.datanode.data.dir</name>
      <value>/dfs/dn1</value>
    </property>
  2. Ask the DataNode to reload its configurations and wait for this reconfiguration task to finish.
    $ hdfs dfsadmin -reconfig datanode dn1.example.com:50020 start
    
    # query the status of the reconfiguration task
    $ hdfs dfsadmin -reconfig datanode dn1.example.com:50020 status
    
    # Once the first line of the output reads as follows, the reconfiguration task has completed.
    Reconfiguring status for DataNode[dn1.example.com:50020]: started at Tue Feb 10 15:09:52 PST 2015 and finished at Tue Feb 10 15:09:53 PST 2015.
    …
    SUCCESS: Change property dfs.datanode.data.dir
         From: "file:///dfs/dn1,file:///dfs/dn2"
         To: "file:///dfs/dn1"
    ...
    
  3. Unmount the disk from /dfs/dn2. Mount the new disk on /dfs/dn2.
  4. Add /dfs/dn2 back to the DataNode configuration file (hdfs-site.xml).
    <property>
      <name>dfs.datanode.data.dir</name>
      <value>/dfs/dn1,/dfs/dn2</value>
    </property>
  5. Run Step 2 again.
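Putting steps 2 through 5 together, an end-to-end session might look like the following sketch; the device name and filesystem type are assumptions that will differ per server, and the polling loop is a convenience, not part of the feature itself:

```shell
# Sketch of steps 2-5 end to end (runs against a live cluster).
DN=dn1.example.com:50020

reconfig_and_wait() {
  hdfs dfsadmin -reconfig datanode "$DN" start
  # Poll until the DataNode reports that the reconfiguration task finished.
  until hdfs dfsadmin -reconfig datanode "$DN" status | grep -q 'finished at'; do
    sleep 2
  done
}

reconfig_and_wait                    # step 2: drop /dfs/dn2 from service
umount /dfs/dn2                      # step 3: detach the failed disk...
mkfs -t ext4 /dev/sdc1               #         format the replacement drive (example device)
mount -t ext4 /dev/sdc1 /dfs/dn2     #         ...and mount the new one at the same path
chown -R hdfs:hdfs /dfs/dn2          # the DataNode needs write access here
# (step 4: add /dfs/dn2 back to hdfs-site.xml, then:)
reconfig_and_wait                    # step 5: bring the new disk into service
```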
From Cloudera Manager

To make this process even easier, the DataNode Hot Swap Disk feature is supported in Cloudera Manager as of 5.4.0. To use it, follow these steps:

  1. Navigate to the Configuration page for the DataNode role in question.
  2. Edit the DataNode Data Directory configuration property: remove /dfs/dn2. Click “Save Changes.” Warning: Be sure to modify this property only for the DataNode role whose disk you are swapping. Do not modify the role group value for this property, as that will affect all DataNode roles in the group.

  3. Select “Refresh Data Directories” from the Actions menu in the upper right corner of the page (which should still be the configuration page for the DataNode role you are modifying).

  4. Confirm that the window that pops up refers to the correct DataNode role, and click Refresh Data Directories. This will run the reconfiguration task.

  5. Unmount the disk /dfs/dn2. Mount the new disk to /dfs/dn2.
  6. Reset the “DataNode Data Directory” property to add back the disk. Click “Save Changes,” as illustrated in Step 2.
  7. Repeat Steps 3 and 4.

To learn more about hot-swapping disks on DataNodes, please see the official docs.

Conclusion

The DataNode Hot Swap Disk feature gives users a fine-grained tool for managing the health of their clusters, and Cloudera Manager offers a convenient way to monitor disk health and manage disk volumes as needed.

Lei Xu is a Software Engineer at Cloudera.
