Author Archives: Jimmy Xiang

Guide to Using Apache HBase Ports

Categories: Hadoop HBase

For people new to Apache HBase (version 0.90 and later), configuring the network ports the system uses can be a little overwhelming.

In this blog post, you will learn, all in one place, which TCP ports the different HBase processes use, and how and why they use them. This should help administrators troubleshoot and configure firewalls, and help new developers learn how to debug.
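When troubleshooting firewall settings, a quick reachability check can be a useful first step. The following is a minimal sketch in Python, not part of HBase itself. The port numbers in `ASSUMED_PORTS` are the commonly used defaults for HBase releases of the 0.90 era plus the usual ZooKeeper client port; they are assumptions here, since any site can override them in its configuration. The function names are hypothetical.

```python
import socket

# Assumed default ports (0.90-era HBase plus ZooKeeper). A cluster may
# override any of these, so treat this table as illustrative only.
ASSUMED_PORTS = {
    "master RPC": 60000,
    "master web UI": 60010,
    "region server RPC": 60020,
    "region server web UI": 60030,
    "ZooKeeper client": 2181,
}


def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_ports(host, ports=ASSUMED_PORTS, timeout=2.0):
    """Map each named port to True (reachable) or False (not reachable)."""
    return {name: is_port_open(host, port, timeout) for name, port in ports.items()}
```

Calling `check_ports("some-host")` from the machine that cannot connect shows which of the assumed ports are blocked or not listening. A `False` result does not distinguish a firewall rule from a process that is simply not running.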

A typical HBase cluster has one active master,

Read more

Apache HBase AssignmentManager Improvements

Categories: Community HBase ZooKeeper

AssignmentManager is a module in the Apache HBase Master that manages the assignment of regions to RegionServers. (See HBase architecture for more information.) It ensures that all regions are assigned and that each region is assigned to just one RegionServer.
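The two guarantees described above (every region is assigned, and no region is assigned to more than one RegionServer) can be expressed as a simple check. This is a minimal sketch in Python over a hypothetical in-memory model of `(region, server)` pairs. It is not how the AssignmentManager is implemented, and the function name is made up for illustration.

```python
from collections import defaultdict


def find_assignment_problems(all_regions, assignments):
    """Check the two assignment invariants on a toy model.

    all_regions: iterable of region names that should be assigned.
    assignments: iterable of (region, server) pairs (hypothetical model).
    Returns (unassigned, multiply_assigned), where unassigned is a sorted
    list of regions with no server and multiply_assigned maps each region
    held by more than one server to the sorted list of those servers.
    """
    servers_by_region = defaultdict(set)
    for region, server in assignments:
        servers_by_region[region].add(server)

    # .get() avoids creating empty entries in the defaultdict.
    unassigned = sorted(r for r in all_regions if not servers_by_region.get(r))
    multiply_assigned = {
        region: sorted(servers)
        for region, servers in servers_by_region.items()
        if len(servers) > 1
    }
    return unassigned, multiply_assigned
```

If both returned collections are empty, the toy model satisfies both invariants.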

Although the AssignmentManager generally does a good job, the existing implementation does not handle assignments as well as it could. For example, if a region was assigned to two or more RegionServers,

Read more

Apache HBase Log Splitting

Categories: General HBase

In the recent blog post about the Apache HBase Write Path, we talked about the write-ahead log (WAL), which plays an important role in preventing data loss should an HBase region server fail. This blog post describes how HBase prevents data loss after a region server crashes, using a critical process for recovering lost updates called log splitting.

Log splitting

As we mentioned in the write path blog post,

Read more

Apache HBase Write Path

Categories: CDH General HBase

Apache HBase is the Hadoop database, and is based on the Hadoop Distributed File System (HDFS). HBase makes it possible to randomly access and update data stored in HDFS, but files in HDFS can only be appended to and are immutable after they are created.  So you may ask, how does HBase provide low-latency reads and writes? In this blog post, we explain this by describing the write path of HBase —

Read more

Apache HBase 0.90.6 is now available

Categories: HBase

Apache HBase 0.90.6 is now available. It is a bug-fix release that addresses 31 bugs and includes 5 improvements. Among them, 3 are blockers and 3 are critical, such as:

  • HBASE-5008: HBase cannot serve a region when it can’t flush the region but considers it stuck in flushing,
  • HBASE-4773: HBaseAdmin may leak ZooKeeper connections,
  • HBASE-5060: HBase client may be blocked forever when there is a temporary network failure.

Read more