Guide to Using Apache HBase Ports

If you are new to Apache HBase (version 0.90 and later), configuring the network ports the system uses can be a little overwhelming.

In this blog post, you will learn which TCP ports the different HBase processes use, and how and why they use them, all in one place. This should help administrators troubleshoot and configure firewalls, and help new developers debug.

A typical HBase cluster has one active master, one or several backup masters, and a list of region servers. The backup masters are standby masters waiting to be the next active one. Before they are active, they do not listen on any ports. (Learn more about how HBase scalability works here.)

Each server in the cluster listens on a main port for requests from clients and/or other HBase servers. Each server also has an embedded Jetty web UI server.

The following diagram shows the communication among the different components. (Blue components belong to the HBase cluster, usually behind a firewall; grey components are external clients, usually outside the HBase cluster firewall; the green component is a web browser, usually outside the firewall as well.)

  1. Client applications talk to Apache ZooKeeper to find out the location of the master and the meta region server (the root region is removed in HBase version 0.96).
  2. Client applications talk to region servers to read from/write to/scan a table.
  3. Client applications talk to the master to get information about an existing table, dynamically create/remove a table, or add/remove a column family.
  4. The master talks to region servers to open/close/move/split/flush/compact regions.
  5. The master puts data in ZooKeeper to store the active master and meta region server locations, create log splitting tasks, and track region servers’ statuses.
  6. Region servers read data in ZooKeeper to do log splitting and to track the master location and the cluster status.
  7. Region servers talk to the master to report region server start-ups and loads.
  8. Occasionally, region servers talk to the meta region to check the status of a region or to create new daughter regions during a region split.
  9. REST clients talk to REST servers to access HBase.
  10. Thrift clients talk to Thrift servers to access HBase.
  11. Users access the master web UI from browsers.
  12. Users access region servers’ web UI from browsers.
  13. Users access REST servers’ web UI from browsers.
  14. Users access Thrift servers’ web UI from browsers.
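
As a rough sketch, the flows above can be written down as data, so that you can ask which ports a given kind of caller needs to reach. The port numbers are the defaults listed later in this post; the labels ("client", "browser", and so on) and the helper ports_needed_by are names made up for this illustration, not part of HBase.

```python
# Each entry: (flow number, caller, callee, default port of the callee).
# Caller/callee labels are illustrative names, not HBase identifiers.
FLOWS = [
    (1, "client", "zookeeper", 2181),
    (2, "client", "region server", 60020),
    (3, "client", "master", 60000),
    (4, "master", "region server", 60020),
    (5, "master", "zookeeper", 2181),
    (6, "region server", "zookeeper", 2181),
    (7, "region server", "master", 60000),
    (8, "region server", "region server", 60020),  # meta region lookups
    (9, "rest client", "rest server", 8080),
    (10, "thrift client", "thrift server", 9090),
    (11, "browser", "master", 60010),
    (12, "browser", "region server", 60030),
    (13, "browser", "rest server", 8085),
    (14, "browser", "thrift server", 9095),
]


def ports_needed_by(caller):
    """Return the sorted default ports that `caller` connects to."""
    return sorted({port for _, src, _, port in FLOWS if src == caller})
```

For example, ports_needed_by("client") returns [2181, 60000, 60020], which matches the client-access firewall list given later in this post.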

Some HBase clusters also run one or more REST or Thrift servers. Both the REST server and the Thrift server are optional; they are needed only if you want to provide REST/Thrift access to your HBase cluster. To HBase, they are just other client applications. Like other HBase servers, they listen on a main port for client requests and on a web UI port.

The following table shows the ports used by client applications to talk to an HBase cluster, users to check cluster information, and different HBase components to talk to each other.

Component      Configuration parameter                Default value   Used in steps
ZooKeeper      hbase.zookeeper.property.clientPort    2181            1, 5, 6
Master         hbase.master.port                      60000           3, 7
Master         hbase.master.info.port                 60010           11
Region server  hbase.regionserver.port                60020           2, 4, 8
Region server  hbase.regionserver.info.port           60030           12
REST server    hbase.rest.port**                      8080            9
REST server    hbase.rest.info.port*                  8085            13
Thrift server  hbase.regionserver.thrift.port**       9090            10
Thrift server  hbase.thrift.info.port*                9095            14

* Introduced in HBase version 0.94.5. They can also be specified with command line option --infoport when starting up the corresponding server.
** They can also be specified with command line option -p when starting up the corresponding server.
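
If your cluster overrides some of these values, a small script can work out the effective ports. The following is a minimal sketch, assuming the usual Hadoop-style hbase-site.xml layout (a configuration element containing property elements with name and value children); the helper name effective_ports is hypothetical, and the defaults are the ones from the table above.

```python
import xml.etree.ElementTree as ET

# Default ports from the table above.
DEFAULT_PORTS = {
    "hbase.zookeeper.property.clientPort": 2181,
    "hbase.master.port": 60000,
    "hbase.master.info.port": 60010,
    "hbase.regionserver.port": 60020,
    "hbase.regionserver.info.port": 60030,
    "hbase.rest.port": 8080,
    "hbase.rest.info.port": 8085,
    "hbase.regionserver.thrift.port": 9090,
    "hbase.thrift.info.port": 9095,
}


def effective_ports(site_xml):
    """Apply port overrides found in hbase-site.xml text to the defaults."""
    ports = dict(DEFAULT_PORTS)  # copy, so the defaults stay untouched
    root = ET.fromstring(site_xml)
    for prop in root.findall("property"):
        name = prop.findtext("name", default="").strip()
        value = prop.findtext("value", default="").strip()
        if name in ports and value:
            ports[name] = int(value)
    return ports
```

Properties that are not in the table are ignored, so the same function can be pointed at a full site file. It does not account for ports passed on the command line with -p or --infoport.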

One port is not listed in the table: the HDFS NameNode port, because there is no separate parameter for it. It is configured as part of “hbase.rootdir”; for example, “hdfs://namenode.foobar.com:35802/hbase” sets the HDFS NameNode port to 35802. Unless a port is specified in the value of “hbase.rootdir”, the default is 8020.
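
To illustrate, here is a small sketch that pulls the NameNode port out of the HBase root directory URI (the hbase.rootdir value in hbase-site.xml), falling back to 8020 when none is given. The function name namenode_port is made up for this example.

```python
from urllib.parse import urlsplit

DEFAULT_NAMENODE_PORT = 8020  # used when the URI carries no explicit port


def namenode_port(rootdir):
    """Return the NameNode port implied by an hdfs:// root directory URI."""
    parts = urlsplit(rootdir)
    if parts.scheme != "hdfs":
        raise ValueError("not an HDFS URI: %r" % rootdir)
    return parts.port or DEFAULT_NAMENODE_PORT
```

With the example above, namenode_port("hdfs://namenode.foobar.com:35802/hbase") returns 35802, and the same URI without ":35802" yields 8020.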

Besides the main port, each server in the cluster (ZooKeeper excepted) also listens on a web UI port. Each web UI is a Jetty server embedded in the server it reports on. It provides human-readable information about that server, such as the thread dump and local logs. The master web UI has links to all region server web UIs, which makes it a good entry point for checking the current status of an HBase cluster.

The REST/Thrift servers are optional proxies to HBase. They talk to HBase the same way other HBase client applications do. However, they are usually deployed inside the HBase cluster, together with other HBase servers.

Client applications are usually deployed outside the HBase cluster, and so are REST/Thrift clients. If the HBase cluster is behind a firewall, the following ports (default values shown) should be opened:

To allow client application access:

  • 2181 (hbase.zookeeper.property.clientPort)
  • 60000 (hbase.master.port)
  • 60020 (hbase.regionserver.port)

To allow REST/Thrift client access:

  • 8080 (hbase.rest.port)
  • 9090 (hbase.regionserver.thrift.port)

If web UI access from outside the firewall is allowed, the corresponding web UI ports should be open too:

  • 60010 (hbase.master.info.port)
  • 60030 (hbase.regionserver.info.port)
  • 8085 (hbase.rest.info.port)
  • 9095 (hbase.thrift.info.port)
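
One way to verify firewall rules from a client machine is simply to try a TCP connection to each port. The sketch below does that with Python's standard socket module; the helper names are illustrative, and the host name in the usage note is a placeholder.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def check_ports(host, ports, timeout=2.0):
    """Map each port to whether it accepted a TCP connection."""
    return {port: port_is_open(host, port, timeout) for port in ports}
```

For example, check_ports("hbase-master.example.com", [2181, 60000, 60010]) returns a dictionary of port to True/False. A successful connection only shows that the port is reachable through the firewall, not that the HBase process behind it is healthy.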

Conclusion

In this post, you got a summary of the ports used by HBase internal components, client applications, and by users/administrators, organized by use case. 

However, because HBase runs on top of HDFS, it is also important to know the HDFS ports. To run MapReduce with HBase, you need to know the MapReduce ports too. For these Hadoop-related ports, please refer to Hadoop Default Ports Quick Reference.

Jimmy Xiang is a Software Engineer on the Platform team.

> Have questions? Post them to the Community Forum for HBase.
