For Cloudera, Apache HBase has grown into a stable, scalable, mature, and critical component of the Apache Hadoop stack.
HBase adds low-latency random reads and writes across your big data. While it is a key piece of the Apache Hadoop ecosystem, HBase has its own ecosystem of projects and products that use it as a storage engine, including time-series databases (OpenTSDB) and SQL-style databases (Apache Phoenix, Apache Trafodion [incubating], Splice Machine). Partner companies such as Cask Data have HBase at the core of their offerings. Look for it as an alternative, more performant metadata store in future releases of Apache Hive. And Hadoop YARN is about to merge a scalable application timeline service based on HBase.
What also makes HBase special is its diverse developer and user community, which comprises engineers from many companies around the world. Many of the contributors work for companies that host HBase’s largest production clusters, such as Facebook, Yahoo!, Salesforce, Flipkart, and Xiaomi. HBase is installed on tens of thousands of nodes used for back-end services that serve billions of people a day. As you can see from the agenda for HBaseCon 2016 (May 24 in San Francisco), this list expands to include giants such as Airbnb, Alibaba, Visa, and Apple.
Another great thing about HBase and its community is that several commercial vendors provide HBase support for enterprise users. From my vantage point at Cloudera, HBase’s popularity continues to grow; today, more than half of Cloudera customers’ Hadoop clusters have HBase running on them. There are many use cases in the financial services, ad tech, healthcare, and telco sectors. While HBase is often compared against other NoSQL databases, in the field and in sales, direct competition between these systems is rare because each occupies a different niche.
For me, it has been a fun journey helping to take a young HBase (starting from 0.90.1) to the rock that today’s HBase 1.x versions have become. HBase has earned a reputation for being rock-solid when properly stood up. In Cloudera’s regular support reviews, we can see that the number of tickets per deployed HBase cluster has continually fallen. In fact, in the past year we’ve greatly increased the number of HBase deployments with no proportional increase in support incidents.
I’m looking forward to HBaseCon tomorrow, and to many more years helping HBase grow and prosper!
Jonathan Hsieh is the Tech Lead and Manager of the Apache HBase Team at Cloudera. He is an Apache HBase committer and PMC member, as well as a founder of the Apache Flume project.