Category Archives: HBase

How-to: Improve Apache HBase Performance via Data Serialization with Apache Avro

Categories: Avro HBase Performance

Taking a thoughtful approach to data serialization can achieve significant performance improvements for HBase deployments.

Whether to use tall or wide tables in Apache HBase is a commonly discussed design question (see reference here and here). However, there is more to consider than that simple choice. Because HBase stores each column value (cell) as an independent entry in the underlying HFiles, and each entry repeats the row key, column family, column qualifier, and timestamp, significant storage overhead can occur when storing small pieces of information.
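To make the overhead concrete, here is a rough back-of-the-envelope sketch in Python. It assumes the classic KeyValue layout (length prefixes, row key, family, qualifier, timestamp, and type stored with every cell) and ignores compression, data block encoding, tags, and HFile block and index overhead. The row key, column names, and the 50-byte serialized record are hypothetical values chosen only for illustration; they are not measurements from the article.

```python
# Rough per-cell size estimate for an HBase KeyValue.
# Assumption: classic layout, no tags, no compression or block encoding.
FIXED_OVERHEAD = (
    4    # key length (int)
    + 4  # value length (int)
    + 2  # row key length (short)
    + 1  # column family length (byte)
    + 8  # timestamp (long)
    + 1  # key type (byte)
)


def cell_size(row_key: bytes, family: bytes, qualifier: bytes, value: bytes) -> int:
    """Estimated bytes for one cell: fixed fields plus the variable parts."""
    return FIXED_OVERHEAD + len(row_key) + len(family) + len(qualifier) + len(value)


# Hypothetical example data.
row_key = b"user00012345"
family = b"d"

# Layout A: ten small columns, each holding a 4-byte value.
wide_total = sum(
    cell_size(row_key, family, f"field{i}".encode(), b"\x00" * 4)
    for i in range(10)
)

# Layout B: one column holding a single serialized record (assumed 50 bytes).
packed_total = cell_size(row_key, family, b"rec", b"\x00" * 50)

print("ten small cells:", wide_total, "bytes for 40 bytes of payload")
print("one packed cell:", packed_total, "bytes for 50 bytes of payload")
```

Under these assumptions, most of the bytes in the ten-column layout are repeated key material rather than payload, which is the overhead that packing related fields into one serialized value is meant to reduce.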

Read More

HBaseCon 2016 Speaker Lineup Announced

Categories: Community Events HBase

The speaker lineup for the fifth annual edition of HBaseCon reflects an amazing diversity of production deployments.

The organizers of HBaseCon, the conference for the Apache HBase community, have published the agenda for the conference (May 24, 2016, in San Francisco), and once again the impressive geographical and use-case diversity of HBase is on full display.

Keynotes include:

  • “State of Apache HBase” – Apache HBase PMC
  • “Facebook’s Return to (Real) Open Source” – 

Read More

HBaseCon 2016 in Full Effect: Call for Papers and Early Registration

Categories: Community Events HBase

HBaseCon 2016 will occur on May 24, 2016, at The Village in San Francisco.

HBaseCon is back, and CfP and Early Bird registration are both open for business.

Now in its fifth year, HBaseCon is the premier community event for Apache HBase contributors, developers, admins, and users of all skill levels. The event is hosted and organized by Cloudera, with a Program Committee reflecting a cross-section of the HBase community (including employees of Bloomberg LP,

Read More

How-to: Create and Use a Custom Formatter in the Apache HBase Shell

Categories: Avro HBase How-to Tools

Learn how to improve Apache HBase usability by creating a custom formatter for viewing binary data types in the HBase shell.

Cloudera customers are looking to store complex data types in Apache HBase to provide fast retrieval of complex information such as banking transactions, web analytics records, and related metadata associated with those records. Serialization formats such as Apache Avro, Thrift, and Protocol Buffers greatly assist in meeting this goal,

Read More