Tag Archives: memstore

Inside Apache HBase’s New Support for MOBs

Categories: HBase

Learn about the design decisions behind HBase’s new support for MOBs.

Apache HBase is a distributed, scalable, performant, consistent key value database that can store a variety of binary data types. It excels at storing many relatively small values (<10K), and providing low-latency reads and writes.

However, there is a growing demand for storing documents, images, and other moderate objects (MOBs)  in HBase while maintaining low latency for reads and writes.

Read more

New in CDH 5.4: Apache HBase Request Throttling

Categories: CDH HBase

The following post about the new request throttling feature in HBase 1.1 (now shipping in CDH 5.4) originally published in the ASF blog. We re-publish it here for your convenience.

Running multiple workloads on HBase has always been challenging, especially  when trying to execute real-time workloads while concurrently running analytical jobs. One possible way to address this issue is to throttle analytical MR jobs so that real-time workloads are less affected.

Read more

Tuning Java Garbage Collection for HBase

Categories: Guest HBase Performance

This guest post from Intel Java performance architect Eric Kaczmarek (originally published here) explores how to tune Java garbage collection (GC) for Apache HBase focusing on 100% YCSB reads.

Apache HBase is an Apache open source project offering NoSQL data storage. Often used together with HDFS, HBase is widely used across the world. Well-known users include Facebook, Twitter, Yahoo, and more. From the developer’s perspective, HBase is a “distributed,

Read more

Approaches to Backup and Disaster Recovery in HBase

Categories: Hadoop HBase Ops and DevOps

Get an overview of the available mechanisms for backing up data stored in Apache HBase, and how to restore that data in the event of various data recovery/failover scenarios

With increased adoption and integration of HBase into critical business systems, many enterprises need to protect this important business asset by building out robust backup and disaster recovery (BDR) strategies for their HBase clusters. As daunting as it may sound to quickly and easily backup and restore potentially petabytes of data,

Read more

How-to: Use HBase Bulk Loading, and Why

Categories: HBase How-to

Apache HBase is all about giving you random, real-time, read/write access to your Big Data, but how do you efficiently get that data into HBase in the first place? Intuitively, a new user will try to do that via the client APIs or by using a MapReduce job with TableOutputFormat, but those approaches are problematic, as you will learn below. Instead, the HBase bulk loading feature is much easier to use and can insert the same amount of data more quickly.

Read more