Category Archives: HBase

Offheap Read-Path in Production – The Alibaba story

Categories: Hadoop HBase Performance Use Case

This article is syndicated with permission from the Apache HBase blog and highlights a collaboration between our partners at Intel and Alibaba engineering in time for “Singles Day”, the biggest shopping day on the net. For more on HBase, mark your calendars! On June 12th, 2017, the Apache HBase community will host its annual HBaseCon.

Introduction

HBase is the core storage system in Alibaba’s Search Infrastructure.

Read More

Performance comparison of different file formats and storage engines in the Apache Hadoop ecosystem

Categories: Avro Guest Hadoop HBase Kudu Parquet

Zbigniew Baranowski is a database systems specialist and a member of a group that provides and supports central database and Hadoop-based services at CERN. This blog was originally released on CERN’s “Databases at CERN” blog, and is syndicated here with CERN’s permission.

TOPIC

This post presents a performance comparison of a few popular data formats and storage engines available in the Apache Hadoop ecosystem: Apache Avro,

Read More

New Study: Evaluating Apache HBase Performance on Modern Storage Media

Categories: Guest Hardware HBase Performance

For the first time, this new study by Intel software engineers analyzes the performance impact of using Apache HBase on various modern storage technologies.

As more “fast” storage technologies (such as SSD and NVMe SSD) emerge, organizations with big data use cases want to make better use of them to achieve better throughput and latency. But to this point, there have been no detailed analyses published about the true significance of that performance boost,

Read More

Apache HBase is Everywhere

Categories: Community Events HBase

For Cloudera, Apache HBase has grown into a stable, scalable, mature, and critical component of the Apache Hadoop stack.

HBase adds the ability to do low-latency random reads and writes across your big data. While it is a key piece of the Apache Hadoop ecosystem, HBase itself has an ecosystem of projects and products that use it as a storage engine for systems such as time-series databases (OpenTSDB) or SQL-style databases (Apache Phoenix,

Read More

Inside Santander’s Near Real-Time Data Ingest Architecture (Part 2)

Categories: HBase Kafka Use Case

Thanks to Pedro Boado and Abel Fernandez Alfonso from Santander’s engineering team for their collaboration on this post about how Santander UK is using Apache HBase as a near real-time serving engine to power its innovative Spendlytics app.

The Spendlytics iOS app is designed to help Santander’s personal debit and credit-card customers keep on top of their spending, including payments made via Apple Pay. It uses real-time transaction data to enable customers to analyze their card spend across time periods (weekly,

Read More