Apache HBase I/O – HFile

Categories: HBase

Introduction

Apache HBase is the Hadoop ecosystem's open-source, distributed, versioned storage manager, well suited for random, realtime read/write access.

Wait, wait. Random, realtime read/write access?
How is that possible? Isn't Hadoop just a sequential read/write, batch processing system?

Yes, we’re talking about the same thing. In the next few paragraphs, I’m going to explain how HBase achieves random I/O, how it stores data, and how HBase’s HFile format has evolved.
