Tag Archives: realtime

HBaseCon 2013: "Case Studies" Track Preview

Categories: Community Events HBase

HBaseCon 2013 is this Thursday (June 13 in San Francisco), and we can hardly wait!

To complete the “preview” cycle, today we bring you a snapshot of the Case Studies track, which offers a cross-section of the many real-world use cases for Apache HBase. You will learn how companies across diverse industries use it at the heart of their IT infrastructure to run their businesses.


Cloudera Impala: Real-Time Queries in Apache Hadoop, For Real

Categories: CDH HBase Hive Impala

After a long period of intense engineering effort and user feedback, we are very pleased, and proud, to announce the Cloudera Impala project. This technology is a revolutionary one for Hadoop users, and we do not take that claim lightly.

When Google published its Dremel paper in 2010, we were as inspired as the rest of the community by the technical vision to bring real-time, ad hoc query capability to Apache Hadoop,


Apache HBase I/O – HFile

Categories: HBase

Introduction

Apache HBase is Hadoop's open-source, distributed, versioned storage manager, well suited for random, realtime read/write access.

Wait, wait: random, realtime read/write access?
How is that possible? Isn't Hadoop just a sequential read/write, batch-processing system?

Yes, we’re talking about the same thing, and in the next few paragraphs, I’m going to explain how HBase achieves random I/O, how it stores data, and how HBase’s HFile format has evolved.


Hadoop Graphing with Cacti

Categories: Data Ingestion Guest Hadoop

An important part of making sure Apache Hadoop works well for all users is developing and maintaining strong relationships with the folks who run Hadoop day in and day out. Edward Capriolo keeps About.com’s Hadoop cluster happy, and we frequently chew the fat with Ed on issues ranging from administrative best practices to monitoring. Ed’s been an invaluable resource as we beta test our distribution and chase down bugs before our official releases. Today’s article looks at some of Ed’s tricks for monitoring Hadoop with Cacti through JMX.
