Tag Archives: monitoring

Architectural Patterns for Near Real-Time Data Processing with Apache Hadoop

Categories: Data Ingestion, Flume, Hadoop, HBase, Kafka, Spark

Evaluating which streaming architectural pattern best matches your use case is a precondition for a successful production deployment.

The Apache Hadoop ecosystem has become a preferred platform for enterprises seeking to process and understand large-scale data in real time. Technologies like Apache Kafka, Apache Flume, Apache Spark, Apache Storm, and Apache Samza are increasingly pushing the envelope on what is possible. It is often tempting to bucket large-scale streaming use cases together, but in reality they tend to break down into a few different architectural patterns,

Read more

Apache Phoenix Joins Cloudera Labs

Categories: Cloudera Labs, HBase

We are happy to announce the inclusion of Apache Phoenix in Cloudera Labs.

[Update: A new package for Apache Phoenix 4.7.0 on CDH 5.7 was released in June 2016.]

Apache Phoenix is an efficient SQL skin for Apache HBase that has created a lot of buzz. Many companies are successfully using this technology, including Salesforce.com, where Phoenix first started.

With the news that Apache Phoenix integration with Cloudera’s platform has joined Cloudera Labs,

Read more

Sneak Preview: HBaseCon 2015 Use Cases Track

Categories: Community, Events, HBase

This year’s HBaseCon Use Cases track includes war stories about some of the world’s best examples of running Apache HBase in production.

As a final sneak preview leading up to the show next week, this post gives you a window into the Use Cases track at HBaseCon 2015 (May 7 in San Francisco).

Thanks, Program Committee!

  • “HBase @ Flipboard”

Read more

Text Mining with Impala

Categories: Guest, Impala, Use Case

Thanks to Torsten Kilias and Alexander Löser of the Beuth University of Applied Sciences in Berlin for the following guest post about their INDREX project and its integration with Impala for integrated management of textual and relational data.

Textual data is a core source of information in the enterprise. Example demands arise from sales departments (monitor and identify leads), human resources (identify professionals with capabilities in ‘xyz’), market research (campaign monitoring from the social web),

Read more

Sneak Preview: HBaseCon 2015 Operations Track

Categories: Community, Events, HBase

This year’s HBaseCon Operations track features some of the world’s largest and most impressive operators.

In this post, I’ll give you a window into the Operations track at HBaseCon 2015 (May 7 in San Francisco).

Thanks, Program Committee!

  • “HBase Operations in a Flurry”

    Rahul Gidwani & Ian Friedman (Yahoo!)

    With multiple clusters of 1,000+ nodes replicated across multiple data centers,

Read more