Tag Archives: Apache HBase

Introducing Apache HBase Medium Object Storage (MOB) compaction partition policies

Categories: HBase

Introduction

The Apache HBase Medium Object Storage (MOB) feature was introduced by HBASE-11339. This feature improves low latency read and write access for moderately-sized values (ideally from 100K to 10MB based on our testing results), making it well-suited for storing documents, images, and other moderately-sized objects [1]. The Apache HBase MOB feature achieves this improvement by separating IO paths for file references and MOB objects, applying different compaction policies to MOBs and thus reducing write amplification created by HBase’s compactions.

Read More

Cloudera Enterprise 5.11 is Now Available

Categories: CDH Cloud Cloudera Manager Cloudera Navigator Hadoop

Cloudera Enterprise 5.11 is Now Available

Cloudera is pleased to announce that Cloudera Enterprise 5.11 is now generally available (GA). The highlights of this release include lineage support for Apache Spark, Apache Kudu security integration, embedded data discovery for self-service BI, and new cloud capabilities for Microsoft ADLS and Amazon S3.

As usual, there are also a number of quality enhancements, bug fixes, and other improvements across the stack. Here is a partial list of what’s included (see the Release Notes for a full list):

  • Core Platform and Cloud
    • Amazon S3 Consistency: S3Guard ensures that operations on Amazon S3 are immediately visible to other clients,

Read More

Cloudera Enterprise 5.5 is Now Generally Available

Categories: CDH Cloudera Manager

Cloudera Enterprise 5.5 (comprising CDH 5.5, Cloudera Manager 5.5, and Cloudera Navigator 2.4) has been released.

Cloudera is excited to bring you news of Cloudera Enterprise 5.5. Our persistent emphasis on quality is especially pronounced in this release, with more than 500 issues identified and triaged during its development.

A highlight of this release is the inclusion of Cloudera Navigator Optimizer (available in limited beta for select Cloudera Enterprise customers;

Read More

How-to: Build a Complex Event Processing App on Apache Spark and Drools

Categories: HBase How-to Kafka Spark Use Case

Combining CDH with a business execution engine can serve as a solid foundation for complex event processing on big data.

Event processing involves tracking and analyzing streams of data from events to support better insight and decision making. With the recent explosion in data volume and diversity of data sources, this goal can be quite challenging for architects to achieve.

Complex event processing (CEP) is a type of event processing that combines data from multiple sources to identify patterns and complex relationships across various events.

Read More