Author Archives: Brock Noland

Apache Hive 1.0.0 Has Been Released

Categories: Community Hive

The Apache Hive PMC has recently voted to release Hive 1.0.0 (formerly known as Hive 0.14.1).

This release is recognition of the work the Apache Hive community has done over the past nine years and is continuing to do. The Apache Hive 1.0.0 release is a codebase that was expected to be released as 0.14.1 but the community felt it was time to move to a 1.x.y release naming structure.

Read more

Hands-on Hive-on-Spark in the AWS Cloud

Categories: Cloud Community Hive Spark

Interested in Hive-on-Spark progress? This new AMI gives you a hands-on experience.

Nearly one year ago, the Apache Hadoop community began to embrace Apache Spark as a powerful batch-processing engine. Today, many organizations and projects are augmenting their Hadoop capabilities with Spark. As part of this shift, the Apache Hive community is working to add Spark as an execution engine for Hive. The Hive-on-Spark work is being tracked by HIVE-7292 which is one of the most popular JIRAs in the Hadoop ecosystem.

Read more

Apache Hive on Apache Spark: The First Demo

Categories: Community Hive MapReduce Spark

The community effort to make Apache Spark an execution engine for Apache Hive is making solid progress.

Apache Spark is quickly becoming the programmatic successor to MapReduce for data processing on Apache Hadoop. Over the course of its short history, it has become one of the most popular projects in the Hadoop ecosystem, and is now supported by multiple industry vendors—ensuring its status as an emerging standard.

Two months ago Cloudera,

Read more

How-to: Implement Role-based Security in Impala using Apache Sentry

Categories: General Hive How-to Impala Platform Security & Cybersecurity

This quick demo illustrates how easy it is to implement role-based access and control in Impala using Sentry.

Apache Sentry (incubating) is the Apache Hadoop ecosystem tool for role-based access control (RBAC). In this how-to, I will demonstrate how to implement Sentry for RBAC in Impala. I feel this introduction is best motivated by a use case.

Data warehouse optimization is one of the most common Hadoop use cases.

Read more

With Sentry, Cloudera Fills Hadoop’s Enterprise Security Gap

Categories: Hadoop Hive Impala Platform Security & Cybersecurity

Every day, more data, users, and applications are accessing ever-larger Apache Hadoop clusters. Although this is good news for data driven organizations overall, for security administrators and compliance officers, there are still lingering questions about how to enable end-users under existing Hadoop infrastructure without compromising security or compliance requirements.

While Hadoop has strong security at the filesystem level, it lacks the granular support needed to adequately secure access to data by users and BI applications.

Read more