Security, Hive-on-Spark, and Other Improvements in Apache Hive 1.2.0


Apache Hive 1.2.0, although not a major release, contains significant improvements.

Recently, the Apache Hive community moved to a more frequent, incremental release schedule. A little while ago, we covered the Apache Hive 1.0.0 release and explained that it was renamed from 0.14.1, with only minor feature additions since 0.14.0.

Shortly thereafter, Apache Hive 1.1.0 was released (renamed from Apache Hive 0.15.0), which added more significant features, including Hive-on-Spark.

Last week, the community released Apache Hive 1.2.0. Although narrower in scope than Hive 1.1.0, it nevertheless contains improvements in the following areas:

New Functionality

  • Support for Apache Spark 1.3 (HIVE-9726), enabling dynamic executor allocation and impersonation
  • Support for integration of Hive-on-Spark with Apache HBase (HIVE-10073)
  • Support for numeric partition columns with literals (HIVE-10313, HIVE-10307)
  • Support for Union Distinct (HIVE-9039)
  • Support for specifying column list in insert statement (HIVE-9481)
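As a rough illustration of the last two items, the HiveQL sketch below shows what UNION DISTINCT and an INSERT with a column list might look like in practice. The table and column names (web_orders, store_orders, all_customers) are hypothetical and chosen only for this example; behavior should be checked against the Hive 1.2.0 documentation for your environment.

```sql
-- Hypothetical tables for illustration only; names and columns are assumptions.
CREATE TABLE web_orders (order_id INT, customer STRING);
CREATE TABLE store_orders (order_id INT, customer STRING);
CREATE TABLE all_customers (customer STRING, source STRING);

-- UNION DISTINCT (HIVE-9039): rows that appear in both inputs,
-- or more than once in either, are returned only once.
SELECT customer FROM web_orders
UNION DISTINCT
SELECT customer FROM store_orders;

-- Column list in INSERT (HIVE-9481): only the listed column is supplied.
-- The unlisted column (source) is expected to be filled with NULL.
INSERT INTO TABLE all_customers (customer)
SELECT customer FROM web_orders;
```

Before 1.2.0, the same deduplication typically required UNION ALL wrapped in a SELECT DISTINCT, and an INSERT had to supply a value for every column of the target table in order.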

Performance and Optimizations

Security

Usability and Stability

For a larger but still incomplete list of features, improvements, and bug fixes, see the release notes. (Most of the Hive-on-Spark JIRAs are missing from the list.)

The most important improvements and fixes above (those involving security, for example) are already available in CDH 5.4.x releases. As another example, CDH users have been testing the Hive-on-Spark public beta since its first release, as well as improvements made to that beta in CDH 5.4.0.

We’re looking forward to working with the rest of the Apache Hive community to drive the project continually forward in the areas of SQL functionality, performance, security, and stability!

Xuefu Zhang is a Software Engineer at Cloudera and a PMC member of Apache Hive.

 


6 responses on “Security, Hive-on-Spark, and Other Improvements in Apache Hive 1.2.0”

  1. Alex

    Is there any ETA for Hive 1.2 in CDH5? We need Parquet 1.6 support and afaik 1.2 is the first Hive release that uses Parquet 1.6.

    Thanks!

    1. Justin Kestelyn (@kestelyn) Post author

      Hive 1.1 just recently started shipping in CDH 5.4 and Hive 1.2 is only three months old, so please give us some time for proper integration testing etc. You probably won’t see it until CDH 6.x.

    1. Justin Kestelyn (@kestelyn) Post author

      You will see CDH 5.5 by end of 2015. So, no CDH 6.0 release until 2016.

  2. Michael

    What is the status on resolving this issue? Do we still have to wait until release 6.0? I am running into this error when creating tables. What is the best workaround for this issue?

    create table sfdc_opportunities_sandbox_parquet like sfdc_opportunities_sandbox STORED AS PARQUET

    Error
    Parquet does not support date. See HIVE-6384

    1. Justin Kestelyn Post author

      Mike,

      For easier triage, please post your issue in the “Hive” area at community.cloudera.com.