Tag Archives: release

Apache Hadoop 3.0.0-alpha2 Released

Categories: Community Hadoop

The Apache Hadoop project announced the release of 3.0.0-alpha2 on January 25th, 2017. This is the second alpha release in the 3.0.0 release series leading up to 3.0.0 GA, and incorporates 857 new fixes, improvements, and features since 3.0.0-alpha1 last September. It’s worth reading our previous blog post about 3.0.0-alpha1; in this post, we’ll discuss the new improvements that landed in alpha2.

Classpath Isolation for Hadoop Client Jars

The pain of classpath isolation has been experienced by many Java developers.

Read More

Up and running with Apache Spark on Apache Kudu

Categories: CDH Data Ingestion Data Science General Hadoop How-to Impala Kudu Spark Training Use Case

After the GA of Apache Kudu in Cloudera CDH 5.10, we take a look at the Apache Spark on Kudu integration, share code snippets, and explain how to get up and running quickly, as Kudu is already a first-class citizen in Spark’s ecosystem.

 

As the Apache Kudu development team celebrates the initial 1.0 release launched on September 19, and the most recent 1.2.0 version now GA as part of Cloudera’s CDH 5.10 release,

Read More

New in Cloudera Enterprise 5.5: Support for Complex Types in Impala

Categories: Impala Parquet

The new support for complex types in Impala makes running analytic workloads considerably simpler.

Impala 2.3 (shipping starting in Cloudera Enterprise 5.5) contains support for querying complex types in Apache Parquet tables, specifically ARRAY, MAP, and STRUCTs. This capability enables users to query against naturally nested data sets without having to perform ETL to flatten them. This feature provides a few major benefits, including:

  • It removes additional ETL and data modeling work to flatten data sets.

Read More

Impala’s Next Step: Proposal to Join the Apache Software Foundation

Categories: Impala Kudu

The Impala project has already passed several important milestones on the way to its status as the leader and open standard for BI and SQL analytics on modern big data architecture. Today’s milestone is the submission of proposals for Impala and Kudu to join the Apache Software Foundation (ASF) Incubator.

[Update: Read the text of the Impala and Kudu proposals here and here, respectively.]

Since its initial release nearly five years ago,

Read More