Category Archives: Impala

External Hands-on Experiences with Cloudera Impala

Categories: Impala

The beta release of Cloudera Impala, the first (and open source) real-time query engine for Apache Hadoop, has been out in the wild (in both binary and VM form) for over a month now, and users have had time to get up close and hands-on. Consequently, we’re beginning to see some fascinating self-published observations and guides.

Here are just a few examples; you may know of more that we’ve missed:

Read more

Cloudera Impala: Real-Time Queries in Apache Hadoop, For Real

Categories: CDH, HBase, Hive, Impala

After a long period of intense engineering effort and user feedback, we are very pleased, and proud, to announce the Cloudera Impala project. This technology is a revolutionary one for Hadoop users, and we do not take that claim lightly.

When Google published its Dremel paper in 2010, we were as inspired as the rest of the community by the technical vision to bring real-time, ad hoc query capability to Apache Hadoop,

Read more

Cloudera, The Platform for Big Data

Categories: CDH, Hadoop, Impala

Today we’re proud to announce a new addition to the Apache Hadoop ecosystem: Cloudera Impala, a parallel SQL engine that runs natively on Hadoop storage. The salient points are:

  • Hive-compatible
  • 10x the performance of Hive/MapReduce, on average
  • 100% open source, under the Apache License v2 – just like Hadoop
  • Tested to run on CDH4.1 or higher

The blog post that follows mine provides more details about Impala and how it works.

Read more