Editor’s note (added Feb. 2, 2014): You can review Cloudera’s latest (and exciting) Impala performance benchmark results here.
In the presentation below, Scott Leberknight of Near Infinity has done such a good and thorough job of dissecting Cloudera Impala that we want to share it with you here.
Notably, Scott has run unscientific but revealing benchmarks of the current version of Impala (1.0.1) inside the QuickStart VM, compared to Apache Hive 0.11. (Spoiler: Impala was up to 39x faster for interactive queries.) See here for a set of more scientific benchmarks, based on concurrent interactive queries, that Cloudera ran recently (Impala was up to 68x faster in that case).
Conclusion: Hive continues to improve as a batch processing/MapReduce framework with Cloudera’s help. But for interactive SQL on Hadoop, Impala is the solution. See for yourself below!