Dynamic Progress Reports in the Impala Shell

Live updates about your query’s progress in the Impala Shell? That’s a win!

The Impala Shell is a great tool for quickly running exploratory queries or testing new features in Impala. While Impala is pretty fast, some queries can still take several seconds or longer to complete. It's therefore useful to see how much progress a query has made and to get an idea of how long it will take. You can get a lot of this information through Impala's debug webpages (http://<impalad-host>:25000), but not every user has access to these pages, and besides, it's more useful to have this feedback directly in the tool that you're using to issue queries.

A better way to get an overview of query progress in the Impala Shell will ship in Impala 2.3; it was implemented as part of IMPALA-80. (See the patch here.) This gives you live updates on query progress, either as a simple progress bar or as a detailed breakdown of the progress of each operator in the query plan.

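There are two new command line flags for the Impala Shell, and two corresponding new options for the shell's SET command. The two new command line flags are --live_progress, which prints the progress bar, and --live_summary, which prints the per-operator query summary.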

When you want to change these settings from within the Impala Shell, you can use the new shell options. Shell options are similar to query options, but they are only evaluated in the context of the shell.

Both options can be controlled by setting them to True or False. For example: SET LIVE_PROGRESS=TRUE;

The live progress percentage bar is based on the number of completed vs. issued scan ranges. So if your query is dominated by non-scan operations, such as joins or aggregations, the bar can show 100% while the query continues to run. In this case, the live query summary gives a better indication of the query's progress.
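
To make the scan-range calculation concrete, here is a minimal Python sketch of how such a percentage and bar could be derived. It is not the shell's actual code: the function names, the bar width, and the choice to report 0% when no scan ranges have been issued are all assumptions made for illustration.

```python
def progress_percent(completed_ranges, total_ranges):
    """Return scan-range progress as an integer percentage from 0 to 100."""
    if total_ranges <= 0:
        # Assumption of this sketch: with no scan ranges issued there is
        # nothing to measure, so report 0 rather than dividing by zero.
        return 0
    # Integer division keeps the result a whole percentage; cap it at 100.
    return min(100, completed_ranges * 100 // total_ranges)


def render_bar(completed_ranges, total_ranges, width=20):
    """Render a text progress bar such as '[#####     ] 50%'."""
    pct = progress_percent(completed_ranges, total_ranges)
    filled = width * pct // 100
    return "[" + "#" * filled + " " * (width - filled) + "] " + str(pct) + "%"


if __name__ == "__main__":
    # A quarter of the scan ranges are done.
    print(render_bar(3, 12))
    # All scan ranges are done: the bar is full even if a join or
    # aggregation downstream of the scans is still running.
    print(render_bar(12, 12))
```

The second call shows the limitation described above: once every scan range has completed, a ratio like this has no way to reflect work still happening in operators that do not scan data.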