Category Archives: Impala

Transparent Hierarchical Storage Management with Apache Kudu and Impala

Categories: CDH, Impala, Kudu, Parquet

When choosing storage for an application, it is common to pick the single option whose features best fit your use case. For mutability and real-time analytics workloads you may want to use Apache Kudu, but for massive scalability at a low cost you may want to use HDFS. There is therefore a need for a solution that lets you leverage the best features of multiple storage options.

Read more

Scalability Improvement of Apache Impala 2.12.0 in CDH 5.15.0

Categories: CDH, Impala

Key Takeaways

We have significantly improved Impala in CDH 5.15.0 to address some of the scalability bottlenecks in query execution. With 64 concurrent streams of TPC-DS queries at 10 TB scale on a 135-node cluster, query throughput is now 6x that of previous releases. Queries not only run faster: the query success rate also improved from 73% to 100%. Overall, Impala in CDH 5.15.0 provides massive improvements in throughput and reliability while significantly reducing resource usage.

Read more

Assessment of Apache Impala Performance using Cloudera Manager Metrics – Part 1 of 3

Categories: CDH, Cloudera Manager, Impala, Performance

For a user-facing system like Apache Impala, bad performance and downtime can have serious negative impacts on your business. Given the complexity of the system and all the moving parts, troubleshooting can be time-consuming and overwhelming.

In this blog post series, we show how the charts and metrics in Cloudera Manager (CM) can help troubleshoot Impala performance issues. They can also help you monitor the system so that you can predict and prevent future outages.

Read more

New in Cloudera 5.15: Simplifying the end user Data Catalog for the Self Service Analytic Database

Categories: Analytic Database, CDH, Cloud, Cloudera Navigator, Hue, Impala

Self-service BI and exploratory analytics are among the most common use cases we see our customers running on Cloudera’s analytic database solution. Over the past year, we have made significant advancements toward a simpler user experience for SQL developers, making them more productive in their everyday self-service BI tasks and workflows by leveraging Hue as the SQL development workbench.

With the recent release of Cloudera 5.15,

Read more