When choosing storage for an application, it is common to pick the single option whose features best fit your use case. For mutability and real-time analytics workloads you may want to use Apache Kudu, but for massive scalability at a low cost you may want to use HDFS. Because no single option covers both needs, there is a need for a solution that allows you to leverage the best features of multiple storage options.
We have significantly improved Impala in CDH 5.15.0 to address some of the scalability bottlenecks in query execution. 64 concurrent streams of TPC-DS queries at 10TB scale on a 135-node cluster now achieve 6x the query throughput of previous releases. In addition to running faster, the query success rate improved from 73% to 100%. Overall, Impala in CDH 5.15.0 provides massive improvements in throughput and reliability while significantly reducing resource usage.
For a user-facing system like Apache Impala, bad performance and downtime can have serious negative impacts on your business. Given the complexity of the system and all the moving parts, troubleshooting can be time-consuming and overwhelming.
In this blog post series, we are going to show how the charts and metrics in Cloudera Manager (CM) can help troubleshoot Impala performance issues. They can also help you monitor the system to predict and prevent future outages.
The motivation behind Cloudera Altus SDX is to enable multiple clusters to share the same consistent view of enterprise data hosted on Amazon S3 and Microsoft ADLS. At the heart of Altus SDX is a repository of attributes describing the location and structure of data, access rights, business glossary definitions, lineage, and more.
We often hear from our customers about use cases where data is in the cloud and clusters are created on demand to ingest new datasets.
Self-service BI and exploratory analytics are some of the most common use cases we see our customers running on Cloudera’s analytic database solution. Over the past year, we made significant advancements to give SQL developers a simpler user experience and make them more productive in their everyday self-service BI tasks and workflows by leveraging Hue as the SQL development workbench.
With the recent release of Cloudera 5.15,