Over the past year (and through several releases), Apache Impala (incubating) has added numerous new features and performance enhancements that better enable high-performance SQL analytics over big data. It is therefore time for an update to the Impala cookbook, which contains best practices for these new features, updated guidelines, and more detailed examples.
Note: This cookbook does not yet capture best practices for the major new advancements available with the recent GA of Kudu.
In Parts 1 and 2, we covered the basics of YARN resource allocation. In this installment, we’ll provide an overview of cluster scheduling and introduce the Fair Scheduler, one of the scheduler choices available in YARN.
A standalone computer can have several CPU cores, each running one process at a time, yet as many as a few hundred processes may be active simultaneously. The scheduler is the part of the desktop’s operating system that assigns a process to a CPU core to run for a short period of time.
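To make the idea of time-sliced scheduling concrete, here is a minimal sketch of a round-robin scheduler in Python. It is a toy illustration only, not how any real operating system or YARN scheduler is implemented; the function name `round_robin`, its parameters, and the process names are all hypothetical and chosen for this example.

```python
from collections import deque


def round_robin(processes, num_cores, time_slice):
    """Toy simulation of time-sliced scheduling (illustrative assumption only).

    processes:  dict mapping a process name to the work units it still needs
    num_cores:  how many processes can run during the same time slice
    time_slice: maximum work units a process may run before being preempted

    Returns a list of (tick, core, name, units_run) tuples.
    """
    ready = deque(processes.items())  # queue of (name, remaining work)
    timeline = []
    tick = 0
    while ready:
        # Hand the next waiting processes to the available cores.
        batch_size = min(num_cores, len(ready))
        running = [ready.popleft() for _ in range(batch_size)]
        for core, (name, remaining) in enumerate(running):
            units_run = min(time_slice, remaining)
            timeline.append((tick, core, name, units_run))
            # Unfinished processes go to the back of the queue.
            if remaining > units_run:
                ready.append((name, remaining - units_run))
        tick += 1
    return timeline


if __name__ == "__main__":
    # Three hypothetical processes competing for two cores.
    for entry in round_robin({"a": 3, "b": 1, "c": 2}, num_cores=2, time_slice=1):
        print(entry)
```

Each process runs for at most one time slice before it returns to the back of the queue, which is how a handful of cores can appear to run many more processes at once.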
[Update: A new package for Apache Phoenix 4.7.0 on CDH 5.7 was released in June 2016.]
New Cloudera Labs packages for Apache Phoenix 4.5.2 (which includes Apache Spark integration) are now available for CDH 5.4.x and CDH 5.5.x.
Earlier this year, Cloudera announced the inclusion of Apache Phoenix in Cloudera Labs.
To recap: Phoenix adds SQL to Apache HBase,
Cloudera Enterprise 5.5 (comprising CDH 5.5, Cloudera Manager 5.5, and Cloudera Navigator 2.4) has been released.
Cloudera is excited to bring you news of Cloudera Enterprise 5.5. Our persistent emphasis on quality is especially pronounced in this release, with more than 500 issues identified and triaged during its development.
A highlight of this release is the inclusion of Cloudera Navigator Optimizer (available in limited beta for select Cloudera Enterprise customers;
Learn the details about using Impala alongside Kudu.
Kudu (currently in beta), the new storage layer for the Apache Hadoop ecosystem, is tightly integrated with Impala, allowing you to insert, query, update, and delete data from Kudu tablets using Impala’s SQL syntax, as an alternative to using the Kudu APIs to build a custom Kudu application. In addition, you can use JDBC or ODBC to connect existing or new applications written in any language,