Category Archives: CDH

Quality Assurance at Cloudera: Running/Upgrading to New Releases on Our Own EDH Cluster

Categories: CDH Cloudera Manager Testing

Learn why running real workloads on Cloudera’s internal EDH cluster is an important step in the overall QA process before releases.

At Cloudera, we strive to deliver a stable, reliable Apache Hadoop-based platform without sacrificing cutting-edge features. (See this post for an introduction to that process.)

In the past, we have written about how the Cloudera Support organization’s internal cluster helps improve the customer experience via CDH components such as Apache Impala (incubating) and Cloudera Search.

Read More

Apache Impala (incubating) in CDH 5.7: 4x Faster for BI Workloads on Apache Hadoop

Categories: CDH Impala Performance

Impala 2.5, now shipping in CDH 5.7, brings significant performance improvements and some highly requested features.

Impala has proven to be a high-performance analytics query engine since the beginning. Even as an initial production release in 2013, it demonstrated performance 2x faster than a traditional DBMS, and each subsequent release has continued to demonstrate the wide performance gap between Impala’s analytic-database architecture and SQL-on-Apache Hadoop alternatives.

Read More

Check Out Those New and Improved Cloudera Docs

Categories: CDH Cloudera Manager General

Cloudera has given its documentation set a facelift, and we think you’ll like the new look. We use more whitespace and a font that is easier to read and skim, and your pages load much faster. But the improvements go beyond the merely aesthetic.

While electronic documentation has been around for decades, most online documentation is still presented as if it were printed in books. There is a table of contents that assumes you will read the content from start to finish.

Read More

Quality Assurance at Cloudera: Fault Injection and Elastic Partitioning

Categories: CDH Testing

In this installment of our series about how quality assurance is done at Cloudera, learn about the important role of fault injection in the overall process.

Apache Hadoop is the consummate example of a scalable distributed system (SDS); such systems are designed to provide 24/7 services reliably and to scale elastically with the addition of industry-standard hardware cost-effectively. They must be resilient and fault-tolerant to various environmental anomalies.

As you would expect,

Read More

Cloudera Enterprise 5.7 is Released

Categories: CDH Cloudera Manager Cloudera Navigator Hive Spark

Cloudera Enterprise 5.7 is now generally available (comprising CDH 5.7, Cloudera Manager 5.7, and Cloudera Navigator 2.6).

Cloudera is excited to announce the general availability of Cloudera Enterprise 5.7! Main highlights of this release include production-ready Hive-on-Spark functionality, which will help users accelerate their use of Apache Spark as a data processing standard; 4x performance gains for Apache Impala (incubating); easier cluster configuration and utilization reporting; and end-to-end encryption for Apache Spark data.

Read More