Category Archives: Testing

Cloudera Director and Spot Instances: Resilience and Repair

Categories: CDH, Cloud, Testing

Cloudera Director enables self-service provisioning and management of CDH and Cloudera Enterprise Data Hub in the cloud. Running Cloudera Enterprise on top of public cloud infrastructure allows you to pay only for the resources you need to meet your data processing demands.

Amazon Web Services (AWS) provides the ability to bid on spare Amazon EC2 computing capacity at a discount through Amazon EC2 Spot instances. With Cloudera Director, you can configure clusters to use Spot instances to improve workload execution time and save costs.

Read more

Quality Assurance at Cloudera: Highly-Controlled Disk Injection

Categories: CDH, Testing, Tools

Recently installed fault-injection techniques are making quality assurance processes yet more rigorous.

In a previous installment of our series about quality assurance inside Cloudera, we described the fault-injection frameworks (AgenTEST and Sapper) that Cloudera Engineering has devised. Together, these frameworks start and stop injections and determine when and how they should occur.

On that occasion, we presented a number of disk-related injections implemented in AgenTEST, including:

  • BurnIO: Runs disk-intensive processes,

Read more

Resolving Java Lock Contention in Apache Solr: A Performance-Analysis Detective Story

Categories: Performance, Search, Testing

This case study is an instructive example of how performance analysis is a multi-faceted process that often leads one in surprising directions.

Apache Solr Near Real Time (NRT) Search allows Solr users to search documents indexed just seconds ago. It’s a critical feature in many real-time analytics applications. As Solr indexes more and more documents in near real time, end-user expectations for performance get higher and higher.

However,

Read more

Quality Assurance at Cloudera: Distributed Unit Testing

Categories: Kudu, Testing, Tools

Cloudera Engineering has developed (and recently open-sourced) a distributed unit testing framework that cuts testing time from multiple hours to just 10 minutes.

Upstream unit tests are Cloudera’s first line of defense for finding and fixing software bugs, as part of a multidimensional process that also includes static/dynamic code analysis, fault injection, integration/scale/endurance testing, and validation on real workloads. However, running a full unit test suite for Apache Hadoop ecosystem components can take hours,

Read more

Quality Assurance at Cloudera: Running/Upgrading to New Releases on Our Own EDH Cluster

Categories: CDH, Cloudera Manager, Testing

Learn why running real workloads on Cloudera’s internal EDH cluster is an important step in the overall QA process before releases.

At Cloudera, we strive to deliver a stable, reliable Apache Hadoop-based platform without sacrificing cutting-edge features. (See this post for an introduction to that process.)

In the past, we have written about how the Cloudera Support organization’s internal cluster helps improve the customer experience via CDH components such as Apache Impala (incubating) and Cloudera Search.

Read more