Category Archives: Hardware

How-to: Select the Right Hardware for Your New Hadoop Cluster

Categories: Hadoop Hardware How-to Performance Use Case

One of the first questions Cloudera customers raise when getting started with Apache Hadoop is how to select appropriate hardware for their new Hadoop clusters.

Although Hadoop is designed to run on industry-standard hardware, recommending an ideal cluster configuration is not as easy as delivering a list of hardware specifications. Selecting hardware that provides the best balance of performance and economy for a given workload requires testing and validation. (For example, users with IO-intensive workloads will invest in more spindles per core.)

In this blog post,

Read More

Configuring Impala and MapReduce for Multi-tenant Performance

Categories: CDH Hardware Impala

Cloudera Impala has many exciting features, but one of the most impressive is the ability to analyze data in multiple formats, with no ETL needed, in HDFS and Apache HBase. Furthermore, you can use multiple frameworks, such as MapReduce and Impala, to analyze that same data. Consequently, Impala will often run side-by-side with MapReduce on the same physical hardware, with both supporting business-critical workloads. For such multi-tenant clusters, Impala and MapReduce both need to perform well despite potentially conflicting demands for cluster resources.

Read More