Category Archives: Performance

The Truth About MapReduce Performance on SSDs

Categories: Hadoop, Hardware, MapReduce, Performance

Cost-per-performance, not cost-per-capacity, turns out to be the better metric for evaluating the true value of SSDs.
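To make the two metrics concrete, here is a minimal sketch of how they can rank the same pair of drives in opposite order. The `Drive` class and every price, capacity, and throughput figure below are hypothetical placeholders chosen for illustration; they are not results from the Cloudera study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Drive:
    """Hypothetical drive description; all values are illustrative only."""
    name: str
    price_usd: float
    capacity_tb: float
    throughput_mb_s: float  # assumed sequential throughput in MB/s

    def cost_per_tb(self) -> float:
        # Cost-per-capacity: dollars paid for each terabyte stored.
        return self.price_usd / self.capacity_tb

    def cost_per_mb_s(self) -> float:
        # Cost-per-performance: dollars paid for each MB/s of throughput.
        return self.price_usd / self.throughput_mb_s


# Placeholder numbers, NOT measurements from any real hardware.
hdd = Drive("hdd", price_usd=200.0, capacity_tb=4.0, throughput_mb_s=100.0)
ssd = Drive("ssd", price_usd=800.0, capacity_tb=1.0, throughput_mb_s=1000.0)

for drive in (hdd, ssd):
    print(f"{drive.name}: ${drive.cost_per_tb():.2f}/TB, "
          f"${drive.cost_per_mb_s():.2f} per MB/s")
```

With these made-up inputs the HDD is cheaper per terabyte while the SSD is cheaper per unit of throughput, which is the kind of reversal the choice of metric can produce.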

In the Big Data ecosystem, solid-state drives (SSDs) are increasingly considered a viable, higher-performance alternative to rotational hard-disk drives (HDDs). However, few results from actual testing are available to the public.

Recently, Cloudera engineers conducted such a study using a combination of SSDs and HDDs, with the goal of determining to what extent SSDs accelerate different MapReduce workloads,

Read More

How-to: Select the Right Hardware for Your New Hadoop Cluster

Categories: Hadoop, Hardware, How-to, Performance, Use Case

One of the first questions Cloudera customers raise when getting started with Apache Hadoop is how to select appropriate hardware for their new Hadoop clusters.

Although Hadoop is designed to run on industry-standard hardware, recommending an ideal cluster configuration is not as easy as delivering a list of hardware specifications. Selecting hardware that provides the best balance of performance and economy for a given workload requires testing and validation. (For example, users with I/O-intensive workloads will invest in more spindles per core.)

In this blog post,

Read More