Author Archives: Todd Lipcon

Quorum-based Journaling in CDH4.1

Categories: CDH, General, HDFS

A few weeks back, Cloudera announced CDH 4.1, the latest update release to Cloudera’s Distribution including Apache Hadoop. This is the first release to introduce truly standalone High Availability for the HDFS NameNode, with no dependence on special hardware or external software. This post explains the inner workings of this new feature from a developer’s standpoint. If, instead, you are seeking information on configuring and operating this feature, please refer to the CDH4 High Availability Guide.


Avoiding Full GCs in Apache HBase with MemStore-Local Allocation Buffers: Part 3

Categories: General, HBase

This is the third and final post in a series detailing a recent improvement in Apache HBase that helps to reduce the frequency of garbage collection pauses. Be sure you’ve read part 1 and part 2 before continuing on to this post.

Recap

It’s been a few days since the first two posts, so let’s start with a quick refresher. In the first post we discussed Java garbage collection algorithms in general and explained that the problem of lengthy pauses in HBase has only gotten worse over time as heap sizes have grown.


Avoiding Full GCs in HBase with MemStore-Local Allocation Buffers: Part 2

Categories: HBase

This is the second post in a series detailing a recent improvement in Apache HBase that helps to reduce the frequency of garbage collection pauses. Be sure you’ve read part 1 before continuing on to this post.

Recap from Part 1

In last week’s post, we noted that HBase has had problems coping with long garbage collection pauses, and we summarized the different garbage collection algorithms commonly used for HBase on the Sun/Oracle Java 6 JVM.


Avoiding Full GCs in Apache HBase with MemStore-Local Allocation Buffers: Part 1

Categories: General, HBase

Today, rather than discussing new projects or use cases built on top of CDH, I’d like to switch gears a bit and share some details about the engineering that goes into our products. In this post, I’ll explain the MemStore-Local Allocation Buffer, a new component in the guts of Apache HBase that dramatically reduces the frequency of long garbage collection pauses. While you won’t need to understand these details to use Apache HBase, I hope this post provides an interesting view into the kind of work that engineers at Cloudera do.

CDH3 Beta 4 Now Available

Categories: CDH, General

Cloudera is happy to announce the fourth beta release of Cloudera’s Distribution for Apache Hadoop version 3 — CDH3b4. As usual, we’d like to share a few highlights from this release.

Since this will be the last beta before we designate CDH3 stable, our focus for this release has been on stability, security, and scalability.

Stability and ease of use

Since we released CDH3 Beta 3 in October,
