Tag Archives: system administrators

Sneak Preview: HBaseCon 2015 Development & Internals Track

Categories: Community Events HBase

This year’s HBaseCon Development & Internals track covers new features in HBase 1.0, what’s to come in 2.0, best practices for tuning, and more.

In this post, I’ll give you a window into the HBaseCon 2015’s (May 7 in San Francisco) Development & Internals track.

hbasecon logo

Thanks, Program Committee!

  • “Meet HBase 1.0”

    Enis Söztutar (Hortonworks) &

Read More

5 Pitfalls of Benchmarking Big Data Systems

Categories: Hadoop Performance

Benchmarking Big Data systems is nontrivial. Avoid these traps!

Here at Cloudera, we know how hard it is to get reliable performance benchmarking results. Benchmarking matters because one of the defining characteristics of Big Data systems is the ability to process large datasets faster. “How large” and “how fast” drive technology choices, purchasing decisions, and cluster operations. Even with the best intentions, performance benchmarking is fraught with pitfalls—easy to get numbers,

Read More

How Improved Short-Circuit Local Reads Bring Better Performance and Security to Hadoop

Categories: Hadoop HDFS

One of the key principles behind Apache Hadoop is the idea that moving computation is cheaper than moving data — we prefer to move the computation to the data whenever possible, rather than the other way around. Because of this, the Hadoop Distributed File System (HDFS) typically handles many “local reads” reads where the reader is on the same node as the data:

Initially, local reads in HDFS were handled the same way as remote reads: the client connected to the DataNode via a TCP socket and transferred the data via DataTransferProtocol.

Read More

New E-Learning for Parcels

Categories: Cloudera Manager Training

Cloudera’s new Parcels installation format has been released, and I’m excited to highlight just how useful (and mind-blowingly cool) it is to system administrators and anyone responsible for maintaining a CDH cluster.

If you haven’t read about or played with Parcels, they make components of the distribution significantly easier to manage, install, and upgrade. The new Parcel distribution format works with Cloudera Manager 4.5 and later. When you perform installations and upgrades using Parcels,

Read More

Get a Free Hadoop Operations Ebook with Administrator Training

Categories: Books General Hadoop Training

Start the year off with bigger questions by taking advantage of Cloudera University’s special offer for aspiring Hadoop administrators. All participants who complete a Cloudera Administrator Training for Apache Hadoop public course by the end of March 2013 will receive a free digital copy of Hadoop Operations by Eric Sammer. If you’ve been asked to maintain large and complex Hadoop clusters, this book is a must.