Cloudera Blog · Books Posts
It’s always a great thing for everybody when the experts are willing and eager to share.
So, it’s with special pleasure that I can point you toward a new three-part series by Cloudera’s own Tom White (@tom_e_white) to be published in Dr. Dobb’s, long one of the publications of record in the mainstream developer world, and one from which many early programmers learned basics like BASIC. Now, Dobb’s turns its attention to Apache Hadoop, which says a lot about Hadoop’s continuing adoption.
Tom, of course, is the author of the O’Reilly best-seller Hadoop: The Definitive Guide, and few people have a better record of being both knowledgeable and helpful for those who want to learn “how to Hadoop”.
Start the year off with bigger questions by taking advantage of Cloudera University’s special offer for aspiring Hadoop administrators. All participants who complete a Cloudera Administrator Training for Apache Hadoop public course by the end of March 2013 will receive a free digital copy of Hadoop Operations by Eric Sammer. If you’ve been asked to maintain large and complex Hadoop clusters, this book is a must. In addition to providing practical guidance from an expert, Hadoop Operations is also a terrific companion reference to the full Cloudera Administrator course.
Cloudera’s three-day course gives administrators a comprehensive understanding of all the steps necessary to operate and manage Hadoop clusters. From installation and configuration through load balancing and cluster tuning, Cloudera’s administration course has you covered. This course is appropriate for system administrators and others who will be setting up or maintaining a Hadoop cluster. Basic Linux experience is a prerequisite, but prior knowledge of Hadoop is not required.
Upon completion of the course, attendees also receive a voucher for a Cloudera Certified Administrator for Apache Hadoop (CCAH) exam. Certification is a great differentiator; it helps establish individuals as leaders in their field, providing customers with tangible evidence of skills and expertise.
Today we bring you a brief interview with Alex Holmes, author of the new book, Hadoop in Practice (Manning). You can learn more about the book and download a free sample chapter here.
There are a few good Hadoop books on the market right now. Why did you decide to write this book, and how is it complementary to them?
When I started working with Hadoop, I leaned heavily on Tom White’s excellent book, Hadoop: The Definitive Guide (O’Reilly Media), to learn about MapReduce and how the internals of Hadoop worked. As my experience grew and I started working with Hadoop in production environments, I had to figure out how to solve problems such as moving data in and out of Hadoop, using compression without destroying data locality, performing advanced joins, and so on. These topics didn’t have a lot of coverage in existing Hadoop books, and that’s really the idea behind Hadoop in Practice – it’s a collection of real-world recipes that I learned the hard way over the years.
Hadoop in Practice covers more advanced aspects of working with Hadoop, such as MapReduce and HDFS patterns, performance tuning, and debugging. The book also looks at how Hadoop can be used as a platform for data science and for data warehousing by studying R integration techniques and intermediate Pig and Hive recipes. Data mining is another important topic today, and a book on Hadoop isn’t complete without a look at how Mahout lets you run your favorite algorithms at scale.
Apache HBase junkies, this one’s for you: I recently had the opportunity for a quick chat with Nick Dimiduk and Cloudera’s Amandeep Khurana, the authors of HBase in Action (Manning Publications; a sample chapter is available as a PDF download).
Why did you write HBase in Action?