In Building and Deploying MR2, we presented a brief introduction to MapReduce in Apache Hadoop 0.23 and focused on the steps to set up a single-node cluster. In MapReduce 2.0 in Hadoop 0.23, we discussed the new architectural aspects of the MapReduce 2.0 design. This blog post highlights the main issues to consider when migrating from MapReduce 1.0 to MapReduce 2.0. Note that both MapReduce 1.0 and MapReduce 2.0 are included in CDH4.
At 5 pm PDT on June 30, a leap second was added to Coordinated Universal Time (UTC). Within an hour, Cloudera Support started receiving reports of systems running at 100% CPU utilization. The Support Team worked quickly to understand and diagnose the problem and soon published a solution. Bugs due to the leap second, coupled with the Amazon Web Services outage, would make this Cloudera’s busiest support weekend to date.
Since Hadoop is written in Java and closely interoperates with the underlying OS,
Apache HBase is the Hadoop database, and is based on the Hadoop Distributed File System (HDFS). HBase makes it possible to randomly access and update data stored in HDFS, but files in HDFS can only be appended to and are immutable after they are created. So you may ask, how does HBase provide low-latency reads and writes? In this blog post, we explain this by describing the write path of HBase —
On Tuesday, June 12th, the Churchill Club of Silicon Valley hosted a panel discussion on Hadoop’s evolution from an open-source project to a standard component of today’s enterprise computing fabric. The lively and dynamic discussion was moderated by Cade Metz, Editor, Wired Enterprise.
Michael Driscoll, CEO, Metamarkets
Andrew Mendelsohn, SVP, Oracle Server Technologies
Mike Olson, CEO, Cloudera
I’m very pleased to announce the immediate General Availability of CDH4 and Cloudera Manager 4 (part of the Cloudera Enterprise 4.0 subscription). These releases are an exciting milestone for Cloudera customers, Cloudera users and the open source community as a whole.
Both CDH4 and Cloudera Manager 4 are chock-full of new features, many of which will appeal to enterprises looking to move more important workloads onto the Apache Hadoop platform.