Welcome to our fourth edition of “This Month in the Ecosystem,” a digest of highlights from October 2013 (never intended to be comprehensive; for completeness, see Hadoop Weekly).
For sheer excitement, that month set a high bar for future months to meet:
- Hadoop 2 Became Stable with the 2.2 (“GA”) Release
The idea that this news could be done justice in a few sentences strains credulity. (Furthermore, it has already been covered widely and thoroughly elsewhere.) Suffice it to say that this release will transform the Big Data ecosystem by enabling Hadoop to serve as DNA for a wide variety of new Big Data applications and use cases. The community deserves sincere thanks and congratulations for working together to achieve this result.
Read more about the Hadoop 2 release | See advice for migrating (for users) | See advice for migrating (for operators)
- Strata + Hadoop World 2013 Rocked the Big Data Ecosystem
More than 3,000 people attended this year, making the conference bigger than several other mainstream tech conferences. Of particular note was Cloudera CSO Mike Olson’s prophecy of an Enterprise Data Hub premised on an open source data platform (Hadoop) and complementary security/data/systems management infrastructure. (Protip: Mike hasn’t been wrong yet.)
Read Mike Olson’s description of the Enterprise Data Hub | See Cloudera keynotes + presenter slides
- Cloudera Released Cloudera Enterprise 5 Beta
The perfect companion announcement to that prophecy, of course, was the beta release of Cloudera Enterprise 5 (in which Hadoop 2 is the default). Once it reaches GA, it will allow users to breathe life into their own Enterprise Data Hub.
Read more about/download the Cloudera Enterprise 5 Beta
- Cloudera Announced Support for Apache Spark on CDH
Spark, the in-memory processing framework that sits on top of HDFS (complementing or replacing MapReduce), is getting a lot of looks from Hadoop users who depend on fast data analytics. As part of a new program in which Cloudera will partner with companies commercializing interesting new open source projects (in this case, with Databricks), Spark will be formally supported on CDH.
Learn more about the Cloudera + Databricks announcement | See the Spark Summit 2013 Agenda
- Cascading and Spring for Apache Hadoop Were Verified for Compatibility/Certified (Respectively) with CDH
The more mainstream developers who have access to CDH as a platform, and the more APIs that help them get there, the better. +1.
Read more about the new certifications
- Apache HBase Reached 0.96
As HBase VP/0.96 release manager Michael Stack describes in the blog post referenced below, this new release is packed with new functionality for improved scalability, availability, operability, and more. Massive props to the diverse HBase community for reaching this milestone.
Read all the details about HBase 0.96 | Read the 0.94-0.96 upgrade guide
- The Cloudera Impala E-book from O’Reilly Hit the Internets
If you feel the need to digest the use cases for and inner workings of Impala via a relatively quick read — and yet get enough examples and technical detail to make it stick — you can’t do better than this 30-page e-book from O’Reilly Media (authored by Cloudera’s John Russell).
Read John Russell’s blog post | Download the e-book
- Apache Curator Became a Top-Level Apache Project
Curator is an interesting effort from some Netflix engineers to shield developers from the gory details of building distributed systems on top of Apache ZooKeeper. In October, it joined the ranks of top-level Apache projects. Congrats to the Curator team!
Learn more about Curator
The next installment of “This Month in the Ecosystem” will publish in early December.
Justin Kestelyn is Cloudera’s developer outreach director.