This Month in the Ecosystem (February 2014)

Welcome to our sixth edition of “This Month in the Ecosystem,” a digest of highlights from February 2014. (It is never intended to be comprehensive; for completeness, see the excellent Hadoop Weekly.)

February being a short month, the list is relatively short — but never confuse quantity with quality!

  • Hadoop 2.3.0 was released — with HDFS caching among the most prominent new features. (The imminent Cloudera Enterprise 5 GA release will be based on 2.3.0.)
  • Apache Spark graduated from the Apache Incubator to become a Top-Level Project. Congrats to Sparkies everywhere!
  • Native support for Parquet, the open source, general-purpose columnar storage format for Apache Hadoop (co-founded by Cloudera and Twitter), became official in Apache Hive. Parquet is well on its way to becoming an ecosystem standard: support is now available in Impala, Hive, Spark, Apache Pig, Apache Crunch, and Cascading, with more to come.
  • The speakers and schedule for ApacheCon 2014 (April 7-9, in Denver) were announced. Clouderans representing the Hadoop ecosystem include Jarek Jarcec Cecho, Abe Elmahrek, Colin McCabe, Sean Mackrory, Mark Miller, Brock Noland, and Hari Shreedharan.
  • Apache HBase 0.98.0 was released. Some nice security sugar in there, including cell-level access control and visibility labels.

That’s all for this month, folks!

Justin Kestelyn is Cloudera’s developer outreach director.