This Month in the Ecosystem (April 2014)

Categories: Community

Welcome to our eighth edition of “This Month in the Ecosystem,” a digest of highlights from April 2014 (never intended to be comprehensive; for completeness, see the excellent Hadoop Weekly).

More good news!

  • HBaseCon 2014 wrapped up on May 5, and the Apache HBase ecosystem showed itself to be in its strongest shape ever, with Apache Phoenix, OpenTSDB, Kiji, and now Tasmo all going strong.
  • The community released Apache Hadoop 2.4.0. Among other things, it includes ACL support in HDFS and automatic failover for ResourceManager HA in YARN. (Read more about YARN HA here.)
  • Cloudera’s chief architect (and Apache Hadoop co-founder), Doug Cutting, kicked off a public conversation about the future of data privacy. Cutting argues that it is important that “we define abuse of collected data, ensure that we can detect it when it occurs, and punish it effectively.”
  • Sprunch, a new Scala API on top of Apache Crunch, appeared. Sprunch is designed to “remove boilerplate from Java + Crunch whilst adding as little complexity as possible.”
  • Cloudera introduced Cloudera Live, a public, on-demand instance of CDH 5 with sample queries and examples. Cloudera Live provides much of what you would get from a VM sandbox, only with no downloads, installs, or waiting.
  • Apache Tajo, a SQL-on-Hadoop project, graduated from the ASF Incubator. The Tajo community also recently added native Parquet support.
  • Uri Laserson, a data scientist at Cloudera, described his impyla Python client for Cloudera Impala.

That’s all for this month, folks!

Justin Kestelyn is Cloudera’s developer outreach director.