Cloudera Engineering Blog · Community Posts

Community Meetups during Strata + Hadoop World 2014

The meetup opportunities during the conference week are more expansive than ever — spanning Impala, Spark, HBase, Kafka, and more.

Strata + Hadoop World 2014 is a kaleidoscope of experiences for attendees, and those experiences aren’t contained within the conference center’s walls. For example, the meetups that occur during the conference week (which is concurrent with NYC DataWeek) are a virtual track for developers, and with Strata + Hadoop World being bigger than ever, so is the scope of that track.

This Month in the Ecosystem (August 2014)

Welcome to our 12th (first annual!) edition of “This Month in the Ecosystem,” a digest of highlights from August 2014 (never intended to be comprehensive; for that, see the excellent Hadoop Weekly).

Running CDH 5 on GlusterFS 3.3

The following post was written by Jay Vyas (@jayunit100) and originally published in the Gluster.org Community.

I have recently spent some time getting Cloudera’s CDH 5 distribution of Apache Hadoop to work on GlusterFS 3.3 using Distributed Replicated 2 Volumes. This is possible because Apache Hadoop has a pluggable filesystem architecture, which allows the computational components within the CDH 5 distribution to be configured to use filesystems other than HDFS. In this case, one can configure CDH 5 to use the Hadoop FileSystem plugin for GlusterFS (glusterfs-hadoop), which allows it to run on GlusterFS 3.3. I’ve provided a diagram below that illustrates the CDH 5 core processes and how they interact with GlusterFS.

Progress Report: Cloudera Community Forums After One Year

Cloudera Community forums are proving their value as an important contributor to a rich user experience.

It’s been almost exactly one year since the debut of the Cloudera Community forums. In addition to doing the birthday shout-out, I thought it would be interesting to bring you up to date about adoption and usage patterns.

This Month in the Ecosystem (June 2014)

Welcome to our 10th edition of “This Month in the Ecosystem,” a digest of highlights from June 2014 (never intended to be comprehensive; for that, see the excellent Hadoop Weekly).

Pretty busy for early summer:

Apache Hive on Apache Spark: Motivations and Design Principles

Two of the most vibrant communities in the Apache Hadoop ecosystem are now working together to bring users a Hive-on-Spark option that combines the best elements of both.

Apache Hive is a popular SQL interface for batch processing and ETL using Apache Hadoop. Until recently, MapReduce was the only execution engine in the Hadoop ecosystem, and Hive queries could only run on MapReduce. But today, alternative execution engines to MapReduce are available — such as Apache Spark and Apache Tez (incubating).

Where to Find Cloudera Tech Talks (Through September 2014)

Find Cloudera tech talks in Texas, Oregon, Washington DC, Illinois, Georgia, Japan, and across the SF Bay Area during the next calendar quarter.

Below is our regularly scheduled quarterly update about where to find tech talks by Cloudera employees, this time for the third calendar quarter of 2014 (July through September; traditionally, the least active quarter of the year). Note that this list will be continually curated during the period; complete logistical information may not be available yet. And remember, many of these talks are in “free” venues (no cost of entry).

This Month in the Ecosystem (April 2014)

Welcome to our eighth edition of “This Month in the Ecosystem,” a digest of highlights from April 2014 (never intended to be comprehensive; for completeness, see the excellent Hadoop Weekly).

More good news!

HBaseCon 2014 is a Wrap!

HBaseCon 2014 is in the books. Thanks to all attendees, speakers, and sponsors!

HBaseCon 2014, much like a butterfly, lived for a short number of hours on Monday — but it certainly was beautiful while it lasted! (See photos here.)

Sneak Preview: "Case Studies" Track at HBaseCon 2014

The HBaseCon 2014 “Case Studies” track surfaces some of the most interesting (and diverse) use cases in the HBase ecosystem — and in the world of NoSQL overall — today.

HBaseCon 2014 (May 5, 2014 in San Francisco) is not just about internals and best practices; it’s also a place to explore use cases that you may not have even considered before.
