Category Archives: Cloudera Labs

How Edmunds.com Used Spark Streaming to Build a Near Real-Time Dashboard

Categories: Cloudera Labs, Flume, Guest, Spark, Use Case

Thanks to Sam Shuster, Software Engineer at Edmunds.com, for the guest post below about his company’s use case for Spark Streaming, SparkOnHBase, and Morphlines.

Every year, the Super Bowl brings parties, food, and (hopefully) a great game to satisfy everyone’s football appetite until the fall. Because the event draws around 114 million viewers, with larger numbers each year, Americans have also grown accustomed to commercials with production budgets on par with television shows and entertainment value that tries to rival the game itself.


Download the Hive-on-Spark Beta

Categories: Cloudera Labs, Hive, Spark

A Hive-on-Spark beta is now available via CDH parcel. Give it a try!

The Hive-on-Spark project (HIVE-7292) is one of the most watched projects in Apache Hive history. It has attracted developers from across the ecosystem, including from organizations such as Intel, MapR, IBM, and Cloudera, and gained critical help from the Spark community.

Many anxious users have inquired about its availability in the last few months.


New in Cloudera Labs: Google Cloud Dataflow on Apache Spark

Categories: Cloudera Labs, Spark

Cloudera and Google are collaborating to bring Google Cloud Dataflow to Apache Spark users (and vice-versa). This new project is now incubating in Cloudera Labs!

“The future is already here—it’s just not evenly distributed.” —William Gibson

For the past decade, a lot of the future has been concentrated at Google’s headquarters in Mountain View. Because of the scale of its operations, Google usually bumped up against the limitations of the current state-of-the-art before anyone else,


New in Cloudera Labs: SparkOnHBase

Categories: Cloudera Labs, HBase, Spark

As we progressively move from MapReduce to Spark, we shouldn’t have to give up good HBase integration. Hence the newest Cloudera Labs project, SparkOnHBase!

[Ed. Note: In Aug. 2015, SparkOnHBase was committed to the Apache HBase trunk in the form of a new HBase-Spark module.]

Apache Spark is making a huge impact across our industry, changing the way we think about batch processing and stream processing.
