Cloudera Developer Blog · Spark Posts

Letting It Flow with Spark Streaming

Our thanks to Russell Cardullo and Michael Ruggiero, Data Infrastructure Engineers at Sharethrough, for the guest post below about its use case for Spark Streaming.

At Sharethrough, which offers an advertising exchange for delivering in-feed ads, we’ve been running on CDH for the past three years (after migrating from Amazon EMR), primarily for ETL. With the launch of our exchange platform in early 2013 and our desire to optimize content distribution in real time, our needs changed, yet CDH remains an important part of our infrastructure.

Apache Spark: A Delight for Developers

Sure, Spark is fast, but it also gives developers a positive experience they won’t soon forget.

Apache Spark is well known today for its performance benefits over MapReduce, as well as its versatility. However, another important benefit – the elegance of the development experience – gets less mainstream attention.

Why Apache Spark is a Crossover Hit for Data Scientists

Spark is a compelling multi-purpose platform for use cases that span investigative, as well as operational, analytics.

Data science is a broad church. I am a data scientist — or so I’ve been told — but what I do is actually quite different from what other “data scientists” do. For example, there are those practicing “investigative analytics” and those implementing “operational analytics.” (I’m in the second camp.)

Spark is Now Generally Available for Cloudera Enterprise

Cloudera is announcing the general availability of support for Spark, bringing interactive machine learning and stream processing to enterprise data hubs.

Cloudera is pleased to announce the immediate availability of its first release of Apache Spark for Cloudera Enterprise (comprising CDH and Cloudera Manager).

This Month (and Year) in the Ecosystem (December 2013)

Welcome to our sixth edition of “This Month in the Ecosystem,” a digest of highlights from December 2013 (never intended to be comprehensive; for completeness, see the excellent Hadoop Weekly).

With the close of 2013, we also thought it appropriate to include some high points from across the year (not listed in any particular order):

A New Web UI for Spark

The team behind Hue, the open source Web UI that makes Apache Hadoop easier to use, strikes again with a new Spark app.

Editor’s note: This post was recently published on the Hue blog. We republish it here for your convenience.

Putting Spark to Use: Fast In-Memory Computing for Your Big Data Applications

Our thanks to Databricks, the company behind Apache Spark (incubating), for providing the guest post below. Cloudera and Databricks recently announced that Cloudera will distribute and support Spark in CDH. Look for more posts describing Spark internals and Spark + CDH use cases in the near future.

Newer Posts