Category Archives: CDH

Sneak Peek into Skybox Imaging’s Cloudera-powered Satellite System

Categories: CDH Use Case

This is a guest post by Oliver Guinan, VP Ground Software, at Skybox Imaging. Oliver is a 15-year veteran of the internet industry and is responsible for all ground system design, architecture and implementation at Skybox.

One of the great promises of the big data movement is using networks of ubiquitous sensors to deliver insights about the world around us. Skybox Imaging is attempting to do just that for millions of locations across our planet.

Read more

What’s New in CDH4.1 Pig

Categories: CDH Pig

Apache Pig is a platform for analyzing large data sets that provides a high-level language called Pig Latin. Pig users can write complex data analysis programs in an intuitive and compact manner using Pig Latin.

Among many other enhancements, CDH4.1, the newest release of Cloudera’s open-source Hadoop distro, upgrades Pig from version 0.9 to version 0.10. This post provides a summary of the top seven new features introduced in CDH4.1 Pig.

Read more

Analyzing Twitter Data with Apache Hadoop, Part 2: Gathering Data with Flume

Categories: CDH Flume Hadoop How-to Oozie Use Case

This is the second article in a series about analyzing Twitter data using some of the components of the Hadoop ecosystem available in CDH, Cloudera’s open-source distribution of Apache Hadoop and related projects. In the first article, you learned how to pull CDH components together into a single cohesive application, but to really appreciate the flexibility of each of these components, we need to dive deeper.

Every story has a beginning,

Read more

How-to: Set Up an Apache Hadoop/Apache HBase Cluster on EC2 in (About) an Hour

Categories: CDH Cloud Cloudera Manager How-to

Note (added July 8, 2013): The information below is deprecated; we suggest that you refer to this post for current instructions.

Today we bring you one user’s experience using Apache Whirr to spin up a CDH cluster in the cloud. This post was originally published here by George London (@rogueleaderr) based on his personal experiences; he has graciously allowed us to bring it to you here as well in a condensed form.

Read more