Now that Apache Hadoop is seven years old, use-case patterns for Big Data have emerged. In this post, I’m going to describe the three main ones (reflected in the post’s title) that we see across Cloudera’s growing customer base.
Transformations (T, for short) are a fundamental part of BI systems: They are the process through which data is converted from a source format (which can be relational or otherwise) into a relational data model that can be queried via BI tools.
Last Thursday, I had the pleasure of attending the Crunchies with Alan Saldich (our VP of Marketing) and Sarah Mustarde (our Senior Director of Corporate Marketing). The Crunchies is an awards event a la “the Oscars” but for startups.
Cloudera was nominated for the “Sexiest Enterprise Startup” award. This was the 6th Annual Crunchies but the first time that TechCrunch had a slot for enterprise software,
(guest blog post by Pete Skomoroch)
In a previous post, I outlined how to build a basic trend tracking site called trendingtopics.org with Cloudera’s Distribution for Hadoop and Hive. TrendingTopics uses Hadoop to identify the top articles trending on Wikipedia and displays related news stories and charts. The data powering the site was pulled from an Amazon EBS Wikipedia Public Dataset containing 8 months of hourly pageview logfiles.
At Cloudera, we frequently work with leading Hadoop developers to produce guest blog posts of general interest to the community. We started a project with Pete Skomoroch a while back, and we were so impressed with his work, we’ve decided to bring Pete on as a regular guest blogger. Pete can show you how to do some pretty amazing things with Hadoop, Pig and Hive and has a particular bias towards Amazon EC2.
(guest blog post by Dmitriy Ryaboy)
A number of organizations donate server space and bandwidth to the Apache Foundation; when you download Apache Hadoop, Tomcat, Maven, CouchDB, or any of the other great Apache projects, the bits are sent to you from a large list of mirrors. One of the ways in which Cloudera supports the open source community is to host such a mirror.