Author Archives: Omer

How I found Apache Hadoop

Categories: Careers Community Guest

This is a guest post contributed by Loren Siebert. Loren is a San Francisco entrepreneur and software developer, and is currently the technical lead for the USASearch program.

A year ago I rolled my first Apache Hadoop system into production. Since then, I’ve spoken to quite a few people who are eager to try Hadoop themselves in order to solve their own big data problems. Despite having similar backgrounds and data problems,

Read More

My Summer Internship at Cloudera

Categories: Careers General

This post was written by Daniel Jackoway following his internship at Cloudera during the summer of 2011.

When I started my internship at Cloudera, I knew almost nothing about systems programming or Apache Hadoop, so I had no idea what to expect. The most important lesson I learned is that structured data is great as long as it is perfect, with the addendum that it is rarely perfect.

My project was to develop a unified view of our customer data.

Read More

Apache Hadoop Applied

Categories: General Use Case

BusinessWeek recently published a fascinating article on Apache Hadoop and Big Data, interviewing several Cloudera customers as well as our CEO Mike Olson. One of the things that has consistently exceeded our expectations is the diversity of industries that are adopting Hadoop to solve impressive business challenges and create real value for their organizations. Two distinct use cases for Hadoop have emerged across these industries. Though these have different names in each industry,

Read More

Apache HBase Do’s and Don’ts

Categories: CDH Community HBase

I recently gave a talk at the LA Hadoop User Group about Apache HBase Do’s and Don’ts. The audience was excellent and had very informed and well-articulated questions. Jody from Shopzilla was an excellent host and I owe him a big thanks for giving me the opportunity to speak with over 60 LA Hadoopers. Since not everyone lives in LA or could make it to the meetup, I’ve summarized some of the salient points here.

Read More

Pushing the Limits of Distributed Processing

Categories: General

Lately we’ve been sharing stories about customers and how they’re using and benefiting from Hadoop. For example, last week we saw how Raytheon researchers are using Hadoop to build a scalable, distributed triple store. This week’s war story comes from the inventor of MapReduce, Google, which is using MapReduce to reduce its map tile image files.

Apache Hadoop has been making waves of excitement in the industry for several years,

Read More