Cloudera Engineering Blog · Careers Posts

Top 10 Blog Posts of 2010

We blogged about 104 different topics in 2010, and we recently decided to look back and see what readers were most interested in. The topics ranged from technical updates on Cloudera’s Distribution for Apache Hadoop (CDH3b3 being the most recent) to upcoming Hadoop-related events and activities to practical insights for implementing Hadoop. We also featured a number of guest blog posts.

Here are the top 10 blog posts from 2010:

  1. How to Get a Job at Cloudera
    Cloudera is hiring around the clock, and this post highlights the best course of action to increase your chances of becoming a Clouderan.
  2. Why Europe’s Largest Ad Targeting Platform Uses Hadoop
    “As data volumes increased and performance suffered, we recognized a new approach was needed (Hadoop).” –Richard Hutton, Nugg.ad CTO
  3. What’s New in CDH3b2 Flume
    Flume, our data movement platform, was introduced to the world and released as open source.
  4. What’s New in CDH3b2 Hue
    Hue, a web UI for Hadoop, is a suite of web applications as well as a platform for building custom applications with a nice UI library.
  5. Natural Language Processing with Hadoop and Python
    Data volumes are increasing naturally from text (blogs) and speech (YouTube videos), posing new questions for Natural Language Processing. This involves making sense of large amounts of data in different forms and extracting useful insights.
  6. How Raytheon BBN Technologies Researchers are Using Hadoop to Build a Scalable, Distributed Triple Store
    Raytheon BBN Technologies built a cloud-based triple-store technology, known as SHARD, to address scalability issues in the processing and analysis of Semantic Web data.
  7. Cloudera’s Support Team Shares Some Basic Hardware Recommendations
    The Cloudera support team discusses workload evaluation and the critical role it plays in hardware selection.
  8. Integrating Hive and HBase
    Facebook explains integrating Hive and HBase to keep their warehouse up to date with the latest information published by users.
  9. Pushing the Limits of Distributed Processing
    Google built a 100,000-node Hadoop cluster running on Nexus One mobile phone hardware and powered by Android. The environmental cost of this solution is 1/100th that of running it within their data center. (April Fools)
  10. Using Flume to Collect Apache 2 Web Server Logs
    This post presents the common use case of using a Flume node to collect Apache 2 web server logs and deliver them to HDFS.

Cloudera Fun & Frightful Halloween Festivities

Here at Cloudera we embraced the lightheartedness of Halloween by hosting several activities, including an engineering hack-a-thon, a hack-a-pumpkin-a-thon, and a costume competition.

What is in our Kitchen?

If there is one thing that chefs are proud of, it’s their kitchens. Whether cavernous top-of-the-line affairs or cramped New York apartments, kitchens are the place where raw ingredients are combined with talent and hard work to produce results. The only difference in the world of software is what you will find in our kitchens.

How to Get a Job at Cloudera

We’re doing a lot of hiring at Cloudera — we have jobs open in operations, sales, engineering and elsewhere. Hiring well is hard work. We spend a lot of time on it, and have learned a lot about the kind of people we want to bring in. One of the best ways for us to do a good job of hiring is to help you do a good job of applying for a job here.

I’ll begin the post, though, by telling you what doesn’t work. Several times a day, we get an unsolicited email or phone message from a contingency recruiter like this one:
