Tag Archives: mac

How-to: Install Hue on a Mac

Categories: How-to Hue

Learn how to set up Hue, the open source GUI that makes Apache Hadoop easier to use, on your Mac.

You might already have all the prerequisites installed, but we are going to show how to start from a fresh Yosemite (10.10) install and end up with Hue running on your Mac in almost no time!

We are going to use the official Quickstart VM from Cloudera, which already packs all the Apache Hadoop ecosystem components your Hue will talk to.

Read more

How To: Use Oozie Shell and Java Actions

Categories: General How-to Oozie Pig

Ed. Note (Oct. 16, 2015): This post has been updated for CDH 5.x; some external links have been updated as well.

Apache Oozie, the workflow coordinator for Apache Hadoop, has actions for running MapReduce, Apache Hive, Apache Pig, Apache Sqoop, and DistCp jobs; it also has a Shell action and a Java action. These last two actions allow us to execute any arbitrary shell command or Java code,

Read more

CDH 3 Demo VM installation on Mac OS X using VirtualBox

Categories: Guest Hadoop

The first task is to ensure that your system is up-to-date.

This procedure has been tested on the following configuration:

  • Fully up-to-date Snow Leopard 10.6.7
  • Oracle VM VirtualBox for Mac OS X, updated or installed to version 4.0.8 (Virtualbox 4.0.8-71778-OSX)
  • The browser used is Safari.
  • The Demo VM has been downloaded to the default download location for Safari (i.e. the “Downloads”

Read more

Setting up CDH3 Hadoop on my new Macbook Pro

Categories: Community Hadoop

This is a guest re-post courtesy of Arun Jacob, Data Architect at Disney; prior to that, he was an engineer at RichRelevance and Evri. For the last couple of years, Arun has been focused on data mining/information extraction, using a mix of custom and open source technologies.

A New Machine

I’m fortunate enough to have recently received a MacBook Pro, 2.8 GHz Intel dual core, with 8GB RAM. This is the third time I’ve turned a vanilla Mac into a ninja coding machine,

Read more

Hadoop at Twitter (part 1): Splittable LZO Compression

Categories: General

This summer I sent the following tweet: “Had lunch today at Twitter HQ. Thanks for the invite, @kevinweil! Great lunch conversation. Smart, friendly and fun team.” Kevin Weil leads the analytics team at Twitter and is an active member of the Hadoop community, and his colleague Eric Maland leads Operations. Needless to say, Twitter is doing amazing things with Hadoop. This guest blog from Kevin and Eric covers one of Twitter’s open-source projects, which provides splittable LZO compression for Hadoop.

Read more