Tag Archives: install hadoop

Exploring Compression for Hadoop: One DBA’s Story

Categories: General, Guest, HDFS, Ops and DevOps

This guest post comes to us courtesy of Gwen Shapira (@gwenshap), a database consultant for The Pythian Group (and an Oracle ACE Director).

Most western countries use street names and numbers to navigate inside cities. But in Japan, where I live now, very few streets have them.

Sometimes solving technical problems is similar to navigating a city without many street names: Once you arrive at the desired location,


Update on Apache Bigtop (incubating)

Categories: Bigtop, General

Introduction

Ever since Cloudera decided to contribute the code and resources for what would later become Apache Bigtop (incubating), we’ve been answering a very basic question: what exactly is Bigtop and why should you or anyone in the Apache (or Hadoop) community care? The earliest and the most succinct answer (the one used for the Apache Incubator proposal) simply stated that “Bigtop is a project for the development of packaging and tests of the Hadoop ecosystem”.


Setting up CDH3 Hadoop on my new Macbook Pro

Categories: Community, Hadoop

This is a guest re-post courtesy of Arun Jacob, Data Architect at Disney. Prior to that, he was an engineer at RichRelevance and Evri. For the last couple of years, Arun has been focused on data mining and information extraction, using a mix of custom and open source technologies.

A New Machine

I’m fortunate enough to have recently received a MacBook Pro with a 2.8 GHz dual-core Intel processor and 8GB of RAM. This is the third time I’ve turned a vanilla Mac into a ninja coding machine,


Map-Reduce With Ruby Using Apache Hadoop

Categories: Hadoop, MapReduce

Guest re-post from Phil Whelan, a large-scale web-services consultant based in Vancouver, BC.

Map-Reduce With Hadoop Using Ruby
Here I demonstrate, with repeatable steps, how to fire up a Hadoop cluster on Amazon EC2, load data into HDFS (the Hadoop Distributed File System), write map-reduce scripts in Ruby, and use them to run a map-reduce job on your Hadoop cluster. You will not need to ssh into the cluster, as all tasks are run from your local machine.


Lessons learned putting Hadoop into production

Categories: Hadoop

Webinar: December 8th, 10-11:00am PT, 1-2:00pm ET

Presenter: Eric Sammer, Cloudera Solution Architect

Many Apache Hadoop deployments begin as small test clusters, serving either as an electronic sandbox for analyzing data in new ways or as a way to solve a small, specific business problem. Typically, as more use cases are discovered, more data is loaded into the cluster. Consequently, the clusters grow to provide expanded capacity to the organization.
