Author Archives: Philip Zeyliger

How Does Cloudera Manager Work?

Categories: CDH, Cloudera Manager, Hadoop, Ops and DevOps

At Cloudera, we believe that Cloudera Manager is the best way to install, configure, manage, and monitor your Apache Hadoop stack. Of course, most users prefer not to take our word for it; they first want to know how Cloudera Manager works under the covers.

In this post, I’ll explain some of its inner workings. 

The Vocabulary of Cloudera Manager

The image below illustrates the basic nouns and relationships of Cloudera Manager:

A “deployment”

How Raytheon BBN Technologies Researchers are Using Hadoop to Build a Scalable, Distributed Triple Store

Categories: Guest

This post was contributed by Kurt Rohloff, a researcher in the Information and Knowledge Technologies group of Raytheon BBN Technologies, a wholly owned subsidiary of Raytheon Company.

Using Hadoop to Build a Scalable, Distributed Triple Store

The driving idea behind the Semantic Web is to provide a web-scale information sharing model and platform. One of the singular advancements over the past several years in the Semantic Web domain has been the explosion of data available in semantic formats

Trip Report: Utah Java User’s Group

Categories: General

One of the fun things about working at a company that’s involved in an exciting open-source project like Hadoop is that, surprisingly often, you get invited to talk about it. On February 18th, I presented Hadoop (slides) to the folks at the Utah Java User’s Group. A little over one hundred people were in attendance. There was pizza, a talk about Android development, my talk about Hadoop,

Hadoop Default Ports Quick Reference

Categories: General, Hadoop

Editor’s note (Oct. 3, 2013): The information below is now deprecated. We recommend that you consult this documentation for ports info instead.

Is it 50030 or 50300 for that JobTracker UI? I can never remember!

Hadoop’s daemons expose a handful of ports over TCP. Some of these ports are used by Hadoop’s daemons to communicate among themselves (to schedule jobs, replicate blocks, etc.). Other ports listen directly to users,
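If you can never remember which port a daemon is on, one rough way to find out is to try a TCP connection to each candidate. The sketch below is only an illustration, not part of Hadoop: the helper names `is_port_open` and `find_listening_ports` and the `localhost` host are made up for this example, and the candidate ports are simply the two numbers from the question above.

```python
import socket

# Assumption: the two candidates come from the question in the post.
# 50030 was the stock JobTracker web UI port in older Hadoop releases,
# but a cluster may be configured differently.
CANDIDATE_PORTS = [50030, 50300]


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


def find_listening_ports(host, ports, timeout=1.0):
    """Return the subset of ports that accept a TCP connection on host."""
    return [p for p in ports if is_port_open(host, p, timeout)]


if __name__ == "__main__":
    host = "localhost"  # hypothetical JobTracker host
    print(find_listening_ports(host, CANDIDATE_PORTS))
```

A successful connection only shows that something is listening on that port; it does not confirm which daemon it is.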

Configuring Eclipse for Apache Hadoop Development (a screencast)

Categories: Data Ingestion, General, HDFS, Training

Update (added 5/15/2013): The information below is dated; see this post for current instructions about configuring Eclipse for Hadoop contributions.

One of the perks of using Java is the availability of functional, cross-platform IDEs. I use vim for my daily editing needs, but when it comes to navigating, debugging, and coding large Java projects, I fire up Eclipse.

Typically, when you’re developing MapReduce applications,
