Category Archives: Oozie

What’s New in Hue 2.2?

Categories: CDH General Hue Oozie

This post is about the new release of Hue, an open source web-based interface that makes Apache Hadoop easier to use, that’s included in CDH4.2.

Hue lets you interact with Hadoop services from within your browser without having to go to a command-line interface. It features a file browser for HDFS, an Apache Oozie Application for creating workflows of data processing jobs, a job designer/browser for MapReduce,

Read More

How-To: Schedule Recurring Hadoop Jobs with Apache Oozie

Categories: Guest Hive Oozie

Our thanks to guest author Jon Natkins (@nattyice) of WibiData for the following post!

Today, many (if not most) companies have ETL or data enrichment jobs that are executed on a regular basis as data becomes available. In this scenario it is important to minimize the lag time between data being created and being ready for analysis.

CDH, Cloudera’s open-source distribution of Apache Hadoop and related projects,

Read More

Apache Hadoop in 2013: The State of the Platform

Categories: Avro CDH Flume Hadoop HBase HDFS Hive Hue Impala Mahout MapReduce Oozie Pig Sqoop YARN ZooKeeper

For several good reasons, 2013 is a Happy New Year for Apache Hadoop enthusiasts.

In 2012, we saw continued progress on developing the next generation of the MapReduce processing framework (MRv2), work that will bear fruit this year. HDFS experienced major progress toward becoming a lights-out, fully enterprise-ready distributed filesystem with the addition of high availability features and increased performance. And a hint of the future of the Hadoop platform was provided with the Beta release of Cloudera Impala,

Read More

How-to: Use the ShareLib in Apache Oozie

Categories: How-to MapReduce Oozie

Ed. Note: The post below pertains to CDH 4.x only. Read this post for updates concerning CDH 5.x.

As Apache Oozie, the workflow engine for Apache Hadoop, continues to receive wider adoption from our customers and the community, we’re seeing patterns with respect to the biggest challenges for users. One such point of difficulty is setting up and using Oozie’s ShareLib for allowing JARs to be shared by different workflows.

Read More