The Activity Monitoring feature in Cloudera Manager consolidates all Hadoop cluster activities into a single, real-time view. This capability lets you see who is running which activities on the Hadoop cluster, both at the current time and through historical activity views. Activities are either individual MapReduce jobs or jobs that are part of larger workflows (via Oozie, Hive or Pig).
Activity Monitoring provides many statistics – both in tabular displays and charts – about the resources used by individual Hadoop jobs and at the aggregate cluster level. The Comparison feature in Activity Monitoring shows the performance of the selected Hadoop job compared with the performance of other similar Hadoop jobs.
The Task Distribution chart maps the performance of task attempts, with one of a number of different measures on the Y-axis and the length of time taken to complete the task on the X-axis. This view helps you identify problems with the user code, data skew or slow-running TaskTrackers.
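To make the idea behind such a chart concrete, here is a minimal sketch in Python. It is not Cloudera Manager code and does not use any Cloudera Manager API; the function name, the field names (duration_s, input_mb) and the sample data are all hypothetical. It simply groups task attempts into grid cells by duration (X) and a chosen measure (Y), so that outliers stand apart from the main cluster.

```python
from collections import Counter

def bucket_task_attempts(attempts, measure, x_width, y_width):
    """Count task attempts per (duration bucket, measure bucket) grid cell.

    All names here are illustrative assumptions, not Cloudera Manager fields.
    """
    grid = Counter()
    for attempt in attempts:
        x = int(attempt["duration_s"] // x_width)  # X-axis: time to complete
        y = int(attempt[measure] // y_width)       # Y-axis: chosen measure
        grid[(x, y)] += 1
    return grid

# Hypothetical task attempts from a single job.
attempts = [
    {"id": "attempt_0", "duration_s": 42, "input_mb": 128},
    {"id": "attempt_1", "duration_s": 45, "input_mb": 130},
    {"id": "attempt_2", "duration_s": 48, "input_mb": 126},
    {"id": "attempt_3", "duration_s": 310, "input_mb": 900},  # large input: possible data skew
    {"id": "attempt_4", "duration_s": 295, "input_mb": 127},  # normal input, slow: possible slow node
]

grid = bucket_task_attempts(attempts, "input_mb", x_width=60, y_width=100)
for cell, count in sorted(grid.items()):
    print(cell, count)
```

In this made-up data, three attempts land in the same cell while two sit far to the right. One of those also sits high on the Y-axis, which hints at skewed input. The other has ordinary input but a long duration, which hints at a slow node or a problem in the code path it hit.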
Operational Reports provide a visualization of current and historical disk utilization by user, user group and directory. In addition, they track MapReduce activity on the Hadoop cluster by job, user, group or job ID. These reports are aggregated over selected time periods (hourly, daily, weekly, etc.) and can be exported as XLS or CSV files.
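As a rough illustration of what aggregating over a time period and exporting to CSV involves, here is a small Python sketch. It is not how Cloudera Manager builds its reports. The sample format, the function names, and the choice of peak usage per day as the aggregate are all assumptions made for the example.

```python
import csv
import io
from collections import defaultdict
from datetime import datetime

def daily_usage_by_user(samples):
    """Reduce (timestamp, user, bytes_used) samples to peak bytes per (day, user).

    Using the daily peak is an assumption; a real report might use an average.
    """
    peaks = defaultdict(int)
    for timestamp, user, bytes_used in samples:
        day = datetime.fromisoformat(timestamp).date().isoformat()
        peaks[(day, user)] = max(peaks[(day, user)], bytes_used)
    return sorted((day, user, used) for (day, user), used in peaks.items())

def to_csv(rows):
    """Render the aggregated rows as CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["day", "user", "peak_bytes"])
    writer.writerows(rows)
    return buf.getvalue()

# Hypothetical disk usage samples.
samples = [
    ("2012-01-10T09:00:00", "alice", 100),
    ("2012-01-10T17:00:00", "alice", 250),
    ("2012-01-10T12:00:00", "bob", 80),
    ("2012-01-11T09:00:00", "alice", 200),
]

rows = daily_usage_by_user(samples)
report = to_csv(rows)
print(report)
```

Switching the grouping key from day to hour or week would give the other aggregation periods mentioned above.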
Operational Reports are useful for presentations that explain how your Hadoop cluster is being used, for capacity planning and for chasing down users who are not cleaning up their temporary files. Here’s a previous blog post that explains how Operational Reports can aid in capacity planning: http://blog.cloudera.com/blog/2012/01/capacity-planning-with-cloudera-manager/
More information on Cloudera Manager is available here:
Previous Cloudera Manager demo video blog posts: