Hadoop Log Location and Retention

This post is a follow-up to an earlier one called Apache Hadoop Log Files: Where to find them in CDH, and what info they contain. It nicely lists the various places Hadoop uses to store log and other info files while it is running. Over time, though, we have seen changes in different Hadoop and CDH releases that affect where files are stored or how long they are retained. What follows strives to document the status quo (no, not the band) of all the log and info files found while running a Hadoop cluster.

First, some general notes: the logs are stored in a few different places and retained for a certain amount of time (or forever). Below are the categories and the various settings that control them. All of the parameters named go into mapred-site.xml unless noted otherwise.

If necessary, the following codes are used to identify changes in various releases (ordered by Hadoop version): CDH2 (Hadoop-0.20.1-169.89), H20 (Hadoop 0.20.2), CDH3b1, CDH3b2, CDH3b3, and H21 (Hadoop 0.21.0).

Unless noted otherwise, the description is for vanilla Hadoop 0.20.2 (H20); in the case of configuration keys it applies to all releases except the ones named specifically. Note: CDH2 (Hadoop-0.20.1-169.89) works like CDH3b1 and CDH3b2, so all notes applying to those two versions also apply to CDH2.

Daemon Logs

The actual Java daemons use log4j and the DailyRollingFileAppender, which does not have retention settings (other appenders do, though, for example the RollingFileAppender). The only way to keep the logs in check is to either change log4j.properties to use a different appender or to run a cron job that removes older logs daily (e.g. find /var/log/hadoop/ -type f -mtime +3 -name "hadoop-hadoop-*" -delete).
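
For example, a minimal sketch of switching the daemon logs to the size-bounded RollingFileAppender in log4j.properties could look like the following (the size and backup count values are illustrative, not recommendations):

    log4j.appender.RFA=org.apache.log4j.RollingFileAppender
    log4j.appender.RFA.File=${hadoop.log.dir}/${hadoop.log.file}
    # keep at most 20 files of 256MB each, then discard the oldest
    log4j.appender.RFA.MaxFileSize=256MB
    log4j.appender.RFA.MaxBackupIndex=20
    log4j.appender.RFA.layout=org.apache.log4j.PatternLayout
    log4j.appender.RFA.layout.ConversionPattern=%d{ISO8601} %p %c: %m%n

The daemons then have to be started with the root logger pointing at RFA instead of DRFA (typically via the hadoop.root.logger setting; how that is wired up differs slightly between releases).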

Job Files

There are two sets of files per job, created when the job is launched. One is the Job Configuration XML file and the other is the Job Status File. The Job Configuration XML files contain the job configuration as specified when the job is launched, i.e. the defaults merged with the user settings that override them for this job. The Job Status File is created next to the Job Configuration XML file when the job starts and is continually written to until the job finishes. It contains what is also visible after a job has finished, i.e. the counters, status, start/stop time, task attempt details, etc. The two files are stored or kept in four locations (usually as pairs unless noted otherwise):

1) JobTracker Local

When a job is initialized the configuration is saved in what is called a “local directory”:

Notes:
- Only the Job Configuration XML is stored here, not the Job Status File.

- There seems to be a "bug" in the code in that the cleanup check is not forced when the JobTracker is shut down; when it restarts it gets a new ID assigned (the current date) and subsequently only checks for matching jobs to clean up. What that means is that if you restart your JobTracker while Job Configuration XML files are still in the "local" directory, they will never be cleaned up. They are rather small and can safely be removed. Another simple cron job could be used for this purpose, executing something like the sketch below:
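
The original command is not reproduced here; a hedged sketch, assuming the files live under a jobTracker subdirectory of one of your mapred.local.dir locations and are named after the job ID (adjust path, name pattern, and age to your setup):

    find /data/mapred/local/jobTracker -type f -name "job_*.xml" -mtime +7 -delete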

2) In the Job History

Note: The value for this key (hadoop.job.history.location, see below) is treated as a URI; in other words, you can store the job files in HDFS or on the local file system (which is the default).
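
As a hedged illustration, pointing the history location at HDFS in mapred-site.xml could look like this (the path and namenode address are hypothetical):

    <property>
      <name>hadoop.job.history.location</name>
      <value>hdfs://namenode:8020/var/hadoop/jobhistory</value>
    </property>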

3) Per Job

You can print the info contained in those files using the hadoop command line script like so:
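
The hadoop script's job -history option reads these per-job history files; a hedged example, with a hypothetical job output directory:

    hadoop job -history /user/hadoop/wordcount-output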

This implies that the above command expects the path to be on HDFS; in other words, you cannot use it to display, for example, the other job files stored on local disk.

4) In Memory

In addition to the on-disk storage of these files, jobs are also retained in memory (for job restarts), but only up to a limit. This is controlled by:
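
The key involved here is the one also referenced in the Job Retirement Policy section below; for reference, a mapred-site.xml entry would look like this (100 is, to my knowledge, the usual default):

    <property>
      <name>mapred.jobtracker.completeuserjobs.maximum</name>
      <value>100</value>
    </property>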

The check whether the in-memory list has grown too large is only run when a job completes, so on an idle cluster the jobs remain in memory until the next rule possibly kicks in (see Job Retirement Policy below).

Job Retirement Policy (pre-H21 only!)

Once a job is complete it is kept in memory (up to mapred.jobtracker.completeuserjobs.maximum) and on disk as per the above. There is a configuration value that controls the overall retirement policy of completed jobs:
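
The value in question is the retirement interval; as a hedged reference, the mapred-site.xml entry would look like this (86400000 milliseconds corresponds to the one-day default mentioned below):

    <property>
      <name>mapred.jobtracker.retirejob.interval</name>
      <value>86400000</value>
    </property>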

In other words, completed jobs are retired after one day by default. The check for jobs to be retired is done by default every minute and can be controlled with:
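
Again as a hedged reference, this is the check-interval key also mentioned further below (60000 milliseconds corresponds to the one-minute default):

    <property>
      <name>mapred.jobtracker.retirejob.check</name>
      <value>60000</value>
    </property>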

The check runs continually while the JobTracker is running. If a job is retired it is simply removed from the in-memory list of the JobTracker (this also removes all tasks for the job, etc.). Jobs are not retired until at least 1 minute (hardcoded in JobTracker.java) after their finish time. The retire call also removes the JobTracker Local file (see above) for the job. All that is left are the two files per retired job in the history directory (hadoop.job.history.location) plus, if enabled, the Per Job files (hadoop.job.history.user.location).

General Job File Notes:

- When the JobTracker cannot create the job history directory at startup (in the JobTracker constructor, calling JobHistory.init()), or when creating the files for a job fails (see JobHistory.JobInfo.logSubmitted() for details; it is invoked during the initialization of a task attempt), every subsequent job stores no details on disk until the problem is fixed and the JobTracker is restarted (MAPREDUCE-1699 is meant to fix this but has not yet been applied to any Hadoop 0.20.x based release).

- There is an option to control the low-level block size used by Hadoop to write the file to disk:

Setting it lower has the following advantage (taken verbatim from mapred-default.xml):

"Since the job recovery uses job history, its important to dump job history to disk as soon as possible."
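
The key the above refers to is not named in this copy of the post; I believe it is mapred.jobtracker.job.history.block.size with a 3 MB default, but treat both the name and the value below as an assumption to verify against your release:

    <property>
      <name>mapred.jobtracker.job.history.block.size</name>
      <value>3145728</value>
    </property>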

- The Job History page (jobhistory.jsp) in the JobTracker UI always reads the stored job history from hadoop.job.history.location (see above) and not from memory (that is what the default JobTracker page does).

- Finally, both of the above, the Job Configuration XML (with the exception of what is noted under "JobTracker Local") and the Job Status File, eventually get deleted after 30 days (see the H21-related notes below). This cannot be changed (it is hardcoded in JobHistory.HistoryCleaner) and the check for it runs once per day. Also note that it deletes any file older than the threshold unconditionally, i.e. do not leave files there that you need or they will be deleted.

CDH3b1, CDH3b2, CDH3b3, Hadoop 0.21.0 (H21) and up:

There is now an additional directory into which completed jobs are moved (MAPREDUCE-814). Instead of leaving all their files directly in hadoop.job.history.location, all files of completed jobs are moved to this new location. This way you can distinguish between running and completed jobs when looking at the history directory. Here is how the path for the completed jobs is specified:
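
The key is the one also referenced further below; as a hedged example in mapred-site.xml (the path is hypothetical, and in H21 the key is spelled mapreduce.jobtracker.jobhistory.completed.location):

    <property>
      <name>mapred.job.tracker.history.completed.location</name>
      <value>hdfs://namenode:8020/var/hadoop/jobhistory/done</value>
    </property>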

CDH3b3, Hadoop 0.21.0 (H21) and up:

These releases also add the notion of a "retired jobs cache" (MAPREDUCE-817). This is kept and handled solely in memory and is controlled with:
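
The setting referred to is the cache size mentioned again further below (it defaults to 1000 entries); the exact key name is an assumption here, but in the 0.20-based releases it should look roughly like this:

    <property>
      <name>mapred.job.tracker.retiredjobs.cache.size</name>
      <value>1000</value>
    </property>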

As far as physical log files are concerned this is negligible.

There is a hardcoded maximum of 100 jobs displayed on the main JobTracker UI page (jobtracker.jsp). Those are pulled from the above internal memory cache. If users want to see all completed jobs, they have to use the Job History page (jobhistory.jsp), which loads the job details from the file system (see above under General Job File Notes). Note, though, that for CDH3b1 and up this is loaded from mapred.job.tracker.history.completed.location instead.

Hadoop 0.21.0 (H21):

In Hadoop 0.21.0 the whole cleanup code has been completely refactored. For starters, the hardcoded 30-day cleanup of the job history mentioned above is now configurable and has been reduced to one week (in JobHistory.HistoryCleaner):
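
The key is the one named in the next paragraph; as a hedged reference, 604800000 milliseconds corresponds to the one-week default:

    <property>
      <name>mapreduce.jobtracker.jobhistory.maxage</name>
      <value>604800000</value>
    </property>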

In addition, the frequency of the check that removes the job files has been made variable (but not user configurable): it is either one day or less, the latter being the case when mapreduce.jobtracker.jobhistory.maxage is set to something less than a day.

With Hadoop 0.21.0 you may also find files with the extension ".old" in the mapreduce.jobtracker.jobhistory.completed.location directory. Those are leftover job files from a previous instance of the JobTracker. When a JobTracker is restarted, it moves and renames the files it finds in mapreduce.jobtracker.jobhistory.location into the mapreduce.jobtracker.jobhistory.completed.location directory. Note that when jobs are removed after the above mapreduce.jobtracker.jobhistory.maxage, these files will also be removed (in fact, as noted above, any file in that directory older than the maximum age will be deleted). Also note that the "Job History" UI page does include those ".old" job details.

A big change is the JobTracker.RetireJobs class, which before Hadoop 0.21.0 was a Thread instance that ran according to mapred.jobtracker.retirejob.interval and mapred.jobtracker.retirejob.check. This has been changed to a simple method that is called at the end of each job and runs synchronously on the main thread. That also means that completed jobs are not removed from memory after the usual 24 hours but stay there based solely on the size of the new retiredjobs.cache.size (defaults to 1000, see above). Older jobs are removed from that cache when a newly completed job adds itself.

Finally, the various keys have been renamed to match each other (see "H21" throughout this document).

Job Status Store

There is an additional place to store job details, which always assumes HDFS to be the storage. It is called the Job Status Store (HADOOP-1876) and is used by JobClients to query details about jobs. By default it is disabled; for persistent job status storage in HDFS, the following options need to be adjusted:
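
The option names are not reproduced in this copy of the post; to my knowledge the 0.20-era keys are the mapred.job.tracker.persist.jobstatus.* family, so a hedged mapred-site.xml sketch (the directory and retention values are illustrative) would be:

    <property>
      <name>mapred.job.tracker.persist.jobstatus.active</name>
      <value>true</value>
    </property>
    <property>
      <name>mapred.job.tracker.persist.jobstatus.hours</name>
      <value>24</value>
    </property>
    <property>
      <name>mapred.job.tracker.persist.jobstatus.dir</name>
      <value>/jobtracker/jobsInfo</value>
    </property>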

The status files are stored directly under the above job status directory and are created when the job finishes. Here are their location, file name pattern, and retention:

The check for the above jobstatus.hours is done every hour (hardcoded in CompletedJobStatusStore.java) and runs as long as the JobTracker is running.

Task Attempt Logs

These include the Standard Out and Standard Error logs as well as custom user logs created by your code using the logging framework within your MapReduce jobs. The location is hardcoded to be ${hadoop.log.dir}/userlogs.

The following controls how large the logs can become. Please note that it has to maintain a linked list in memory until the end of the job in order to only write the last "limit / event_size" (event_size is set to 100) events, so this needs to be accounted for in terms of per-task memory consumption. If you leave it at the default "0", it is disabled and logs go straight to disk.
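
The key is not named in this copy of the post; I believe the setting meant here is mapred.userlog.limit.kb, shown below as an assumption (the value would cap each task log at roughly 1 MB):

    <property>
      <name>mapred.userlog.limit.kb</name>
      <value>1024</value>
    </property>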

By default the userlogs are deleted every day. The cleanup is triggered by the TaskTracker when a child process is launched (in other words, when the cluster is idle, nothing is cleaned up). It can be changed with:
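
This is the retain-hours key mentioned again further below; for reference, in mapred-site.xml (24 hours being the default that matches the daily cleanup):

    <property>
      <name>mapred.userlog.retain.hours</name>
      <value>24</value>
    </property>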

A note about the log4j.properties. You will find these lines:
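
(The excerpt below is what a stock Hadoop 0.20-era log4j.properties ships with; your copy may differ slightly.)

    # TaskLog Appender
    # Default values
    hadoop.tasklog.taskid=null
    hadoop.tasklog.noKeepSplits=4
    hadoop.tasklog.totalLogFileSize=100
    hadoop.tasklog.purgeLogSplits=true
    hadoop.tasklog.logsRetainHours=12

    log4j.appender.TLA=org.apache.hadoop.mapred.TaskLogAppender
    log4j.appender.TLA.taskId=${hadoop.tasklog.taskid}
    log4j.appender.TLA.totalLogFileSize=${hadoop.tasklog.totalLogFileSize}
    log4j.appender.TLA.layout=org.apache.log4j.PatternLayout
    log4j.appender.TLA.layout.ConversionPattern=%d{ISO8601} %p %c: %m%n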

The first few lines define default values that can be overridden by system properties (refer to "-D..." parameters). The only lines really handled by the code (see TaskRunner.java) are:
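
Assuming the stock excerpt above, those are:

    hadoop.tasklog.taskid=null
    hadoop.tasklog.totalLogFileSize=100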

The former is always set in the TaskRunner.java code to the current task ID and the latter to whatever is defined in the mapred-site.xml (or specified at the JobTracker startup on the command line using “-D...“).

In other words, the above totalLogFileSize=100 is never used. The other lines are obsolete; they are either controlled by the configuration keys above (mapred.userlog.retain.hours) or not used at all (and they should be removed).
