Untangling Apache Hadoop YARN, Part 3: Scheduler Concepts

Categories: YARN

In Parts 1 and 2, we covered the basics of YARN resource allocation. In this installment, we’ll provide an overview of cluster scheduling and introduce the Fair Scheduler, one of the scheduler choices available in YARN.

A standalone computer can have several CPU cores, each running a single process, but there can be as many as a few hundred processes running simultaneously. The scheduler is a part of the desktop’s operating system that assigns a process to a CPU core to run for a short period of time.

As described previously in Part 1: YARN and Cluster Basics, an application consists of multiple tasks (usually on different hosts) on the cluster. A cluster scheduler essentially has to address:

  • Multi-tenancy: On a cluster, many users launch many different applications, on behalf of multiple organizations. A cluster scheduler allows varying workloads to run simultaneously.
  • Scalability: A cluster scheduler needs to scale to large clusters running many applications. This means that increasing the size of the cluster should improve overall performance without negatively affecting system latencies.

Scheduling in YARN

The ResourceManager (RM) tracks resources on a cluster, and assigns them to applications that need them. The scheduler is that part of the RM that does this matching honoring organizational policies on sharing resources. Please note that:

  • YARN uses queues to share resources among multiple tenants. This will be covered in more detail in “Introducing Queues” below.
  • The ApplicationMaster (AM) tracks each task’s resource requirements and coordinates container requests. This approach allows better scaling since the RM/scheduler doesn’t need to track all containers running on the cluster.

Fair Scheduler

The Fair Scheduler is a popular choice (recommended by Cloudera) among the schedulers YARN supports. In its simplest form, it shares resources fairly among all jobs running on the cluster. The next few sections explain scheduler internals in the context of Fair Scheduler and elaborate on the commonly used controls the scheduler offers.

Introducing Queues

Queues are the organizing structure for YARN schedulers, allowing multiple tenants to share the cluster. As applications are submitted to YARN, they are assigned to a queue by the scheduler. The root queue is the parent of all queues. All other queues are each a child of the root queue or another queue (also called hierarchical queues). It is common for queues to correspond to users, departments, or priorities.

Figure 1 presents a basic Fair Scheduler example of fair-scheduler.xml and a graphical representation of each queue’s share of the cluster.

  • There are four departmental queues (marketing, sales, datascience, and admin).
  • There is a special admin queue that that users fred or greg, who are assumed to be administrators, can use.

untangling-yarn-3-f1

Figure 1: Example part of fair-scheduler.xml and corresponding fair shares of each queue

Using Hierarchical Queues

One level of queues allows sharing the cluster along one dimension (e.g. one queue per team), but it is common to share clusters among multiple dimensions (e.g. per-team and priority). Fair Scheduler allows nesting queues to form a hierarchical queue structure, where each level could correspond to a dimension under its parent queue.

For example, Figure 2 below shows resources shared along team and priority dimensions. The first level corresponds to the team (e.g. root.datascience), and each first-level queue could have a high- and low-priority child queues for jobs from that specific team (e.g. root.datascience.short_jobs and root.datascience.best_effort_jobs).

Queue Naming

In the rest of this post and subsequent parts, we refer to queue names in a different font like root or root.sales.northamerica. Where unambiguous, we use the shortest version possible, i.e. northamerica instead of root.sales.northamerica.

Example #1: Weights in a single queue

In the sales queue, there are two child queues: northamerica and europe. Each has a weight of 30.0, so the fair share for each of the child queues within sales is effectively 50%.

In the marketing queue, there are two child queues of nonequal weight: reports and website. This means that jobs in the reports queue are allocated twice as many resources as jobs in the website queue. However, together, their weight is still governed by the marketing queue’s weight.

Queue Weights and Top-Down Scheduling

Queue weights are used to determine the fair share for a queue. The Fair Scheduler starts from the root queue and looks at the weights of all immediate child queues to determine their fair share. Each child queue’s fair share is further evaluated for its set of child queues.

Example #2: Top-down view of weights

The marketing queue has a weight of 3.0, the sales queue has a weight of 4.0, and the datascience queue has a weight of 13.0. So, the allocation from the root will be 15% to marketing, 20% to sales, and 65% to datascience.

Of the share that goes to datascience, all of the queue’s allocation goes to the short_jobs queue. If there are no jobs assigned to the short_jobs queue, then the jobs in the best_effort_jobs queue are allocated resources.

untangling-yarn-3-f2

Figure 2: Example part of fair-scheduler.xml and corresponding fair shares of hierarchical queues

Extra Reading

To get more information about Fair Scheduler, take a look at the online documentation.

Conclusion

  • The scheduler is a part of a computer operating system that allocates resources to active processes as needed.
  • A cluster scheduler allocates resources to an application running on the cluster. The cluster scheduler is designed for multi-tenancy and scalability.
  • YARN allows you to choose from a set of schedulers. Fair Scheduler is widely used. In its simplest form, it shares resources fairly among all jobs running on the cluster.
  • Fair Scheduler assigns applications to queues. You can set properties on queues to adjust scheduling behavior, based on application and departmental needs. Queue weights are one way to control the fair share for applications in the queue. Shares are allocated top-down from the root, evaluated one level at a time.

Next Time…

Part 4 will cover Fair Scheduler queue properties in greater detail. You can configure these properties to fully customize Fair Scheduler for your needs.

Ray Chiang is a Software Engineer at Cloudera.

Dennis Dawson is a Senior Technical Writer at Cloudera.

facebooktwittergoogle_pluslinkedinmailfacebooktwittergoogle_pluslinkedinmail

9 responses on “Untangling Apache Hadoop YARN, Part 3: Scheduler Concepts

  1. Benjamin Ruland

    Well-written post, thanks!

    I think you got a small error in Figure 2: Shouldn’t the descendant of the Datascience queue be the short_jobs queue instead of the best_efforts_jobs queue, as it has the complete weight?

  2. Subhasish

    Hi , If my application needed a 10 % of resources of a cluster and its the only job running in the cluster.
    How much percent of resources will be given to it , if scheduler is fair scheduler?

    1. Ray Chiang

      The answer is your application will get 10% of the cluster. The “fair share” calculation mainly comes into play if the total resources needed exceeds the current capacity of the cluster.