Apache Impala (incubating) includes several features that allow you to restrict or allocate resources so as to maximize stability and performance for your Impala workloads. You can limit both CPU and memory resources used by Impala to manage and prioritize jobs on CDH clusters. This blog post describes the techniques a typical Impala deployment can use to manage its resources.
Static Service Pools
Static service pools isolate services from one another, so that a high load on one service has limited impact on other services. You can use static service pools to allocate a static percentage of dedicated resources — CPU, memory, and I/O weight — for Impala to allow for predictable resource availability. These resources will not be shared with other services on the cluster.
Static service pools are not enabled by default, so you need to configure them in Cloudera Manager. The following screenshot shows a sample configuration for static service pools in Cloudera Manager:
Recommended split of resources for each service:
- HDFS always needs to have a minimum of 5-10% of the resources.
- Generally, YARN and Impala split the rest of the resources.
- For mostly batch workloads, you might allocate YARN 60%, Impala 30%, and HDFS 10%.
- For mostly ad hoc query workloads, you might allocate Impala 60%, YARN 30%, and HDFS 10%.
Admission Control
Admission control is an Impala feature that allows you to enforce restrictions over the queries running on your cluster, and the resources being used by them. The goal behind admission control is to make sure Impala’s resources are allocated to different users or workloads on a cluster in a way that results in high throughput and stability.
In order to successfully implement admission control, a cluster administrator must know details about the different kinds of Impala workloads running on the cluster, the amount of memory per node the different queries might need, and the optimal number of queries that can run concurrently while still resulting in high throughput. You can use admission control to avoid out-of-memory conditions on busy clusters by limiting the amount of memory available to a query on each cluster node. You can also limit resource usage through admission control by setting an upper limit on the number of queries allowed to run concurrently.
New queries will be accepted and executed until certain conditions are met, such as too many queries or too much total memory used across the cluster. Any additional queries are queued until the earlier ones finish, rather than being cancelled or running slowly and causing contention. As other queries finish, the queued queries are allowed to proceed.
In addition to configuring threshold values for currently executing queries, you can also place limits on the maximum number of queries that are queued (waiting) and a limit on the amount of time they might wait before returning with an error.
You should use admission control to restrict the number of concurrently running or queued queries even when you don’t have extensive information about the memory usage for your workload. Once you have a stable workload with well-understood memory requirements, you should also restrict the maximum memory available to each query. The settings for limiting the number of concurrently running or queued queries can be specified separately for each dynamic resource pool.
On a cluster running Impala alongside other data management components, you can define static service pools to split the available resources between Impala and other components. Then, within the share allocated to Impala, you can use Impala’s admission control feature to further subdivide Impala’s resources by creating dynamic resource pools.
Refer to the Impala documentation for more details on implementing admission control in Impala.
Per-query Memory Limits
Impala allows you to set per-query memory limits to prevent queries from consuming excessive memory resources that impact other queries. Using admission control to limit the number of concurrent queries, together with per-query memory limits, should keep your Impala workloads stable and prevent them from running out of resources. If either the maximum number of concurrent queries or the maximum memory usage of the admitted queries is reached, subsequent queries are queued until the running queries complete and resources become available.
Cloudera recommends that you set memory limits for queries whenever possible. Default memory limits can be set on all queries in a pool (see below), and explicitly specifying the query memory limit will override the default. If you are using admission control pools with restrictions on the maximum memory available to queries in a pool, setting default per-query memory limits is required.
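As a rough illustration, a per-query memory limit and a target pool can be set as query options before a statement runs. The sketch below only builds the `SET` statements as text, for example to paste into impala-shell. `MEM_LIMIT` and `REQUEST_POOL` are Impala query options; the helper name and the pool name `example_pool` are hypothetical.

```python
def session_options(mem_limit, request_pool=None):
    """Build SET statements for a session (hypothetical helper).

    mem_limit: per-node memory limit for each query, e.g. "4g".
    request_pool: optional name of the admission control pool to target.
    """
    statements = [f"SET MEM_LIMIT={mem_limit};"]
    if request_pool is not None:
        statements.append(f"SET REQUEST_POOL={request_pool};")
    return statements


# Statements to issue before the query itself.
for line in session_options("4g", request_pool="example_pool"):
    print(line)
```

An explicit `MEM_LIMIT` set this way takes the place of the pool's default limit for queries in that session.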
See the Impala documentation for more details on Memory Limits and Admission Control.
Dynamic Resource Pools
Note: Impala Dynamic Resource Pools are different from the default YARN Dynamic Resource Pools. You can turn on Dynamic Resource Pools that are exclusively for use by Impala. In CDH 5.8 / Impala 2.6 and higher, admission control and dynamic resource pools are enabled by default.
Dynamic resource pools are a set of policies for scheduling and allocating Impala’s resources among the queries running in a pool. Dynamic resource pools should be used in conjunction with static service pools to make sure that resources being shared among the Impala pools are subject to the restrictions configured for the Impala service on the cluster.
Resource allocation is based on a user’s access to specific pools and the resources available to those pools. You can use access control lists (ACLs) to restrict which users and groups can submit work to dynamic resource pools and administer them. If a pool’s allocation is not in use, it can be preempted and distributed to other pools.
Placement rules determine how queries are mapped to resource pools. By default, a query runs in a named pool only when one is explicitly specified; otherwise, the default pool is used. If multiple applications submit queries to different Impala daemons on the cluster, each of those daemons serves as the coordinator for the queries it receives. Note that admission control currently offers only soft limits when multiple coordinators are in use: if several coordinators concurrently submit queries targeting the same pool, too many queries may be admitted into the pool. To ensure that the configured resource limits are strictly observed, use only a single coordinator.
As shown in the image, you can configure the following resource limits for a pool:
- Max Running Queries: Maximum number of concurrently executing queries in a pool before incoming queries are queued.
- Max Memory Resources: Maximum memory used by queries in a pool before incoming queries are queued. This value is used at the time of admission and is not enforced at query runtime.
- Default Query Memory Limit: Defines the maximum amount of memory a query can allocate on each node. If a query attempts to use more memory than the allocated amount, it will spill to disk, if possible. Otherwise, the query will be cancelled. This value is enforced at runtime.
- Max Queued Queries: Maximum number of queries that can be queued in a pool before additional incoming queries are rejected.
- Queue Timeout: Specifies how long queries can wait in the queue before they are cancelled with a timeout error.
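To make the interaction between these settings concrete, here is a toy model of the admit, queue, or reject decision. It is only a sketch under simplifying assumptions, not Impala's actual implementation: every query is assumed to reserve the Default Query Memory Limit on every node, the queue timeout and dequeuing are left out, and the class and field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Pool:
    """Simplified pool limits, named after the settings listed above."""
    max_running_queries: int
    max_memory_gib: float               # aggregate, checked only at admission
    default_query_mem_limit_gib: float  # per node
    max_queued_queries: int
    nodes: int
    running: int = 0
    queued: int = 0

    def decide(self):
        """Return 'admit', 'queue', or 'reject' for one incoming query."""
        # Memory one query is assumed to reserve across the cluster.
        needed = self.default_query_mem_limit_gib * self.nodes
        reserved = self.running * needed
        fits = (self.running < self.max_running_queries
                and reserved + needed <= self.max_memory_gib)
        # Admit only if nothing is already waiting ahead of this query.
        if fits and self.queued == 0:
            self.running += 1
            return "admit"
        if self.queued < self.max_queued_queries:
            self.queued += 1
            return "queue"
        return "reject"


# Illustrative numbers: 2 running queries allowed, 1 queue slot.
pool = Pool(max_running_queries=2, max_memory_gib=1280,
            default_query_mem_limit_gib=32, max_queued_queries=1, nodes=20)
decisions = [pool.decide() for _ in range(4)]
print(decisions)
```

With these numbers the first two queries are admitted, the third is queued, and the fourth is rejected because the queue is full.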
For instructions on creating dynamic resource pools and setting access control lists (ACLs) to restrict access to the pools, see the Impala documentation on dynamic resource pools.
Sample Use-case for Impala Resource Management
Here’s an example with a user, Anne, who is the administrator for an enterprise data hub that runs a number of workloads, including Impala.
Anne has a 20-node cluster that uses Cloudera Manager static partitioning. Because of the heavy Impala workload, Anne needs to make sure Impala gets enough resources. While the best configuration values might not be known in advance, she decides to start by allocating 50% of resources to Impala. Each node has 128 GiB dedicated to its impalad, so Impala has 2560 GiB in aggregate (128 GiB * 20 nodes) that can be shared across the resource pools she creates.
Next, Anne studies the workload in more detail. After some research, she might choose to revisit these initial values for static partitioning.
To figure out how to further allocate Impala’s resources, Anne needs to consider the workloads and users, and determine their requirements. There are a few main sources of Impala queries:
- Large reporting queries executed by an external process/tool. These are critical business intelligence queries that are important for business decisions. It is important that they get the resources they need to run. There typically are not many of these queries at a given time.
- Frequent, small queries generated by a web UI. These queries scan a limited amount of data and do not require expensive joins or aggregations. These queries are important, but not as critical: if one fails, the client can resend the query or the end user can refresh the page.
- Occasionally, expert users might run ad-hoc queries. The queries can vary significantly in their resource requirements. While Anne wants a good experience for these users, it is hard to control what they do (for example, submitting inefficient or incorrect queries by mistake). Anne restricts these queries by default and tells users to reach out to her if they need more resources.
To set up admission control for this workload, Anne first runs the workloads independently, so that she can observe each workload’s resource usage in Cloudera Manager. If the workloads cannot easily be run manually but have been run in the past, Anne uses the query history in Cloudera Manager instead. It can be helpful to use other search criteria (for example, user) to isolate queries by workload. Anne uses the Cloudera Manager chart for Per-Node Peak Memory usage to identify the maximum memory requirements for the queries.
From this data, Anne observes the following about the queries in the groups above:
- Large reporting queries use up to 32 GiB per node. There are typically 1 or 2 queries running at a time. On one occasion, she observed that 3 of these queries were running concurrently. Queries can take 3 minutes to complete.
- Web UI-generated queries usually use between 100 MiB and 4 GiB of memory per node, but occasionally as much as 10 GiB per node. Queries take, on average, 5 seconds, and there can be as many as 140 incoming queries per minute.
- Anne has little data on ad hoc queries, but some are trivial (approximately 100 MiB per node), others join several tables (requiring a few GiB per node), and one user submitted a huge cross join of all tables that used all system resources (that was likely a mistake).
Based on these observations, Anne creates the admission control configuration with the following pools:
The XL_Reporting pool is for large reporting queries. To support running 2 large queries at a time, the pool’s Max Memory Resources are set to 1280 GiB (aggregate cluster memory): 2 queries, each with 32 GiB per node, across 20 nodes. Anne sets the pool’s Default Query Memory Limit to 32 GiB so that no query uses more than 32 GiB on any given node. She sets Max Running Queries to 2 (though this is not strictly necessary, since the memory settings already limit the pool to 2 concurrent queries). She increases the pool’s queue timeout to 5 minutes in case a third query comes in and has to wait. She does not expect more than 3 concurrent queries, and she does not want them to wait longer than that anyway, so she does not increase the queue timeout any further. If the workload increases in the future, she might choose to adjust the configuration or buy more hardware.
The High_Throughput_UI pool is used for the small, high-throughput queries generated by the web tool. Anne sets the Default Query Memory Limit to 4 GiB per node, and sets Max Running Queries to 12. This implies a maximum amount of memory per node used by the queries in this pool: 48 GiB per node (12 queries * 4 GiB per node memory limit). If queries in this pool execute on the 20-node cluster, the total amount of memory available to each query, cluster-wide, would be 80 GiB (4 GiB per node memory limit * 20 nodes).
Notice that Anne does not set the pool memory resources, but does set the pool’s Default Query Memory Limit. This is intentional: admission control processes queries faster when a pool uses the Max Running Queries limit instead of the peak memory resources.
This should be enough memory for most queries, since only a few go over 4 GiB per node. Those that do need more can probably still complete with less memory (spilling to disk if necessary). If, on occasion, a query cannot run within this limit and fails, Anne might reconsider the configuration later, or she might decide that a few rare failures from this web UI are acceptable.
With regard to throughput, since these queries take around 5 seconds and she is allowing 12 concurrent queries, the pool should be able to handle approximately 144 queries per minute, which is enough for the peak maximum expected of 140 queries per minute. In case there is a large burst of queries, Anne wants them to queue. The default maximum size of the queue is already 200, which should be more than large enough. Anne does not need to change it.
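The throughput estimate above is simple arithmetic, sketched here for reference. It assumes every one of the 12 slots is always busy and every query takes exactly the 5-second average; the function name is hypothetical.

```python
def queries_per_minute(max_running_queries, avg_query_seconds):
    """Steady-state throughput if every slot is always busy."""
    return max_running_queries * 60 / avg_query_seconds


capacity = queries_per_minute(12, 5)
print(capacity)         # 144.0
print(capacity >= 140)  # True: covers the observed peak of 140 per minute
```

The margin is small, so a longer average query time would push the pool below the observed peak and cause queries to queue.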
The default pool (which already exists) is a catch-all for ad hoc queries. Anne wants it to use the memory not claimed by the first two pools: 16 GiB per node (of the 128 GiB per node, XL_Reporting uses 64 GiB and High_Throughput_UI uses 48 GiB). For the other pools to get the resources they expect, she must still set the Max Memory Resources and the Default Query Memory Limit for this pool. She sets Max Memory Resources to 320 GiB (16 GiB * 20 nodes). She sets the Default Query Memory Limit to 4 GiB per node for now.
The per-query memory limit is somewhat arbitrary, but it satisfies some of the ad hoc queries she has observed. If someone writes a bad query by mistake, Anne does not want it using all the system resources. An expert user with a large query to submit can override the Default Query Memory Limit (up to 16 GiB per node, since that is bounded by the pool’s Max Memory Resources). If that is still insufficient for this user’s workload, the user should work with Anne to adjust the settings and perhaps create a dedicated pool for the workload.
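The memory budget across the three pools can be checked with a few lines of arithmetic. This is only a restatement of the numbers in the example; the variable names are made up for illustration.

```python
NODES = 20
IMPALA_MEM_PER_NODE_GIB = 128

xl_reporting = 2 * 32        # 2 concurrent queries * 32 GiB per node
high_throughput_ui = 12 * 4  # 12 concurrent queries * 4 GiB per node
default_pool = IMPALA_MEM_PER_NODE_GIB - xl_reporting - high_throughput_ui

print(default_pool)          # 16 GiB per node left for the default pool
print(default_pool * NODES)  # 320 GiB Max Memory Resources, default pool
print(xl_reporting * NODES)  # 1280 GiB Max Memory Resources, XL_Reporting
```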
Resource Management Overview: http://www.cloudera.com/documentation/enterprise/latest/topics/admin_rm.html