Couchbase Analytics is optimized for ad-hoc analytical queries, which typically process more data than can fit in memory. It employs a massively parallel processing (MPP) engine that attempts to fully utilize the processing power available on every node running the Analytics service in a Couchbase cluster. At the same time, the engine ensures that it operates within the memory budget allocated to the Analytics service on each node. To achieve that, the Analytics service has a query admission control feature that decides which queries can be executed concurrently. In this article, we explain the basics of how the Analytics query admission control mechanism works and give some examples.

The Analytics query admission controller decides whether a newly received query can be executed immediately, needs to be queued, or should be rejected. These decisions are based on 1) the resources available to the Analytics service and 2) the resources required by any given query. How each is calculated is explained next.

Analytics Available Resources:

Analytics maintains a resource pool of the following:

Total Memory Available for Query Processing:
By default, Analytics dedicates 50% of the memory allocated to the Analytics service on each node to query processing. The other 50% is used for other areas, such as its storage buffer cache and its ingestion pipeline. Adding up this 50% share from each node gives us the total memory available for query processing in Analytics.
For example, if a cluster has three nodes running the Analytics service, each with 16GB of memory, then:
Total memory available for query processing = (16 + 16 + 16) * 0.5 = 24 GB
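
To make this arithmetic concrete, here is the same calculation as a minimal Python sketch (the node sizes are the ones from this example, and the 0.5 factor is the default 50% share described above):

node_analytics_memory_gb = [16, 16, 16]   # memory allocated to the Analytics service on each node
QUERY_PROCESSING_SHARE = 0.5              # default: 50% of Analytics memory goes to query processing

total_query_memory_gb = sum(node_analytics_memory_gb) * QUERY_PROCESSING_SHARE
print(total_query_memory_gb)              # (16 + 16 + 16) * 0.5 = 24.0 GB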

Total Query Workers:
Analytics determines the number of cores on each Analytics node from the core count reported by the operating system. Adding up the number of cores on all nodes and then multiplying by the coresMultiplier factor gives us the total number of query workers available for query processing. Since ad-hoc analytical queries tend to be IO bound, a CPU core can participate in processing other concurrent queries while one query is waiting for IO. The coresMultiplier, a configurable parameter of the Analytics service with a default value of 3, lets you specify that level of concurrency per core, ensuring that each core has enough work to stay busy.
For example, if a cluster has three nodes with the Analytics service, each has 8 cores, and coresMultiplier is set to 3, then:
Total query workers available = (8 + 8 + 8) * 3 = 72 workers
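
The corresponding sketch for query workers, using the core counts from this example and the default coresMultiplier of 3:

node_cores = [8, 8, 8]      # cores reported by the operating system on each Analytics node
cores_multiplier = 3        # configurable concurrency factor per core (default 3)

total_query_workers = sum(node_cores) * cores_multiplier
print(total_query_workers)  # (8 + 8 + 8) * 3 = 72 workers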

Query Required Resources:

When a new query is received by the Analytics query processor, the query compiler determines the resources required to process the query in terms of memory and query workers:

Required Memory:
The memory required for any given query differs based on the nature of the query. For example, a query that sorts its result will typically need more memory than a simple query that only counts the number of documents in a collection or retrieves documents in a small range of values for some field. The Analytics query compiler examines all the operations involved in a query and determines the maximum memory the query will require.

Required Query Workers:
A typical Analytics deployment has one data partition (shard) per core on each node. Since Analytics attempts to process the data in all data partitions of the cluster in parallel, the number of query workers required by any query that accesses data equals the number of data partitions.

For example, a query that attempts to count all the documents in a collection in a cluster that has 24 data partitions will require the use of 24 query workers.
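
Putting the two requirements together, a query's demand can be thought of as a (memory, workers) pair. The sketch below is only an illustration of that idea, not the actual compiler logic; the memory figures are made-up placeholders, while the worker count follows the one-partition-per-core rule described above.

from dataclasses import dataclass

@dataclass
class QueryRequirement:
    memory_gb: float  # maximum memory the compiler budgets for the query's operators
    workers: int      # one worker per data partition for queries that access data

# One data partition per core: 3 Analytics nodes * 8 cores = 24 partitions.
data_partitions = 3 * 8

# A simple count over a collection: a tiny memory budget, but it still scans every partition.
count_query = QueryRequirement(memory_gb=0.1, workers=data_partitions)

# A query that sorts a large result: the compiler assigns a much larger memory budget.
sort_query = QueryRequirement(memory_gb=2.0, workers=data_partitions)

print(count_query, sort_query)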

Examples:

In the examples below, we assume a Couchbase cluster that has 3 nodes running the Analytics service, each with 8 cores and 16 GB of memory available for query processing (i.e., 32 GB allocated to the Analytics service on each node, given the 50% rule described above).

Example 1:
coresMultiplier = 3
Total memory available = 48 GB
Total query workers available = (8 + 8 + 8) * 3 = 72 workers
Number of received concurrent queries = 5 (each requiring 2 GB of memory and 24 workers)

Analytics will execute 3 queries concurrently and will queue queries 4 and 5 since the cluster only has 72 query workers available and each query requires 24 workers. Once one of the first 3 queries completes, one of the queued queries will be executed. (Note: It wouldn’t help performance to immediately admit rather than queue queries 4 and 5, as the system has enough work to stay busy with the first three. Trying to do more all at once wouldn’t speed things up – and in fact, could lead to counter-productive resource contention.)
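
The decision itself can be modeled in a few lines of Python. This is a simplified sketch of the behavior described in this article, not Couchbase source code, and the function name and bookkeeping variables are invented for illustration. Applied to Example 1, it reproduces the outcome above:

def admission_decision(req_memory_gb, req_workers,
                       total_memory_gb, total_workers,
                       used_memory_gb, used_workers):
    """Return 'REJECT', 'QUEUE', or 'EXECUTE' for a newly received query."""
    # A query that needs more than the cluster can ever provide is rejected outright.
    if req_memory_gb > total_memory_gb or req_workers > total_workers:
        return "REJECT"
    # If the running queries leave enough memory and workers, the query executes immediately...
    if (used_memory_gb + req_memory_gb <= total_memory_gb
            and used_workers + req_workers <= total_workers):
        return "EXECUTE"
    # ...otherwise it is queued until a running query completes and frees its resources.
    return "QUEUE"

# Example 1: 48 GB and 72 workers available; five queries each needing 2 GB and 24 workers.
total_memory_gb, total_workers = 48, 72
used_memory_gb, used_workers = 0, 0
for i in range(1, 6):
    decision = admission_decision(2, 24, total_memory_gb, total_workers,
                                  used_memory_gb, used_workers)
    print(f"query {i}: {decision}")
    if decision == "EXECUTE":
        used_memory_gb += 2
        used_workers += 24
# Prints EXECUTE for queries 1-3 (3 * 24 = 72 workers) and QUEUE for queries 4 and 5.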

Example 2:
coresMultiplier = 3
Total memory available = 48 GB
Total query workers available = (8 + 8 + 8) * 3 = 72 workers
Number of received concurrent queries = 3 (each requiring 20 GB of memory and 24 workers)

Analytics will execute two queries concurrently and queue the third query, since there isn't enough memory to execute all three at once. (Note: Trying to do more by cutting the memory allocated to each query would likely lead to more data spilling, higher IO cost, and thus reduced performance.)

Example 3:
coresMultiplier = 3
Total memory available = 48 GB
Total query workers available = (8 + 8 + 8) * 3 = 72 workers
Number of received concurrent queries = 1 (requiring 50 GB of memory and 24 workers)

Analytics will reject the query since it doesn’t have enough memory to execute it. Note that adding an additional Analytics node to the cluster will increase the available resources and might make executing such a query possible. (Note: One can also use a query hint to direct a query to allocate less memory to the query’s operators, allowing the query to execute albeit with a higher IO cost.)
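
The checks behind Examples 2 and 3 can be spelled out the same way; the short sketch below simply restates the numbers from this article:

total_memory_gb, total_workers = 48, 72

# Example 2: three queries, each needing 20 GB and 24 workers.
print(20 + 20 <= total_memory_gb)        # True  -> the first two queries execute (40 GB)
print(20 + 20 + 20 <= total_memory_gb)   # False -> the third query is queued (60 GB > 48 GB)
print(3 * 24 <= total_workers)           # True  -> workers are not the bottleneck here

# Example 3: a single query needing 50 GB and 24 workers.
print(50 <= total_memory_gb)             # False -> more than the cluster can provide, so reject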

Example 4:
coresMultiplier = 5
Total memory available = 48 GB
Total query workers available = (8 + 8 + 8) * 5 = 120 workers
Number of received concurrent queries = 5 (each requiring 2 GB of memory and 24 workers)

Analytics will execute all 5 queries concurrently since we have enough memory and query workers. It is important to note that while increasing the value of coresMultiplier will allow more queries to be executed concurrently, it might result in a lower throughput overall if the underlying storage cannot handle the concurrent IO requests or if the CPU cores start thrashing, as mentioned above. Therefore, a careful tuning exercise may need to be done based on the available resources and the workload’s nature when adjusting the coresMultiplier parameter.
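
For completeness, the arithmetic behind this example, again using only the numbers from this article (note that raising coresMultiplier adds workers but not memory):

# Example 4: coresMultiplier raised from 3 to 5 on the same 3-node, 8-core cluster.
total_workers = (8 + 8 + 8) * 5   # 120 workers
total_memory_gb = 48              # unchanged: coresMultiplier does not add memory

required_workers = 5 * 24         # five queries * 24 workers = 120
required_memory_gb = 5 * 2        # five queries * 2 GB = 10 GB

# Both totals fit within capacity, so all five queries are admitted immediately.
print(required_workers <= total_workers and required_memory_gb <= total_memory_gb)  # True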

Conclusion:

In this article, we have explained how the Analytics service’s query admission control logic works and the factors that it uses to make query admission decisions. We also explained how the coresMultiplier parameter can be used to allow more queries to be executed concurrently and the consequences of changing its value.

Posted by Madhuram Gupta
