Commit 33792b6

DOC-723 | Query plan caching (#641)
* WIP: Query plan caching
* Add cross-references, complete release notes API descriptions
* Change a few code block languages in README
* Finalize plan cache docs
* Add query options to HTTP and JS API reference
* Review, examples, adjust to changed behavior
* Move (results) cache option to query sub options
* Apply to 3.13
* Fix link
* Rework query results caching docs, add generated examples
  Partially apply changes to 3.11
* Add cross-reference and version remark
* Add another version remark
* Add some missing options that influence the query explain response
* Only affected plan cache entries may get removed in single server
* [skip ci] Automatic commit of generated files from CircleCI
1 parent 073e9ed commit 33792b6

31 files changed

+2550
-532
lines changed

README.md

Lines changed: 4 additions & 4 deletions
@@ -169,7 +169,7 @@ Apple silicon like M1).
169169

170170
Run the `docker compose` services using the `docker-compose.plain-build.yml` file.
171171

172-
```shell
172+
```sh
173173
docs-hugo/toolchain/docker/amd64> docker compose -f docker-compose.plain-build.yml up --abort-on-container-exit
174174
```
175175

@@ -179,7 +179,7 @@ To make the documentation tooling not start a live server in watch mode but
179179
rather create a static build and exit, set the environment variable `ENV` to
180180
any value other than `local` before calling `docker compose ...`:
181181

182-
```shell
182+
```sh
183183
export ENV=static # Bash
184184
set -xg ENV static # Fish
185185
$Env:ENV='static' # PowerShell
@@ -215,7 +215,7 @@ The generators entry is a space-separated string.
215215
If `metrics` or `error-codes` is in the `generators` string, the following
216216
environment variable has to be exported:
217217

218-
```shell
218+
```sh
219219
export ARANGODB_SRC_{VERSION}=path/to/arangodb/source
220220
```
221221

@@ -246,7 +246,7 @@ Apple silicon like M1).
246246

247247
Run the `docker compose` services without specifying a file:
248248

249-
```shell
249+
```sh
250250
docs-hugo/toolchain/docker/arm64> docker compose up --abort-on-container-exit
251251
```
252252

site/content/3.11/aql/execution-and-performance/caching-query-results.md

Lines changed: 76 additions & 73 deletions
@@ -21,48 +21,48 @@ are not part of a cluster setup.
2121

2222
The cache can be operated in the following modes:
2323

24-
- `off`: the cache is disabled. No query results will be stored
25-
- `on`: the cache will store the results of all AQL queries unless their `cache`
26-
attribute flag is set to `false`
27-
- `demand`: the cache will store the results of AQL queries that have their
28-
`cache` attribute set to `true`, but will ignore all others
24+
- `off`: The cache is disabled. No query results are stored.
25+
- `on`: The cache stores the results of all AQL queries unless the `cache`
26+
query option is set to `false`.
27+
- `demand`: The cache stores the results of AQL queries that have the
28+
`cache` query option set to `true` but ignores all others.
2929

30-
The mode can be set at server startup and later changed at runtime.
30+
The mode can be set at server startup as well as at runtime, see
31+
[Global configuration](#global-configuration).
3132

3233
## Query eligibility
3334

34-
The query results cache will consider two queries identical if they have exactly the
35+
The query results cache considers two queries identical if they have exactly the
3536
same query string and the same bind variables. Any deviation in terms of whitespace,
36-
capitalization etc. will be considered a difference. The query string will be hashed
37-
and used as the cache lookup key. If a query uses bind parameters, these will also be hashed
38-
and used as part of the cache lookup key.
39-
40-
That means even if the query strings of two queries are identical, the query results
41-
cache will treat them as different queries if they have different bind parameter
42-
values. Other components that will become part of a query's cache key are the
43-
`count`, `fullCount` and `optimizer` attributes.
44-
45-
If the cache is turned on, the cache will check at the very start of execution
46-
whether it has a result ready for this particular query. If that is the case,
47-
the query result will be served directly from the cache, which is normally
48-
very efficient. If the query cannot be found in the cache, it will be executed
37+
capitalization etc. is considered a difference. The query string is hashed
38+
and used as the cache lookup key. If a query uses bind parameters, these are also
39+
hashed and used as part of the cache lookup key.
40+
41+
Even if the query strings of two queries are identical, the query results cache
42+
treats them as different queries if they have different bind parameter
43+
values. Other components that become part of a query's cache key are the
44+
`count`, `fullCount`, and `optimizer` attributes.
45+
46+
If the cache is enabled, it is checked whether it has a result ready for a
47+
particular query at the very start of processing the query request. If this is
48+
the case, the query result is served directly from the cache, which is normally
49+
very efficient. If the query cannot be found in the cache, it is executed
4950
as usual.
5051

51-
If the query is eligible for caching and the cache is turned on, the query
52-
result will be stored in the query results cache so it can be used for subsequent
52+
If the query is eligible for caching and the cache is enabled, the query
53+
result is stored in the query results cache so it can be used for subsequent
5354
executions of the same query.
5455

5556
A query is eligible for caching only if all of the following conditions are met:
5657

57-
- the server the query executes on is a single server (i.e. not part of a cluster)
58-
- the query string is at least 8 characters long
59-
- the query is a read-only query and does not modify data in any collection
60-
- no warnings were produced while executing the query
61-
- the query is deterministic and only uses deterministic functions whose results
62-
are marked as cacheable
63-
- the size of the query result does not exceed the cache's configured maximal
64-
size for individual cache results or cumulated results
65-
- the query is not executed using a streaming cursor
58+
- The server the query executes on is a single server (i.e. not part of a cluster).
59+
- The query is a read-only query and does not modify data in any collection.
60+
- No warnings were produced while executing the query.
61+
- The query is deterministic and only uses deterministic functions whose results
62+
are marked as cacheable.
63+
- The size of the query result does not exceed the cache's configured maximal
64+
size for individual cache results or cumulated results.
65+
- The query is not executed using a streaming cursor (`"stream": true` query option).
6666

6767
The usage of non-deterministic functions leads to a query not being cacheable.
6868
This is intentional to avoid caching of function results which should rather
@@ -85,8 +85,8 @@ remove, truncate operations as well as AQL data-modification queries).
8585
**Example**
8686

8787
If the result of the following query is present in the query results cache,
88-
then either modifying data in collection `users` or in collection `organizations`
89-
will remove the already computed result from the cache:
88+
then either modifying data in the `users` or `organizations` collection
89+
removes the already computed result from the cache:
9090

9191
```aql
9292
FOR user IN users
@@ -95,42 +95,42 @@ FOR user IN users
9595
RETURN { user: user, organization: organization }
9696
```
9797

98-
Modifying data in other collections than the named two will not lead to this
98+
Modifying data in other unrelated collections does not lead to this
9999
query result being removed from the cache.
100100

101101
## Performance considerations
102102

103103
The query results cache is organized as a hash table, so looking up whether a query result
104-
is present in the cache is relatively fast. Still, the query string and the bind
105-
parameter used in the query will need to be hashed. This is a slight overhead that
106-
will not be present if the cache is turned off or a query is marked as not cacheable.
104+
is present in the cache is fast. Still, the query string and the bind
105+
parameter used in the query need to be hashed. This is a slight overhead that
106+
is not present if the cache is disabled or a query is marked as not cacheable.
107107

108108
Additionally, storing query results in the cache and fetching results from the
109-
cache requires locking via an R/W lock. While many thread can read in parallel from
109+
cache requires locking via a read/write lock. While many threads can read in parallel from
110110
the cache, there can only be a single modifying thread at any given time. Modifications
111111
of the query cache contents are required when a query result is stored in the cache
112112
or during cache invalidation after data-modification operations. Cache invalidation
113-
will require time proportional to the number of cached items that need to be invalidated.
113+
requires time proportional to the number of cached items that need to be invalidated.
114114

115-
There may be workloads in which enabling the query results cache will lead to a performance
115+
There may be workloads in which enabling the query results cache leads to a performance
116116
degradation. It is not recommended to turn the query results cache on in workloads that only
117-
modify data, or that modify data more often than reading it. Turning on the cache
118-
will also provide no benefit if queries are very diverse and do not repeat often.
119-
In read-only or read-mostly workloads, the cache will be beneficial if the same
117+
modify data, or that modify data more often than reading it. Enabling the cache
118+
also provides no benefit if queries are very diverse and do not repeat often.
119+
In read-only or read-mostly workloads, the cache is beneficial if the same
120120
queries are repeated lots of times.
121121

122-
In general, the query results cache will provide the biggest improvements for queries with
122+
In general, the query results cache provides the biggest improvements for queries with
123123
small result sets that take long to calculate. If query results are very big and
124124
most of the query time is spent on copying the result from the cache to the client,
125-
then the cache will not provide much benefit.
125+
then the cache does not provide much benefit.
126126

127127
## Global configuration
128128

129-
The query results cache can be configured at server start using the configuration parameter
130-
`--query.cache-mode`. This will set the cache mode according to the descriptions
131-
above.
129+
The query results cache can be configured at server start with the
130+
[`--query.cache-mode`](../../components/arangodb-server/options.md#--querycache-mode)
131+
startup option.
132132

133-
After the server is started, the cache mode can be changed at runtime as follows:
133+
The cache mode can also be changed at runtime using the JavaScript API as follows:
134134

135135
```js
136136
require("@arangodb/aql/cache").properties({ mode: "on" });
@@ -139,10 +139,10 @@ require("@arangodb/aql/cache").properties({ mode: "on" });
139139
The maximum number of cached results in the cache for each database can be configured
140140
at server start using the following configuration parameters:
141141

142-
- `--query.cache-entries`: maximum number of results in query result cache per database
143-
- `--query.cache-entries-max-size`: maximum cumulated size of results in query result cache per database
144-
- `--query.cache-entry-max-size`: maximum size of an individual result entry in query result cache
145-
- `--query.cache-include-system-collections`: whether or not to include system collection queries in the query result cache
142+
- `--query.cache-entries`: The maximum number of results in the query results cache per database
143+
- `--query.cache-entries-max-size`: The maximum cumulated size of results in the query results cache per database
144+
- `--query.cache-entry-max-size`: The maximum size of an individual result entry in the query results cache
145+
- `--query.cache-include-system-collections`: Whether to include system collection queries in the query results cache
146146

147147
These parameters can be used to put an upper bound on the number and size of query
148148
results in each database's query cache and thus restrict the cache's memory consumption.
@@ -158,44 +158,47 @@ require("@arangodb/aql/cache").properties({
158158
});
159159
```
160160

161-
The above will limit the number of cached results in the query results cache to 200
162-
results per database, and to 8 MB cumulated query result size per database. The maximum
163-
size of each query cache entry is restricted to 8MB. Queries that involve system
161+
The above settings limit the number of cached results in the query results cache to 200
162+
results per database, and to 8 MiB cumulated query result size per database. The maximum
163+
size of each query cache entry is restricted to 1 MiB. Queries that involve system
164164
collections are excluded from caching.
165165

166+
You can also change the configuration at runtime with the
167+
[HTTP API](../../develop/http-api/queries/aql-query-results-cache.md).
168+
166169
## Per-query configuration
167170

168171
When a query is sent to the server for execution and the cache is set to `on` or `demand`,
169-
the query executor will look into the query's `cache` attribute. If the query cache mode is
170-
`on`, then not setting this attribute or setting it to anything but `false` will make the
171-
query executor consult the query cache. If the query cache mode is `demand`, then setting
172-
the `cache` attribute to `true` will make the executor look for the query in the query cache.
173-
When the query cache mode is `off`, the executor will not look for the query in the cache.
172+
the query executor checks the query's `cache` option. If the query cache mode is
173+
`on`, then not setting this query option or setting it to anything but `false` makes the
174+
query executor consult the query results cache. If the query cache mode is `demand`, then setting
175+
the `cache` option to `true` makes the executor look for the query in the query results cache.
176+
When the query cache mode is `off`, the executor does not look for the query in the cache.
174177

175178
The `cache` attribute can be set as follows via the `db._createStatement()` function:
176179

177180
```js
178181
var stmt = db._createStatement({
179182
query: "FOR doc IN users LIMIT 5 RETURN doc",
180-
cache: true /* cache attribute set here */
181-
});
183+
options: {
184+
cache: true
185+
}
186+
});
182187

183188
stmt.execute();
184189
```
185190

186191
When using the `db._query()` function, the `cache` attribute can be set as follows:
187192

188193
```js
189-
db._query({
190-
query: "FOR doc IN users LIMIT 5 RETURN doc",
191-
cache: true /* cache attribute set here */
192-
});
194+
db._query("FOR doc IN users LIMIT 5 RETURN doc", {}, { cache: true });
193195
```
194196

195-
The `cache` attribute can be set via the HTTP REST API `POST /_api/cursor`, too.
197+
You can also set the `cache` query option in the
198+
[HTTP API](../../develop/http-api/queries/aql-queries.md#create-a-cursor).
196199

197-
Each query result returned will contain a `cached` attribute. This will be set to `true`
198-
if the result was retrieved from the query cache, and `false` otherwise. Clients can use
200+
Each query result returned contains a `cached` attribute. It is set to `true`
201+
if the result was retrieved from the query results cache, and `false` otherwise. Clients can use
199202
this attribute to check if a specific query was served from the cache or not.
200203

201204
## Query results cache inspection
@@ -207,7 +210,7 @@ The contents of the query results cache can be checked at runtime using the cach
207210
require("@arangodb/aql/cache").toArray();
208211
```
209212

210-
This will return a list of all query results stored in the current database's query
213+
This returns a list of all query results stored in the current database's query
211214
results cache.
212215

213216
The query results cache for the current database can be cleared at runtime using the
@@ -221,5 +224,5 @@ require("@arangodb/aql/cache").clear();
221224

222225
Query results that are returned from the query results cache may contain execution statistics
223226
stemming from the initial, uncached query execution. This means for a cached query result,
224-
the *extra.stats* attribute may contain stale data, especially in terms of the *executionTime*
225-
and *profile* attribute values.
227+
the `extra.stats` attribute may contain stale data, especially in terms of the `executionTime`
228+
and `profile` attribute values.
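
Review note on the "Query eligibility" text in the file above: the rule that two queries only share a cache entry if the query string, the bind variables, and the `count`, `fullCount`, and `optimizer` attributes all match can be illustrated with a small model. This is a hedged sketch in plain Node.js, not ArangoDB's implementation. The names `stableStringify` and `cacheKey`, the use of SHA-256, and the sorting of object keys are assumptions made only for illustration.

```js
const { createHash } = require("node:crypto");

// Hypothetical helper: serialize a value with sorted object keys so that the
// sketch produces the same string for the same data (an assumption of this
// model, not documented server behavior).
function stableStringify(value) {
  if (Array.isArray(value)) {
    return "[" + value.map(stableStringify).join(",") + "]";
  }
  if (value !== null && typeof value === "object") {
    return "{" + Object.keys(value).sort()
      .map((k) => JSON.stringify(k) + ":" + stableStringify(value[k]))
      .join(",") + "}";
  }
  return JSON.stringify(value);
}

// Hypothetical model of a cache lookup key. The query string is used verbatim,
// so any difference in whitespace or capitalization yields a different key.
function cacheKey(queryString, bindVars = {}, options = {}) {
  const parts = {
    query: queryString,
    bindVars,
    count: Boolean(options.count),
    fullCount: Boolean(options.fullCount),
    optimizer: options.optimizer || {},
  };
  return createHash("sha256").update(stableStringify(parts)).digest("hex");
}

const a = cacheKey("FOR u IN users FILTER u.age > @min RETURN u", { min: 18 });
const b = cacheKey("FOR u IN users FILTER u.age > @min RETURN u", { min: 21 });
const c = cacheKey("FOR u IN users  FILTER u.age > @min RETURN u", { min: 18 });
console.log(a === b); // same query string, different bind values
console.log(a === c); // same bind values, extra whitespace in the query string
```

In this model both comparisons are `false`, which mirrors the documented behavior that different bind parameter values or any textual deviation make the cache treat the queries as different.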

site/content/3.11/aql/how-to-invoke-aql/with-arangosh.md

Lines changed: 24 additions & 26 deletions
@@ -195,26 +195,6 @@ db._query(
195195
).toArray(); // Each batch needs to be fetched within 5 seconds
196196
```
197197

198-
#### `cache`
199-
200-
Whether the AQL query results cache shall be used. If set to `false`, then any
201-
query cache lookup is skipped for the query. If set to `true`, it leads to the
202-
query cache being checked for the query **if** the query cache mode is either
203-
set to `on` or `demand`.
204-
205-
```js
206-
---
207-
name: 02_workWithAQL_cache
208-
description: ''
209-
---
210-
db._query(
211-
'FOR i IN 1..20 RETURN i',
212-
{},
213-
{ cache: true },
214-
{}
215-
); // result may get taken from cache
216-
```
217-
218198
#### `memoryLimit`
219199

220200
To set a memory limit for the query, pass `options` to the `_query()` method.
@@ -274,12 +254,30 @@ don't need to set it on a per-query level.
274254

275255
#### `cache`
276256

277-
If you set `cache` to `true`, this puts the query result into the query result cache
278-
if the query result is eligible for caching and the query cache is running in demand
279-
mode. If set to `false`, the query result is not inserted into the query result
280-
cache. Note that query results are never inserted into the query result cache if
281-
the query result cache is disabled, and that they are automatically inserted into
282-
the query result cache if it is active in non-demand mode.
257+
Whether the [AQL query results cache](../execution-and-performance/caching-query-results.md)
258+
shall be used for adding as well as for retrieving results.
259+
260+
If the query cache mode is set to `demand` and you set the `cache` query option
261+
to `true` for a query, then its query result is cached if it's eligible for
262+
caching. If the query cache mode is set to `on`, query results are automatically
263+
cached if they are eligible for caching unless you set the `cache` option to `false`.
264+
265+
If you set the `cache` option to `false`, then any query cache lookup is skipped
266+
for the query. If you set it to `true`, the query cache is checked for a cached result
267+
**if** the query cache mode is either set to `on` or `demand`.
268+
269+
```js
270+
---
271+
name: 02_workWithAQL_cache
272+
description: ''
273+
---
274+
var resultCache = require("@arangodb/aql/cache");
275+
resultCache.properties({ mode: "demand" });
276+
~resultCache.clear();
277+
db._query("FOR i IN 1..5 RETURN i", {}, { cache: true }); // Adds result to cache
278+
db._query("FOR i IN 1..5 RETURN i", {}, { cache: true }); // Retrieves result from cache
279+
db._query("FOR i IN 1..5 RETURN i", {}, { cache: false }); // Bypasses the cache
280+
```
283281

284282
#### `fillBlockCache`
285283
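
Review note on the reworked `cache` option description in this file: the interaction between the global cache mode (`off`, `on`, `demand`) and the per-query `cache` option can be summarized as a small decision function. This is a minimal sketch in plain JavaScript that models the documented rules. The function name `usesResultsCache` is hypothetical and not part of any ArangoDB API.

```js
// Hypothetical model: returns true if the query results cache is used for a
// query, given the global cache mode and the per-query `cache` option.
function usesResultsCache(mode, cacheOption) {
  switch (mode) {
    case "on":
      // Used unless the query explicitly opts out with `cache: false`.
      return cacheOption !== false;
    case "demand":
      // Used only if the query explicitly opts in with `cache: true`.
      return cacheOption === true;
    case "off":
      // The per-query option has no effect while the cache is disabled.
      return false;
    default:
      throw new Error("unknown cache mode: " + mode);
  }
}

for (const mode of ["off", "on", "demand"]) {
  for (const option of [undefined, true, false]) {
    console.log(mode, String(option), usesResultsCache(mode, option));
  }
}
```

The loop prints the full decision table, which can be compared against the three `db._query()` calls in the generated example above (mode `demand` with `cache: true` twice, then `cache: false`).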
