Problem
We see some customers repeatedly sending the same request (with the same parameters) to the GET /api/v1/label/<label_name>/values endpoint (doc).
Such requests can be heavy, in particular when the start and end parameters are missing, because by default Prometheus queries the whole time range (from the beginning of the universe until the end of it).
Mimir supports the -store.max-labels-query-length limit, which limits the time range (end - start time) of series, label names and label values queries. The default is 0, which means "no limit". Moreover, this limit is currently enforced only when querying the store-gateways, not the ingesters. At Grafana Labs we set this limit to 768h (32 days).
Proposal
Similarly to #5212, I propose to add support for a (short-lived) query results cache for the label values API endpoint.
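Cache key
The "label values" API parameters are (doc):
start: Start timestamp. Optional.
end: End timestamp. Optional.
match[]: Repeated series selector argument that selects the series from which to read the label values. Optional.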
I propose to add all parameters to the cache key.
Both TSDB (and so the Mimir ingester) and the Mimir store-gateway read the label values from the blocks overlapping the start and end time range. This means that the maximum granularity is the block. Since the smallest blocks we have are 2h blocks, I propose to align the start and end time to 2h boundaries when computing the cache key. This means that, if we receive two requests with different start/end times that fall within the same 2h range (aligned to block boundaries), the cache key for the two requests is the same (assuming the matchers are the same as well).
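To make the alignment concrete, here is a minimal sketch in Python (not Mimir code) of how such a cache key could be derived. Several details are assumptions made only for illustration: start is rounded down and end is rounded up so the aligned range always covers the requested one, a missing start/end is replaced by a fixed sentinel, the tenant ID is part of the key, and the match[] selectors are sorted because their order should not change the result. The function names are hypothetical.

```python
import hashlib

# 2h in milliseconds: the smallest block range mentioned in the proposal.
BLOCK_RANGE_MS = 2 * 60 * 60 * 1000


def align_down(ts_ms: int) -> int:
    """Round a millisecond timestamp down to the previous 2h boundary."""
    return ts_ms - ts_ms % BLOCK_RANGE_MS


def align_up(ts_ms: int) -> int:
    """Round a millisecond timestamp up to the next 2h boundary."""
    return -(-ts_ms // BLOCK_RANGE_MS) * BLOCK_RANGE_MS


def label_values_cache_key(tenant, label_name, start_ms, end_ms, matchers):
    """Build a cache key from all request parameters (hypothetical helper).

    start_ms / end_ms may be None when the parameter is omitted.
    """
    # Assumption: a missing bound maps to a fixed sentinel, so all
    # "unbounded" requests share the same key component.
    start = "min" if start_ms is None else str(align_down(start_ms))
    end = "max" if end_ms is None else str(align_up(end_ms))
    # Assumption: selector order is irrelevant, so sort for a canonical form.
    canonical_matchers = "\x00".join(sorted(matchers))
    raw = "\x00".join(
        [tenant, "label_values", label_name, start, end, canonical_matchers]
    )
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()
```

With this sketch, two requests whose start and end fall within the same aligned 2h range produce the same key, while a request whose end crosses into the next 2h range produces a different one.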
Cache TTL
Similarly to #5212, I propose to add a new per-tenant configuration option to set the caching TTL for "label query".
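As an illustration of what a per-tenant TTL means for such a cache, below is a minimal in-memory sketch in Python. It is not how the query-frontend cache is implemented: the class name, the ttl_by_tenant mapping, the 60 second default, and the rule that a non-positive TTL disables caching for a tenant are all assumptions made for the example.

```python
import time


class LabelQueryCache:
    """Toy in-memory cache with a per-tenant TTL (hypothetical sketch)."""

    def __init__(self, ttl_by_tenant, default_ttl=60.0, clock=time.monotonic):
        self._ttl_by_tenant = dict(ttl_by_tenant)  # tenant -> TTL in seconds
        self._default_ttl = default_ttl            # used for unlisted tenants
        self._clock = clock                        # injectable for testing
        self._entries = {}                         # (tenant, key) -> (expires_at, value)

    def ttl_for(self, tenant):
        """Return the TTL configured for the tenant, or the default."""
        return self._ttl_by_tenant.get(tenant, self._default_ttl)

    def get(self, tenant, key):
        """Return the cached value, or None if absent or expired."""
        entry = self._entries.get((tenant, key))
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Expired entries are dropped lazily on lookup.
            del self._entries[(tenant, key)]
            return None
        return value

    def set(self, tenant, key, value):
        """Store a value using the tenant's TTL."""
        ttl = self.ttl_for(tenant)
        if ttl <= 0:
            # Assumption: a non-positive TTL disables caching for the tenant.
            return
        self._entries[(tenant, key)] = (self._clock() + ttl, value)
```

A response stored for a tenant is served until that tenant's TTL elapses, so the TTL is also the upper bound on how stale a cached response can be.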
Extension to label names API endpoint
The same exact approach can be adopted for the "label names" API endpoint (docs).
pracucci changed the title from "Proposal: cache label values API response in the query results cache" to "Proposal: cache label names/values API response in the query results cache" on Jul 1, 2023
> Since the smallest blocks we have are 2h blocks, I propose to align the start and end time to 2h boundaries when computing the cache key.
So, do I understand correctly that without start/end, this cached value will only be available for 2h? What about recent data: are we going to cache the data relative to the head block? And what would happen to OOO writes (especially for customers with months of OOO)?
Also see #457, as this cache will overlap with the one implemented in #590.
> What about recent data, are we going to cache the data relative to the head block?
We're going to cache the whole response (this is done by the query-frontend). Think of this cache as a cache with a 1m TTL, so it may return results that are up to 1m stale. Same as #5212.