Proposal: cache label names/values API response in the query results cache #5395

Closed
pracucci opened this issue Jul 1, 2023 · 3 comments · Fixed by #5426

@pracucci (Collaborator) commented Jul 1, 2023

Problem

We see some customers repeatedly sending the same requests (with the same parameters) to the GET /api/v1/label/<label_name>/values endpoint (doc).

Such requests can be heavy, in particular when the start and end parameters are missing, because by default Prometheus queries the whole time range (from the beginning of the universe until the end of it).

Mimir supports the -store.max-labels-query-length limit, which limits the time range (end - start time) of series, label names and label values queries. The default is 0, which means "no limit". Moreover, this limit is currently enforced only when querying the store-gateways, not the ingesters. At Grafana Labs we set this limit to 768h (32 days).

Proposal

Similarly to #5212, I propose adding support for a (short-lived) query results cache for the label values API endpoint.

Cache key

The "label values" API parameters are (doc):

  • start: Start timestamp. Optional.
  • end: End timestamp. Optional.
  • match[]: Repeated series selector argument that selects the series from which to read the label values. Optional.

I propose to add all parameters to the cache key.

Both TSDB (so the Mimir ingester) and the Mimir store-gateway query label values out of the blocks overlapping the start and end time. This means that the maximum granularity is the block. Since the smallest blocks we have are 2h blocks, I propose to align the start and end time to 2h boundaries when computing the cache key. This means that if we receive two requests with different start/end times but within the same 2h range (aligned to block boundaries), the cache key for the two requests is the same (assuming the matchers are the same as well).
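
For illustration, here is a minimal Go sketch of how such an aligned cache key could be computed; the function name and key format are hypothetical, not the actual Mimir implementation:

```go
package main

import (
	"fmt"
	"strings"
	"time"
)

// The smallest blocks are 2h, so align cache keys to 2h boundaries.
const blockBoundary = 2 * time.Hour

// alignedLabelValuesCacheKey is a hypothetical helper building a cache key for
// a "label values" request. Start is aligned down and end is aligned up to 2h
// boundaries, so requests falling within the same block range share a key.
func alignedLabelValuesCacheKey(userID, labelName string, start, end time.Time, matchers []string) string {
	alignedStart := start.Truncate(blockBoundary)
	alignedEnd := end.Truncate(blockBoundary)
	if !alignedEnd.Equal(end) {
		// Align the end upwards so the aligned range always covers the requested one.
		alignedEnd = alignedEnd.Add(blockBoundary)
	}
	return fmt.Sprintf("label_values:%s:%s:%d:%d:%s",
		userID, labelName, alignedStart.Unix(), alignedEnd.Unix(), strings.Join(matchers, ","))
}

func main() {
	// Two requests with different start/end but within the same 2h-aligned
	// range (and the same matchers) produce the same cache key.
	a := alignedLabelValuesCacheKey("tenant-1", "job", time.Unix(3600, 0), time.Unix(7000, 0), []string{`up{cluster="dev"}`})
	b := alignedLabelValuesCacheKey("tenant-1", "job", time.Unix(3700, 0), time.Unix(7100, 0), []string{`up{cluster="dev"}`})
	fmt.Println(a == b) // true
}
```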

Cache TTL

Similarly to #5212, I propose to add a new per-tenant configuration option to set the caching TTL for label queries.
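
As an illustration only, the per-tenant option could look like the sketch below; the field and YAML names are hypothetical placeholders, modelled on the style of other per-tenant limits, not a decision on the final naming:

```go
package validation

import "github.com/prometheus/common/model"

// Limits sketches the proposed per-tenant option; names are illustrative only.
type Limits struct {
	// TTL applied to cached "label names" / "label values" responses.
	// A short TTL (for example 1m) bounds how stale a cached response can be.
	ResultsCacheTTLForLabelsQuery model.Duration `yaml:"results_cache_ttl_for_labels_query"`
}
```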

Extension to label names API endpoint

The same exact approach can be adopted for the "label names" API endpoint (docs).

@pracucci pracucci changed the title Proposal: cache label values API response in the query results cache Proposal: cache label names/values API response in the query results cache Jul 1, 2023
@pracucci pracucci self-assigned this Jul 1, 2023
@colega (Contributor) commented Jul 4, 2023

Since the smallest blocks we have are 2h blocks, I propose to align the start and end time to 2h boundaries when computing the cache key.

So, I understand that without start/end, this cached value will only be available for 2h? What about recent data, are we going to cache the data relative to the head block? And what would happen to OOO writes (especially for customers with months of OOO)?

Also see #457, as this cache will overlap with the one implemented in #590.

@pracucci (Collaborator, Author) commented Jul 5, 2023

What about recent data, are we going to cache the data relative to the head block?

We're going to cache the whole response (the caching is done by the query-frontend). Think of this cache as one with a 1m TTL, so it may return stale results for up to 1m. Same as #5212.

@colega (Contributor) commented Jul 5, 2023

Oh okay, it's a 1-minute TTL cache. Fine! 👍
