-
Notifications
You must be signed in to change notification settings - Fork 885
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Constify cagg_watermark
not working using window functions when querying a CAgg
#6722
Comments
Just a thought, maybe it's too naiive: Instead of manipulating the query by replacing the function call with a constant value, would it be feasible introduce another function This would keep the possibility to query the latest value via Query results should be always correct because the watermark is just an optimization - if an older one is used, more real-time values have to be calculated. The only risk would be that some values have already been materialized but wouldn't be used. The impact of this depends on how long the value is cached for |
CREATE OR REPLACE FUNCTION cagg_watermark_const(int4)
RETURNS int8
IMMUTABLE
AS $$ SELECT _timescaledb_functions.cagg_watermark($1) $$
LANGUAGE SQL; Then take the generated view definition for a real-time continuous aggregate and create a new view from it which uses the immutable function. Queries using that view with window functions are much faster. The question is just for how long and for which scope the result of the function will be cached. If it's within the same session or even only for the same prepared statement, then the potential performance hit of using a too-low watermark should be neglectable. |
Looks like this is a bug. We're preventing constify the cagg_watemark function call if there are Window Functions in the query tree: https://github.com/timescale/timescaledb/blob/main/tsl/src/continuous_aggs/planner.c#L196 The assumption for doing that is because nowadays we don't allow window functions in the cagg definition but in your example is when querying a cagg using window functions that should be fine to constify it. Here a reproducible test case: CREATE TABLE metrics(time timestamptz NOT NULL, device_id int, value float);
SELECT create_hypertable('metrics', 'time');
-- 10 chunks
INSERT INTO metrics
SELECT generate_series('2024-02-01 00:00:00-03'::timestamptz, '2024-02-01 00:00:00-03'::timestamptz + interval '9 months', interval '1 month'), 1, 1;
CREATE MATERIALIZED VIEW metrics_by_day WITH (timescaledb.continuous, timescaledb.materialized_only=false) AS
SELECT time_bucket('1 day', time) AS bucket, device_id, count(*), avg(value) FROM metrics GROUP BY 1, 2;
EXPLAIN SELECT avg(value) FROM metrics_by_day;
-- Constify is not working here
EXPLAIN SELECT bucket, lead(avg) OVER (ORDER BY bucket) FROM metrics_by_day; |
cagg_watermark
when using window functionscagg_watermark
not working using window functions when querying a CAgg
When querying a realtime Continuous Aggregate using window functions the new planner optimization to constify the `cagg_watermark` function call is not working because we're checking for window function usage on the query tree but actually this is a current limitation when creating a new Continuous Aggregate, but users can use window functions when querying it. Fixed it by removing the check for `query->hasWindowFuncs` to prevent the process of constification of the `cagg_watermak` function call. Fixes timescale#6722
When querying a realtime Continuous Aggregate using window functions the new planner optimization to constify the `cagg_watermark` function call is not working because we're checking for window function usage on the query tree but actually this is a current limitation when creating a new Continuous Aggregate, but users can use window functions when querying it. Fixed it by removing the check for `query->hasWindowFuncs` to prevent the process of constification of the `cagg_watermak` function call. Fixes timescale#6722
When querying a realtime Continuous Aggregate using window functions the new planner optimization to constify the `cagg_watermark` function call is not working because we're checking for window function usage on the query tree but actually this is a current limitation when creating a new Continuous Aggregate, but users can use window functions when querying it. Fixed it by removing the check for `query->hasWindowFuncs` to prevent the process of constification of the `cagg_watermak` function call. Fixes timescale#6722
Thanks for the fast fix! Looking forward to using it in one of the next releases.
Do you know if there are any plans to support that in the future? My use-case is that I store a counter values (energy meter reading) and would like to calculate the usage per hour, defined as |
When querying a realtime Continuous Aggregate using window functions the new planner optimization to constify the `cagg_watermark` function call is not working because we're checking for window function usage on the query tree but actually this is a current limitation when creating a new Continuous Aggregate, but users can use window functions when querying it. Fixed it by removing the check for `query->hasWindowFuncs` to prevent the process of constification of the `cagg_watermak` function call. Fixes #6722
When querying a realtime Continuous Aggregate using window functions the new planner optimization to constify the `cagg_watermark` function call is not working because we're checking for window function usage on the query tree but actually this is a current limitation when creating a new Continuous Aggregate, but users can use window functions when querying it. Fixed it by removing the check for `query->hasWindowFuncs` to prevent the process of constification of the `cagg_watermak` function call. Fixes #6722 (cherry picked from commit a1cd7c4)
You're welcome!
Supporting window functions are in our plans but I don't have an ETA yet. |
When querying a realtime Continuous Aggregate using window functions the new planner optimization to constify the `cagg_watermark` function call is not working because we're checking for window function usage on the query tree but actually this is a current limitation when creating a new Continuous Aggregate, but users can use window functions when querying it. Fixed it by removing the check for `query->hasWindowFuncs` to prevent the process of constification of the `cagg_watermak` function call. Fixes #6722 (cherry picked from commit a1cd7c4)
Hey @fabriziomello! I'm still seeing issues where this is really fast: EXPLAIN ANALYSE SELECT
rollup(heartbeat_agg)
FROM vp_heartbeats_daily_agg_2m
WHERE b >= '2023-05-01 00:00:00'
AND b < '2024-04-07 23:59:59'
AND vision_program_id = 6149; but this is really slow: EXPLAIN ANALYSE SELECT
live_ranges(rollup(heartbeat_agg))
FROM vp_heartbeats_daily_agg_2m
WHERE b >= '2023-05-01 00:00:00'
AND b < '2024-04-07 23:59:59'
AND vision_program_id = 6149; i.e. the addition of the I'm wondering if the above fix would address this or if I should open a new issue? |
What type of enhancement is this?
Performance
What subsystems and features will be improved?
Continuous aggregate
What does the enhancement do?
The changes done in #6325 have greatly improved performance for various of my use-cases. However, at least one of them seems to have been missed as of 2.14.2. I hope it can be added in the same way.
Suppose there's a continuous real-time aggregate
test
. This trivial query shows that the watermark is replaced by a constant:The output contains:
But if a window function is used:
The output says that the function is called:
Implementation challenges
No response
The text was updated successfully, but these errors were encountered: