Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix option "timescaledb.create_group_indexes" #4255

Merged
merged 1 commit into from
Apr 26, 2022

Conversation

konskov
Copy link
Contributor

@konskov konskov commented Apr 20, 2022

Previously this option was ignored when creating a
continuous aggregate, even when explicitly set to true.

Fixes #4249

@konskov konskov self-assigned this Apr 20, 2022
@codecov
Copy link

codecov bot commented Apr 20, 2022

Codecov Report

Merging #4255 (8323ce2) into main (fca865c) will decrease coverage by 0.02%.
The diff coverage is 97.67%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #4255      +/-   ##
==========================================
- Coverage   90.59%   90.57%   -0.03%     
==========================================
  Files         215      215              
  Lines       39740    39758      +18     
==========================================
+ Hits        36004    36009       +5     
- Misses       3736     3749      +13     
Impacted Files Coverage Δ
src/hypertable.c 87.83% <ø> (ø)
src/process_utility.c 94.47% <ø> (ø)
tsl/src/remote/txn.c 88.77% <ø> (ø)
tsl/test/src/remote/async.c 100.00% <ø> (ø)
tsl/src/nodes/decompress_chunk/decompress_chunk.c 94.90% <94.73%> (-0.09%) ⬇️
tsl/src/compression/compress_utils.c 94.81% <100.00%> (ø)
tsl/src/compression/compression.c 95.51% <100.00%> (+0.02%) ⬆️
tsl/src/continuous_aggs/create.c 87.23% <100.00%> (+0.01%) ⬆️
tsl/src/nodes/decompress_chunk/exec.c 96.51% <100.00%> (+0.12%) ⬆️
tsl/test/src/remote/txn_resolve.c 100.00% <100.00%> (ø)
... and 6 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 3e042bd...8323ce2. Read the comment docs.

@konskov konskov marked this pull request as ready for review April 20, 2022 14:11
@konskov konskov requested a review from a team as a code owner April 20, 2022 14:11
@konskov konskov requested review from afiskon, duncan-tsdb and mfundul and removed request for a team April 20, 2022 14:11
@mfundul
Copy link
Contributor

mfundul commented Apr 20, 2022

You should fix the commit message.

Copy link
Member

@svenklemm svenklemm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dont forget to add CHANGELOG entry

Comment on lines 1285 to 1287
SELECT format('%I.%I', materialization_hypertable_schema,
materialization_hypertable_name) AS mat_hyper_cagg_index_false
FROM timescaledb_information.continuous_aggregates where view_name like 'cagg_index_false';

\gset
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
SELECT format('%I.%I', materialization_hypertable_schema,
materialization_hypertable_name) AS mat_hyper_cagg_index_false
FROM timescaledb_information.continuous_aggregates where view_name like 'cagg_index_false';
\gset
SELECT format('%I.%I', materialization_hypertable_schema,
materialization_hypertable_name) AS mat_hyper_cagg_index_false
FROM timescaledb_information.continuous_aggregates where view_name like 'cagg_index_false' \gset

To speed up the queries, instead of executing them twice.

Comment on lines 1754 to 1760
insert into test_group_idx
(select generate_series(
'2020-01-01',
'2020-02-25', INTERVAL '30 sec'
),
round(random()*10),
random()*5);
Copy link
Contributor

@fabriziomello fabriziomello Apr 20, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
insert into test_group_idx
(select generate_series(
'2020-01-01',
'2020-02-25', INTERVAL '30 sec'
),
round(random()*10),
random()*5);
insert into test_group_idx
select t, round(random()*10), random()*5
from generate_series('2020-01-01', '2020-02-25', INTERVAL '12 hours') t;

Why too many rows produced for this test?? Do we really need it??? What about use 12 hours so it will produce two rows for each time bucket.

Comment on lines 1790 to 1804
\d+ :mat_hyper_cagg_index_true
Table "_timescaledb_internal._materialized_hypertable_49"
Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
----------+--------------------------+-----------+----------+---------+----------+--------------+-------------
bucket | timestamp with time zone | | not null | | plain | |
agg_2_2 | bytea | | | | extended | |
symbol | integer | | | | plain | |
chunk_id | integer | | | | plain | |
Indexes:
"_materialized_hypertable_49_bucket_idx" btree (bucket DESC)
"_materialized_hypertable_49_symbol_bucket_idx" btree (symbol, bucket DESC)
Triggers:
ts_insert_blocker BEFORE INSERT ON _timescaledb_internal._materialized_hypertable_49 FOR EACH ROW EXECUTE FUNCTION _timescaledb_internal.insert_blocker()
Child tables: _timescaledb_internal._hyper_49_106_chunk,
_timescaledb_internal._hyper_49_107_chunk
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm always wondering about it and why we don't define a standard because there are several places using test support functions like test.show_columns and test.show_columnsp (test/sql/utils/testsupport.sql).

@konskov konskov force-pushed the create_group_idx_ignored branch 3 times, most recently from 701975e to 93f826b Compare April 21, 2022 12:12
@mfundul
Copy link
Contributor

mfundul commented Apr 21, 2022

You should fix the commit message to show Fixes #4249. Also please update the title of the PR and the body to reflect the new commit message.

@konskov konskov changed the title Stop ignoring option "create_group_indexes" on continuous aggregate c… Fix option "timescaledb.create_group_indexes" Apr 21, 2022
@mfundul
Copy link
Contributor

mfundul commented Apr 26, 2022

I think we need an extra test case. In the last test case, you create a CAGG with default parameters. We should make sure that ALTER MATERIALIZED VIEW also works. E.g. default => true => false

@konskov
Copy link
Contributor Author

konskov commented Apr 26, 2022

Hmm I'm not sure I understand how to test that. According to our documentation (and implementation) setting the option create_group_indexes is not possible with ALTER MATERIALIZED VIEW, it's an option that can be set only upon cagg creation. If I try and set this option with ALTER I will get the error: ERROR: cannot alter create_group_indexes option for continuous aggregates. So I didn't understand what the extra test case is meant to check?

Previously this option was ignored when creating a
continuous aggregate, even when explicitly set to true.

Fixes timescale#4249
Copy link
Contributor

@fabriziomello fabriziomello left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have an failing test for ALTER MATERIALIZED VIEW trying to change the create_group_indexes?

@konskov
Copy link
Contributor Author

konskov commented Apr 26, 2022

@fabriziomello yes it looks like we do check that in cagg_errors.sql.

@konskov konskov merged commit 687e7c7 into timescale:main Apr 26, 2022
@charislam
Copy link

Thanks @konskov! 💖 Quick clarification so I can improve the docs: is the default behavior true?

@konskov
Copy link
Contributor Author

konskov commented Apr 26, 2022

@charislam yes, the default behavior is true.

svenklemm added a commit to svenklemm/timescaledb that referenced this pull request May 23, 2022
This release adds major new features since the 2.6.1 release.
We deem it moderate priority for upgrading.

This release includes these noteworthy features:

* Optimize continuous aggregate query performance and storage
* The following query clauses and functions can now be used in a continuous
  aggregate: FILTER, DISTINCT, ORDER BY as well as [Ordered-Set Aggregate](https://www.postgresql.org/docs/current/functions-aggregate.html#FUNCTIONS-ORDEREDSET-TABLE)
  and [Hypothetical-Set Aggregate](https://www.postgresql.org/docs/current/functions-aggregate.html#FUNCTIONS-HYPOTHETICAL-TABLE)
* Optimize now() query planning time
* Improve COPY insert performance
* Improve performance of UPDATE/DELETE on PG14 by excluding chunks

This release also includes several bug fixes.

If you are upgrading from a previous version and were using compression
with a non-default collation on a segmentby-column you should recompress
those hypertables.

**Features**
* timescale#4045 Custom origin's support in CAGGs
* timescale#4120 Add logging for retention policy
* timescale#4158 Allow ANALYZE command on a data node directly
* timescale#4169 Add support for chunk exclusion on DELETE to PG14
* timescale#4209 Add support for chunk exclusion on UPDATE to PG14
* timescale#4269 Continuous Aggregates finals form
* timescale#4301 Add support for bulk inserts in COPY operator
* timescale#4311 Support non-superuser move chunk operations
* timescale#4330 Add GUC "bgw_launcher_poll_time"
* timescale#4340 Enable now() usage in plan-time chunk exclusion

**Bugfixes**
* timescale#3899 Fix segfault in Continuous Aggregates
* timescale#4225 Fix TRUNCATE error as non-owner on hypertable
* timescale#4236 Fix potential wrong order of results for compressed hypertable with a non-default collation
* timescale#4249 Fix option "timescaledb.create_group_indexes"
* timescale#4251 Fix INSERT into compressed chunks with dropped columns
* timescale#4255 Fix option "timescaledb.create_group_indexes"
* timescale#4259 Fix logic bug in extension update script
* timescale#4269 Fix bad Continuous Aggregate view definition reported in timescale#4233
* timescale#4289 Support moving compressed chunks between data nodes
* timescale#4300 Fix refresh window cap for cagg refresh policy
* timescale#4315 Fix memory leak in scheduler
* timescale#4323 Remove printouts from signal handlers
* timescale#4342 Fix move chunk cleanup logic
* timescale#4349 Fix crashes in functions using AlterTableInternal
* timescale#4358 Fix crash and other issues in telemetry reporter

**Thanks**
* @abrownsword for reporting a bug in the telemetry reporter and testing the fix
* @jsoref for fixing various misspellings in code, comments and documentation
* @yalon for reporting an error with ALTER TABLE RENAME on distributed hypertables
* @zhuizhuhaomeng for reporting and fixing a memory leak in our scheduler
@svenklemm svenklemm mentioned this pull request May 23, 2022
svenklemm added a commit that referenced this pull request May 23, 2022
This release adds major new features since the 2.6.1 release.
We deem it moderate priority for upgrading.

This release includes these noteworthy features:

* Optimize continuous aggregate query performance and storage
* The following query clauses and functions can now be used in a continuous
  aggregate: FILTER, DISTINCT, ORDER BY as well as [Ordered-Set Aggregate](https://www.postgresql.org/docs/current/functions-aggregate.html#FUNCTIONS-ORDEREDSET-TABLE)
  and [Hypothetical-Set Aggregate](https://www.postgresql.org/docs/current/functions-aggregate.html#FUNCTIONS-HYPOTHETICAL-TABLE)
* Optimize now() query planning time
* Improve COPY insert performance
* Improve performance of UPDATE/DELETE on PG14 by excluding chunks

This release also includes several bug fixes.

If you are upgrading from a previous version and were using compression
with a non-default collation on a segmentby-column you should recompress
those hypertables.

**Features**
* #4045 Custom origin's support in CAGGs
* #4120 Add logging for retention policy
* #4158 Allow ANALYZE command on a data node directly
* #4169 Add support for chunk exclusion on DELETE to PG14
* #4209 Add support for chunk exclusion on UPDATE to PG14
* #4269 Continuous Aggregates finals form
* #4301 Add support for bulk inserts in COPY operator
* #4311 Support non-superuser move chunk operations
* #4330 Add GUC "bgw_launcher_poll_time"
* #4340 Enable now() usage in plan-time chunk exclusion

**Bugfixes**
* #3899 Fix segfault in Continuous Aggregates
* #4225 Fix TRUNCATE error as non-owner on hypertable
* #4236 Fix potential wrong order of results for compressed hypertable with a non-default collation
* #4249 Fix option "timescaledb.create_group_indexes"
* #4251 Fix INSERT into compressed chunks with dropped columns
* #4255 Fix option "timescaledb.create_group_indexes"
* #4259 Fix logic bug in extension update script
* #4269 Fix bad Continuous Aggregate view definition reported in #4233
* #4289 Support moving compressed chunks between data nodes
* #4300 Fix refresh window cap for cagg refresh policy
* #4315 Fix memory leak in scheduler
* #4323 Remove printouts from signal handlers
* #4342 Fix move chunk cleanup logic
* #4349 Fix crashes in functions using AlterTableInternal
* #4358 Fix crash and other issues in telemetry reporter

**Thanks**
* @abrownsword for reporting a bug in the telemetry reporter and testing the fix
* @jsoref for fixing various misspellings in code, comments and documentation
* @yalon for reporting an error with ALTER TABLE RENAME on distributed hypertables
* @zhuizhuhaomeng for reporting and fixing a memory leak in our scheduler
mfundul pushed a commit to mfundul/timescaledb that referenced this pull request May 24, 2022
This release adds major new features since the 2.6.1 release.
We deem it moderate priority for upgrading.

This release includes these noteworthy features:

* Optimize continuous aggregate query performance and storage
* The following query clauses and functions can now be used in a continuous
  aggregate: FILTER, DISTINCT, ORDER BY as well as [Ordered-Set Aggregate](https://www.postgresql.org/docs/current/functions-aggregate.html#FUNCTIONS-ORDEREDSET-TABLE)
  and [Hypothetical-Set Aggregate](https://www.postgresql.org/docs/current/functions-aggregate.html#FUNCTIONS-HYPOTHETICAL-TABLE)
* Optimize now() query planning time
* Improve COPY insert performance
* Improve performance of UPDATE/DELETE on PG14 by excluding chunks

This release also includes several bug fixes.

If you are upgrading from a previous version and were using compression
with a non-default collation on a segmentby-column you should recompress
those hypertables.

**Features**
* timescale#4045 Custom origin's support in CAGGs
* timescale#4120 Add logging for retention policy
* timescale#4158 Allow ANALYZE command on a data node directly
* timescale#4169 Add support for chunk exclusion on DELETE to PG14
* timescale#4209 Add support for chunk exclusion on UPDATE to PG14
* timescale#4269 Continuous Aggregates finals form
* timescale#4301 Add support for bulk inserts in COPY operator
* timescale#4311 Support non-superuser move chunk operations
* timescale#4330 Add GUC "bgw_launcher_poll_time"
* timescale#4340 Enable now() usage in plan-time chunk exclusion

**Bugfixes**
* timescale#3899 Fix segfault in Continuous Aggregates
* timescale#4225 Fix TRUNCATE error as non-owner on hypertable
* timescale#4236 Fix potential wrong order of results for compressed hypertable with a non-default collation
* timescale#4249 Fix option "timescaledb.create_group_indexes"
* timescale#4251 Fix INSERT into compressed chunks with dropped columns
* timescale#4255 Fix option "timescaledb.create_group_indexes"
* timescale#4259 Fix logic bug in extension update script
* timescale#4269 Fix bad Continuous Aggregate view definition reported in timescale#4233
* timescale#4289 Support moving compressed chunks between data nodes
* timescale#4300 Fix refresh window cap for cagg refresh policy
* timescale#4315 Fix memory leak in scheduler
* timescale#4323 Remove printouts from signal handlers
* timescale#4342 Fix move chunk cleanup logic
* timescale#4349 Fix crashes in functions using AlterTableInternal
* timescale#4358 Fix crash and other issues in telemetry reporter

**Thanks**
* @abrownsword for reporting a bug in the telemetry reporter and testing the fix
* @jsoref for fixing various misspellings in code, comments and documentation
* @yalon for reporting an error with ALTER TABLE RENAME on distributed hypertables
* @zhuizhuhaomeng for reporting and fixing a memory leak in our scheduler
@mfundul mfundul mentioned this pull request May 24, 2022
mfundul pushed a commit that referenced this pull request May 24, 2022
This release adds major new features since the 2.6.1 release.
We deem it moderate priority for upgrading.

This release includes these noteworthy features:

* Optimize continuous aggregate query performance and storage
* The following query clauses and functions can now be used in a continuous
  aggregate: FILTER, DISTINCT, ORDER BY as well as [Ordered-Set Aggregate](https://www.postgresql.org/docs/current/functions-aggregate.html#FUNCTIONS-ORDEREDSET-TABLE)
  and [Hypothetical-Set Aggregate](https://www.postgresql.org/docs/current/functions-aggregate.html#FUNCTIONS-HYPOTHETICAL-TABLE)
* Optimize now() query planning time
* Improve COPY insert performance
* Improve performance of UPDATE/DELETE on PG14 by excluding chunks

This release also includes several bug fixes.

If you are upgrading from a previous version and were using compression
with a non-default collation on a segmentby-column you should recompress
those hypertables.

**Features**
* #4045 Custom origin's support in CAGGs
* #4120 Add logging for retention policy
* #4158 Allow ANALYZE command on a data node directly
* #4169 Add support for chunk exclusion on DELETE to PG14
* #4209 Add support for chunk exclusion on UPDATE to PG14
* #4269 Continuous Aggregates finals form
* #4301 Add support for bulk inserts in COPY operator
* #4311 Support non-superuser move chunk operations
* #4330 Add GUC "bgw_launcher_poll_time"
* #4340 Enable now() usage in plan-time chunk exclusion

**Bugfixes**
* #3899 Fix segfault in Continuous Aggregates
* #4225 Fix TRUNCATE error as non-owner on hypertable
* #4236 Fix potential wrong order of results for compressed hypertable with a non-default collation
* #4249 Fix option "timescaledb.create_group_indexes"
* #4251 Fix INSERT into compressed chunks with dropped columns
* #4255 Fix option "timescaledb.create_group_indexes"
* #4259 Fix logic bug in extension update script
* #4269 Fix bad Continuous Aggregate view definition reported in #4233
* #4289 Support moving compressed chunks between data nodes
* #4300 Fix refresh window cap for cagg refresh policy
* #4315 Fix memory leak in scheduler
* #4323 Remove printouts from signal handlers
* #4342 Fix move chunk cleanup logic
* #4349 Fix crashes in functions using AlterTableInternal
* #4358 Fix crash and other issues in telemetry reporter

**Thanks**
* @abrownsword for reporting a bug in the telemetry reporter and testing the fix
* @jsoref for fixing various misspellings in code, comments and documentation
* @yalon for reporting an error with ALTER TABLE RENAME on distributed hypertables
* @zhuizhuhaomeng for reporting and fixing a memory leak in our scheduler
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Bug]: create_group_indexes option in caggs does not work properly
5 participants