Add chunk exclusion for UPDATE for PG14 #4209

Merged 1 commit on Apr 6, 2022

@svenklemm svenklemm commented Apr 3, 2022

Currently, only IMMUTABLE constraints exclude chunks from an UPDATE plan.
With this patch, STABLE expressions are used to exclude chunks as well.
This is a big performance improvement, as chunks that do not match the
partitioning column constraints no longer have to be scanned for UPDATEs.
Since the code path for UPDATE is different for PG < 14, this patch adds
the optimization only for PG14.

With this patch the plan for UPDATE on hypertables looks like this:

 Custom Scan (HypertableModify) (actual rows=0 loops=1)
   ->  Update on public.metrics_int2 (actual rows=0 loops=1)
         Update on public.metrics_int2 metrics_int2_1
         Update on _timescaledb_internal._hyper_1_1_chunk metrics_int2
         Update on _timescaledb_internal._hyper_1_2_chunk metrics_int2
         Update on _timescaledb_internal._hyper_1_3_chunk metrics_int2
         ->  Custom Scan (ChunkAppend) on public.metrics_int2 (actual rows=0 loops=1)
               Output: '123'::text, metrics_int2.tableoid, metrics_int2.ctid
               Startup Exclusion: true
               Runtime Exclusion: false
               Chunks excluded during startup: 3
               ->  Seq Scan on public.metrics_int2 metrics_int2_1 (actual rows=0 loops=1)
                     Output: metrics_int2_1.tableoid, metrics_int2_1.ctid
                     Filter: (metrics_int2_1."time" = length(version()))
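
To make the "Startup Exclusion" line in the plan easier to follow, here is a minimal conceptual sketch in Python. It is not TimescaleDB's implementation; the `Chunk` class, the `startup_exclude` function, and the chunk ranges are all hypothetical. It assumes each chunk covers a half-open range of the partitioning column, and that a STABLE expression can be evaluated once at executor startup (its value cannot change within a statement), so chunks whose range cannot contain that value are dropped before any scan runs.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Chunk:
    """Hypothetical stand-in for a chunk: covers [start, end) of the partitioning column."""
    name: str
    start: int  # inclusive lower bound
    end: int    # exclusive upper bound


def startup_exclude(
    chunks: List[Chunk], stable_expr: Callable[[], int]
) -> Tuple[List[Chunk], int]:
    """Keep only chunks that could satisfy `partition_col = stable_expr()`.

    The expression is evaluated exactly once, mirroring the idea that a
    STABLE expression is constant for the duration of a statement.
    Returns the surviving chunks and the number of excluded chunks.
    """
    value = stable_expr()  # evaluated once at "executor startup"
    kept = [c for c in chunks if c.start <= value < c.end]
    return kept, len(chunks) - len(kept)


if __name__ == "__main__":
    # Assumed chunk ranges, chosen only for illustration.
    chunks = [
        Chunk("_hyper_1_1_chunk", 0, 10),
        Chunk("_hyper_1_2_chunk", 10, 20),
        Chunk("_hyper_1_3_chunk", 20, 30),
    ]
    # A value outside every chunk range excludes all chunks.
    kept, excluded = startup_exclude(chunks, lambda: 100)
    print("kept:", [c.name for c in kept], "excluded:", excluded)
```

In this model, a value that falls outside every chunk range leaves nothing to scan, which corresponds to the plan reporting all three chunks as excluded during startup.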

@svenklemm svenklemm requested a review from a team as a code owner April 3, 2022 12:50
@svenklemm svenklemm requested review from RafiaSabih and akuzm and removed request for a team April 3, 2022 12:50
@svenklemm svenklemm self-assigned this Apr 3, 2022
codecov bot commented Apr 3, 2022

Codecov Report

Merging #4209 (83dc529) into main (ff945a7) will increase coverage by 0.00%.
The diff coverage is 100.00%.

@@           Coverage Diff           @@
##             main    #4209   +/-   ##
=======================================
  Coverage   90.79%   90.79%           
=======================================
  Files         215      215           
  Lines       39530    39535    +5     
=======================================
+ Hits        35892    35897    +5     
  Misses       3638     3638           

| Impacted Files | Coverage Δ | |
|---|---|---|
| src/nodes/hypertable_modify.c | 70.10% <100.00%> (+0.51%) | ⬆️ |
| src/planner.c | 95.19% <100.00%> (+0.01%) | ⬆️ |
| src/bgw/scheduler.c | 83.04% <0.00%> (-0.88%) | ⬇️ |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update ff945a7...83dc529.

@gayyappan gayyappan left a comment

Changes look good to me. Minor comment.

Review comment on tsl/test/sql/update_exclusion.sql (outdated, resolved)
@svenklemm svenklemm force-pushed the update_exclusion branch 7 times, most recently from e78b4e0 to ffcc4c5 on April 6, 2022 09:59
@svenklemm svenklemm merged commit ae50a53 into timescale:main Apr 6, 2022
svenklemm added a commit to svenklemm/timescaledb that referenced this pull request May 23, 2022
This release adds major new features since the 2.6.1 release.
We deem it moderate priority for upgrading.

This release includes these noteworthy features:

* Optimize continuous aggregate query performance and storage
* The following query clauses and functions can now be used in a continuous
  aggregate: FILTER, DISTINCT, ORDER BY as well as [Ordered-Set Aggregate](https://www.postgresql.org/docs/current/functions-aggregate.html#FUNCTIONS-ORDEREDSET-TABLE)
  and [Hypothetical-Set Aggregate](https://www.postgresql.org/docs/current/functions-aggregate.html#FUNCTIONS-HYPOTHETICAL-TABLE)
* Optimize now() query planning time
* Improve COPY insert performance
* Improve performance of UPDATE/DELETE on PG14 by excluding chunks

This release also includes several bug fixes.

If you are upgrading from a previous version and were using compression
with a non-default collation on a segmentby-column you should recompress
those hypertables.

**Features**
* timescale#4045 Custom origin's support in CAGGs
* timescale#4120 Add logging for retention policy
* timescale#4158 Allow ANALYZE command on a data node directly
* timescale#4169 Add support for chunk exclusion on DELETE to PG14
* timescale#4209 Add support for chunk exclusion on UPDATE to PG14
* timescale#4269 Continuous Aggregates finals form
* timescale#4301 Add support for bulk inserts in COPY operator
* timescale#4311 Support non-superuser move chunk operations
* timescale#4330 Add GUC "bgw_launcher_poll_time"
* timescale#4340 Enable now() usage in plan-time chunk exclusion

**Bugfixes**
* timescale#3899 Fix segfault in Continuous Aggregates
* timescale#4225 Fix TRUNCATE error as non-owner on hypertable
* timescale#4236 Fix potential wrong order of results for compressed hypertable with a non-default collation
* timescale#4249 Fix option "timescaledb.create_group_indexes"
* timescale#4251 Fix INSERT into compressed chunks with dropped columns
* timescale#4255 Fix option "timescaledb.create_group_indexes"
* timescale#4259 Fix logic bug in extension update script
* timescale#4269 Fix bad Continuous Aggregate view definition reported in timescale#4233
* timescale#4289 Support moving compressed chunks between data nodes
* timescale#4300 Fix refresh window cap for cagg refresh policy
* timescale#4315 Fix memory leak in scheduler
* timescale#4323 Remove printouts from signal handlers
* timescale#4342 Fix move chunk cleanup logic
* timescale#4349 Fix crashes in functions using AlterTableInternal
* timescale#4358 Fix crash and other issues in telemetry reporter

**Thanks**
* @abrownsword for reporting a bug in the telemetry reporter and testing the fix
* @jsoref for fixing various misspellings in code, comments and documentation
* @yalon for reporting an error with ALTER TABLE RENAME on distributed hypertables
* @zhuizhuhaomeng for reporting and fixing a memory leak in our scheduler
@svenklemm svenklemm mentioned this pull request May 23, 2022
svenklemm added a commit that referenced this pull request May 23, 2022
mfundul pushed a commit to mfundul/timescaledb that referenced this pull request May 24, 2022
@mfundul mfundul mentioned this pull request May 24, 2022
mfundul pushed a commit that referenced this pull request May 24, 2022