Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Autovacuum crypto_transfer, token_transfer, and transaction #1825

Merged
merged 7 commits into from
Apr 19, 2021

Conversation

xin-hedera
Copy link
Collaborator

@xin-hedera xin-hedera commented Apr 7, 2021

Detailed description:

  • configure more frequent anti-wraparound autovacuum for crypto_transfer, token_transfer, and transaction for pg 9.6
  • configure more frequent insert-only autovacuum for crypto_transfer, token_transfer, and transaction for pg 13
  • upgrade to timescale/timescaledb-ha:pg13-ts2.2-latest
  • increase shm size for timescaledb container

Which issue(s) this PR fixes:
Fixes #1780
Fixes #1849

Special notes for your reviewer:

Criteria for the tables to enable more frequent autovacuum:

  • insert-only
  • there are sql queries on the table which can benefit from index-only scan

For pg 9.6

Reference doc: https://www.postgresql.org/docs/9.6/routine-vacuuming.html section 24.1.5

  • autovacuum_freeze_max_age can't be set lower than 100,000, mainnet postgresql tx id grows at around 60,000/day
  • setting autovacuum_freeze_table_age is needed otherwise postgresql will use the global parameter vacuum_freeze_table_age which is 150,000,000. then the autovacuum ends up not freezing any pages and running forever since the table age won't change
  • autovacuum_freeze_min_age = 0 is aggressive, freezes all pages at the time of acutovacuum scan
  • log_autovacuum_min_duration = 0 turns on logging of autovacuum of the table

Concerns:

  • anti-wraparound autovacuum scan will lock the table from DDL, so this will be a problem for ops doing upgrades

For timescaledb with pg13

pg 13 introduces a new method to configure autovacuum for insert-only tables:

  1. autovacuum_vacuum_insert_threshold, default to 1000
  2. autovacuum_vacuum_insert_scale_factor, default to 0.2

autovacuum will run when the number of inserts exceeds autovacuum_vacuum_insert_threshold + reltuples * autovacuum_vacuum_insert_scale_factor for a table since last autovacuum. reltuples is the approximate table size.

The values set for table crypto_transfer and transaction are based on the current daily transaction volume in mainnet. token_transfer has very few transactions (less than 6k in the past week) so the value will not have a real impact on performance and we should adjust it when the traffic grows.

For timescaledb, autovacuum will only scan the current chunk, which is more efficient.

Checklist

  • Documentation added
  • Tests updated

…n_transfer

Signed-off-by: Xin Li <xin.li@hedera.com>
@xin-hedera xin-hedera added bug Type: Something isn't working P1 rest Area: REST API database Area: Database labels Apr 7, 2021
@xin-hedera xin-hedera added this to the Mirror 0.32.0 milestone Apr 7, 2021
@xin-hedera xin-hedera requested a review from a team April 7, 2021 14:49
@codecov
Copy link

codecov bot commented Apr 7, 2021

Codecov Report

Merging #1825 (6266490) into master (8125fd6) will decrease coverage by 1.96%.
The diff coverage is 8.41%.

❗ Current head 6266490 differs from pull request most recent head 5dce57e. Consider uploading reports for the commit 5dce57e to get more accurate results
Impacted file tree graph

@@             Coverage Diff              @@
##             master    #1825      +/-   ##
============================================
- Coverage     87.37%   85.41%   -1.97%     
+ Complexity     1743      378    -1365     
============================================
  Files           315      135     -180     
  Lines          7691     3859    -3832     
  Branches        735      379     -356     
============================================
- Hits           6720     3296    -3424     
+ Misses          744      509     -235     
+ Partials        227       54     -173     
Impacted Files Coverage Δ Complexity Δ
...ra/mirror/monitor/config/MonitorConfiguration.java 0.00% <0.00%> (ø) 0.00 <0.00> (ø)
.../hedera/mirror/monitor/publish/PublishMetrics.java 0.00% <0.00%> (ø) 0.00 <0.00> (ø)
...a/mirror/monitor/publish/TransactionPublisher.java 1.13% <0.00%> (-0.43%) 1.00 <0.00> (ø)
...dera/mirror/monitor/publish/PublishProperties.java 78.57% <33.33%> (-4.77%) 4.00 <0.00> (ø)
...or/monitor/expression/ExpressionConverterImpl.java 88.88% <100.00%> (+0.17%) 10.00 <0.00> (ø)
...or/generator/ConfigurableTransactionGenerator.java 100.00% <100.00%> (ø) 13.00 <0.00> (ø)
...a/mirror/monitor/generator/ScenarioProperties.java 95.23% <100.00%> (+0.50%) 14.00 <2.00> (+1.00)
.../hedera/mirror/monitor/publish/PublishRequest.java 90.00% <100.00%> (+1.11%) 9.00 <1.00> (+1.00)
transactions.js 98.11% <0.00%> (ø) 0.00% <0.00%> (ø%)
...orter/config/CredentialsProviderConfiguration.java
... and 174 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 896b905...5dce57e. Read the comment docs.

@xin-hedera xin-hedera self-assigned this Apr 7, 2021
…ount_id, consensus_ns)

Signed-off-by: Xin Li <xin.li@hedera.com>
@xin-hedera xin-hedera marked this pull request as ready for review April 9, 2021 17:34
Copy link
Member

@steven-sheehy steven-sheehy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Do we need to consider doing account_balance as well?

@xin-hedera
Copy link
Collaborator Author

Looks good. Do we need to consider doing account_balance as well?

Checked account_balance table and token_balance table, the queries on the two tables either just select the consensus_timestamp (look for the latest consensus_timestamp or the latest at certain time in the past) or select the full row.

The consensus_timestamp query will not benefit much from index only scan since we just look for one consensus_timestamp and at most pg just has to spend an extra trip to make sure the corresponding index's data in the page is not stale.

The other queries always select the balance and balance is not in any indexes, so they are not index-only scans.

So vacuuming these tables more frequently won't speed up our queries thus we shouldn't do it.

Copy link
Contributor

@Nana-EC Nana-EC left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, nit suggestion based on your preference.
Also is transactions not a huge insert only table.
Would some queries such as account and time range benefit from a vacuum more often?

-- autovacuum insert-only tables more frequently to ensure most pages are visible for index-only scans
-- prior to postgresql 13, this can be achieved by configuring a more aggressive anti-wraparound autovacuum
-------------------

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit:

\set autovacuum_freeze_max_age '100000'
\set autovacuum_freeze_table_age '100000'

Or if there's value to configurability you can add it as property in the application.yaml under spring.flyway.placeholders

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changed both to repeatable migrations and added properties to spring.flyway.placeholders

steven-sheehy
steven-sheehy previously approved these changes Apr 10, 2021
@steven-sheehy
Copy link
Member

We'll be bumping our TimescaleDB chart to use PostgreSQL 13 soon and our timescale.com SaaS uses PostgreSQL 13. Can you update the schema v2 stuff to use the newer pg13 approach you mention?

@xin-hedera
Copy link
Collaborator Author

We'll be bumping our TimescaleDB chart to use PostgreSQL 13 soon and our timescale.com SaaS uses PostgreSQL 13. Can you update the schema v2 stuff to use the newer pg13 approach you mention?

definitely. the pg13 new insert-only table autovacuum mechanism will be more efficient.

ijungmann
ijungmann previously approved these changes Apr 12, 2021
@steven-sheehy steven-sheehy deleted the autovacuum-insert-only-tables branch April 14, 2021 03:43
@steven-sheehy steven-sheehy restored the autovacuum-insert-only-tables branch April 14, 2021 03:52
@steven-sheehy steven-sheehy reopened this Apr 14, 2021
Signed-off-by: Xin Li <xin.li@hedera.com>
Signed-off-by: Xin Li <xin.li@hedera.com>
…Mount = true

- change db migration to be repeatable

Signed-off-by: Xin Li <xin.li@hedera.com>
@xin-hedera xin-hedera dismissed stale reviews from ijungmann and steven-sheehy via 401e099 April 14, 2021 20:55
@xin-hedera xin-hedera changed the title Autovacuum crypto_transfer and token_transfer Autovacuum crypto_transfer, token_transfer, and transaction Apr 14, 2021
Signed-off-by: Xin Li <xin.li@hedera.com>
@xin-hedera xin-hedera requested a review from Nana-EC April 14, 2021 21:30
Nana-EC
Nana-EC previously approved these changes Apr 14, 2021
Copy link
Contributor

@Nana-EC Nana-EC left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

ijungmann
ijungmann previously approved these changes Apr 16, 2021
Signed-off-by: Xin Li <xin.li@hedera.com>
@xin-hedera xin-hedera dismissed stale reviews from ijungmann and Nana-EC via 5dce57e April 16, 2021 16:15
@sonarcloud
Copy link

sonarcloud bot commented Apr 16, 2021

Copy link
Member

@steven-sheehy steven-sheehy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@Nana-EC Nana-EC left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@xin-hedera xin-hedera merged commit d0addee into master Apr 19, 2021
@xin-hedera xin-hedera deleted the autovacuum-insert-only-tables branch April 19, 2021 15:17
ijungmann pushed a commit that referenced this pull request May 18, 2021
- configure more frequent anti-wraparound autovacuum for crypto_transfer, token_transfer, and transaction for pg 9.6
- configure more frequent insert-only autovacuum for crypto_transfer, token_transfer, and transaction for pg 13
- upgrade to timescale/timescaledb-ha:pg13-ts2.2-latest
- increase shm size for timescaledb container

Signed-off-by: Xin Li <xin.li@hedera.com>
Signed-off-by: Ian Jungmann <ian.jungmann@hedera.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Type: Something isn't working database Area: Database P1 rest Area: REST API
Projects
None yet
4 participants