Optimize importer memory footprint #2103

xin-hedera · 2021-06-10T01:40:09Z

Detailed description:

Move persistBytes to downloader
Move keepFiles to downloader
Change saved stream files & signature files layout to match cloud bucket
Clear StreamFile bytes and items in post parse
Use DirectChannel when queueCapacity <= 0
Use Java config to conditionally set the integration service endpoint's poller
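
As a rough illustration of the "clear StreamFile bytes and items in post parse" item, here is a minimal sketch. `StreamFileSketch` and `PostParseSketch` are hypothetical stand-ins, not the importer's real classes; the only behavior taken from this PR is that the payload is released once a file is skipped or successfully parsed.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a downloaded stream file; not the importer's real StreamFile.
class StreamFileSketch {
    private byte[] bytes;
    private List<String> items;

    StreamFileSketch(byte[] bytes, List<String> items) {
        this.bytes = bytes;
        this.items = new ArrayList<>(items);
    }

    byte[] getBytes() { return bytes; }

    List<String> getItems() { return items; }

    // Drop the large payload so it can be garbage collected while the object itself stays reachable.
    void clear() {
        bytes = null;
        items = null;
    }
}

class PostParseSketch {
    // Returns the number of items handled; releases the payload when the file is
    // skipped or parsed successfully.
    static int parse(StreamFileSketch file, boolean alreadyProcessed) {
        if (alreadyProcessed) {
            file.clear(); // skipped file: nothing to do, release the payload
            return 0;
        }
        int count = file.getItems().size(); // stand-in for the real parsing work
        file.clear(); // successfully parsed: release the payload
        return count;
    }
}
```

In this sketch a file whose parsing throws is left untouched, which is an assumption about the failure path rather than something stated in the description.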

Which issue(s) this PR fixes:
Fixes #2087

Special notes for your reviewer:

Checklist

Documentation added
Tests updated

- clear bytes and items if stream file is skipped or successfully parsed Signed-off-by: Xin Li <xin.li@hedera.com>

codecov · 2021-06-10T01:43:08Z

Codecov Report

Merging #2103 (b1e5c94) into master (126751e) will decrease coverage by 13.17%.
The diff coverage is 78.26%.

❗ Current head b1e5c94 differs from pull request most recent head 708b918. Consider uploading reports for the commit 708b918 to get more accurate results

@@              Coverage Diff              @@
##             master    #2103       +/-   ##
=============================================
- Coverage     87.05%   73.87%   -13.18%     
+ Complexity     1744      242     -1502     
=============================================
  Files           315      152      -163     
  Lines          7731     4368     -3363     
  Branches        740      439      -301     
=============================================
- Hits           6730     3227     -3503     
- Misses          772     1085      +313     
+ Partials        229       56      -173

Impacted Files	Coverage Δ
...ensus/ConsensusCreateTopicTransactionSupplier.java	`0.00% <ø> (ø)`
...sus/ConsensusSubmitMessageTransactionSupplier.java	`0.00% <0.00%> (ø)`
...ensus/ConsensusUpdateTopicTransactionSupplier.java	`0.00% <ø> (ø)`
...er/schedule/ScheduleCreateTransactionSupplier.java	`0.00% <ø> (ø)`
...supplier/token/TokenCreateTransactionSupplier.java	`0.00% <ø> (ø)`
...supplier/token/TokenUpdateTransactionSupplier.java	`0.00% <ø> (ø)`
hedera-mirror-rosetta/app/domain/types/account.go	`100.00% <ø> (ø)`
...edera-mirror-rosetta/app/domain/types/operation.go	`100.00% <ø> (ø)`
...-mirror-rosetta/app/persistence/account/account.go	`93.33% <ø> (+5.09%)`	⬆️
...rosetta/app/persistence/addressbook/entry/entry.go	`100.00% <ø> (ø)`
... and 272 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 26df5ba...708b918.

steven-sheehy · 2021-06-10T14:23:53Z

DirectChannel doesn't fit the requirement:
no concept of queueCapacity
subscriber of a DirectChannel is called in the publisher's thread

But that's exactly what I stated in the call and it's what we want for balances to reduce memory usage by 66% or more. My suggestion is this:

    @Bean(CHANNEL_BALANCE)
    MessageChannel channelBalance(BalanceParserProperties properties) {
        if (properties.getQueueCapacity() <= 0) {
            return MessageChannels.direct().get();
        } else {
            return MessageChannels.queue(properties.getQueueCapacity()).get();
        }
    }
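
To make the trade-off concrete, here is a plain-JDK analogy of the two hand-off styles. It does not use Spring Integration; `HandOffSketch` and the thread name `parser-thread` are made up for illustration. The direct variant runs the subscriber on the publisher's thread with no buffer, while the queued variant hands the file to a separate consumer thread through a bounded queue.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Consumer;

class HandOffSketch {
    // Direct hand-off: the subscriber is invoked on the publisher's thread, nothing is buffered.
    static String direct(String file) {
        String[] handlerThread = new String[1];
        Consumer<String> subscriber = f -> handlerThread[0] = Thread.currentThread().getName();
        subscriber.accept(file);
        return handlerThread[0];
    }

    // Queued hand-off: the publisher enqueues and a separate thread dequeues and handles the file.
    static String queued(String file, int capacity) {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(capacity);
        String[] handlerThread = new String[1];
        Thread consumer = new Thread(() -> {
            try {
                queue.take();
                handlerThread[0] = Thread.currentThread().getName();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "parser-thread");
        consumer.start();
        try {
            queue.put(file);
            consumer.join(); // join makes the consumer's write visible to this thread
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return handlerThread[0];
    }
}
```

With the queued variant, up to `capacity` files can sit in the buffer in addition to the one the publisher holds and the one being handled, which is where the extra memory goes.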

xin-hedera · 2021-06-10T14:32:45Z

Then RendezvousChannel (https://docs.spring.io/spring-integration/api/org/springframework/integration/channel/RendezvousChannel.html) is a better fit.

steven-sheehy · 2021-06-10T14:35:04Z

Sure, RendezvousChannel might do the trick. Does it still allow 2 total files in memory?

xin-hedera · 2021-06-10T14:37:55Z

Yes, 2 is the best we can get: one is held in the downloader and one is being processed in the parser.
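
The two-file bound can be sketched with a `java.util.concurrent.SynchronousQueue`, which is the zero-capacity hand-off that RendezvousChannel is documented to delegate to. This is a plain-JDK illustration under that assumption, not the importer's code; `RendezvousSketch` and the 1 KiB "files" are hypothetical.

```java
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.atomic.AtomicInteger;

class RendezvousSketch {
    // Runs a downloader thread and a parser loop; returns the peak number of files alive at once.
    static int maxFilesInMemory(int fileCount) {
        SynchronousQueue<byte[]> channel = new SynchronousQueue<>();
        AtomicInteger live = new AtomicInteger();
        AtomicInteger maxLive = new AtomicInteger();

        Thread downloader = new Thread(() -> {
            try {
                for (int i = 0; i < fileCount; i++) {
                    byte[] file = new byte[1024]; // "download" a file
                    maxLive.accumulateAndGet(live.incrementAndGet(), Math::max);
                    channel.put(file); // blocks until the parser is ready to take it
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        downloader.start();

        try {
            for (int i = 0; i < fileCount; i++) {
                byte[] file = channel.take();
                if (file.length == 0) {
                    throw new IllegalStateException("empty file"); // stand-in for parsing work
                }
                live.decrementAndGet(); // file fully processed and released
            }
            downloader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return maxLive.get();
    }
}
```

Because `put` only returns once the parser has taken the previous file, the downloader can be at most one file ahead, so the peak never exceeds two.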

Nana-EC

Looking good aside from the MessageChannel discussion. One suggestion, plus a thought on whether the test coverage can be expanded.

Nana-EC · 2021-06-10T15:41:20Z

...or-importer/src/test/java/com/hedera/mirror/importer/parser/record/RecordFileParserTest.java

@@ -225,6 +213,19 @@ private void assertFilesArchived(String... fileNames) throws Exception {
                .contains(fileNames);
    }

+    private void assertPostParseState(RecordFile recordFile, boolean success,


Could move this method into an abstract class for the 3 test classes to share since they are exactly the same

Will do. Potentially I can combine most, if not all, of the test cases for event and record. Note that the balance parser test is an integration test while the other two are mocked tests. Perhaps I should unify the three.
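
A minimal sketch of what sharing the helper through an abstract base class could look like. All names here (`ParsedFile`, `AbstractParserTestSketch`, the two subclasses) are hypothetical, the real method signature is truncated in the diff above, and the sketch assumes that a failed parse leaves the payload in place.

```java
import java.util.List;

// Hypothetical minimal view of a stream file; the real domain classes are not shown in this thread.
interface ParsedFile {
    byte[] getBytes();

    List<?> getItems();
}

abstract class AbstractParserTestSketch<T extends ParsedFile> {
    // Shared assertion: after a successful parse the payload should have been released.
    // Assumption: after a failed parse the payload is still present.
    protected void assertPostParseState(T file, boolean success) {
        boolean cleared = file.getBytes() == null && file.getItems() == null;
        if (cleared != success) {
            throw new AssertionError("expected cleared=" + success + " but was " + cleared);
        }
    }
}

// Each concrete test class inherits the helper instead of copying it.
class RecordParserTestSketch extends AbstractParserTestSketch<ParsedFile> { }

class EventParserTestSketch extends AbstractParserTestSketch<ParsedFile> { }
```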

...ror-importer/src/test/java/com/hedera/mirror/importer/downloader/AbstractDownloaderTest.java

...irror-importer/src/main/java/com/hedera/mirror/importer/parser/AbstractStreamFileParser.java

docs/configuration.md

- refactor parser test classes Signed-off-by: Xin Li <xin.li@hedera.com>

xin-hedera · 2021-06-11T17:34:01Z

...-mirror-importer/src/main/java/com/hedera/mirror/importer/config/MessagingConfiguration.java

    @Bean(CHANNEL_BALANCE)
    MessageChannel channelBalance(BalanceParserProperties properties) {
-        return MessageChannels.queue(properties.getQueueCapacity()).get();
+        return channel(properties);
    }

    @Bean(CHANNEL_EVENT)
    MessageChannel channelEvent(EventParserProperties properties) {
-        return MessageChannels.queue(properties.getQueueCapacity()).get();
+        return channel(properties);
    }

    @Bean(CHANNEL_RECORD)
    MessageChannel channelRecord(RecordParserProperties properties) {
-        return MessageChannels.queue(properties.getQueueCapacity()).get();
+        return channel(properties);
+    }
+
+    @Bean(INTEGRATION_FLOW_BALANCE)
+    IntegrationFlow integrationFlowBalance(AccountBalanceFileParser parser) {
+        return integrationFlow(parser);
+    }
+
+    @Bean(INTEGRATION_FLOW_EVENT)
+    IntegrationFlow integrationFlowEvent(EventFileParser parser) {
+        return integrationFlow(parser);
+    }
+
+    @Bean(INTEGRATION_FLOW_RECORD)
+    @ConditionalOnRecordParser
+    IntegrationFlow integrationFlowRecord(RecordFileParser parser) {
+        return integrationFlow(parser);


I have to repeat these for the different beans and names.

I didn't want to move the channel bean and integration flow bean into the StreamFileParser classes because I prefer to keep configuration separate from those classes as much as I can.

xin-hedera · 2021-06-11T17:36:07Z

...-mirror-importer/src/main/java/com/hedera/mirror/importer/config/MessagingConfiguration.java

+    }
+
+    @Bean(INTEGRATION_FLOW_RECORD)
+    @ConditionalOnRecordParser


Not sure why we only have @ConditionalOnRecordParser on RecordFileParser but not on the other two parsers. Anyway, to avoid an autowire error when parser.record.enabled is false, I also added the annotation here.

I believe @ConditionalOnRecordParser was added mainly to help out the pubs flow, which should only check whether to run when the record parser is enabled.
The balance and event parsers don't have a sub flow, so it didn't apply.
Although I do believe we wanted to check these annotations at some point for easier coordination.

Nana-EC

LGTM.
A debug log recommendation if anything

...rter/src/main/java/com/hedera/mirror/importer/reader/balance/CompositeBalanceFileReader.java

docs/configuration.md

...-mirror-importer/src/main/java/com/hedera/mirror/importer/config/MessagingConfiguration.java

docs/configuration.md

...irror-importer/src/main/java/com/hedera/mirror/importer/parser/AbstractParserProperties.java

...porter/src/main/java/com/hedera/mirror/importer/parser/balance/AccountBalanceFileParser.java

...rter/src/main/java/com/hedera/mirror/importer/reader/balance/CompositeBalanceFileReader.java

...mirror-importer/src/main/java/com/hedera/mirror/importer/parser/record/RecordFileParser.java

...a-mirror-importer/src/main/java/com/hedera/mirror/importer/parser/event/EventFileParser.java

steven-sheehy

LGTM

ijungmann

One small question, but LGTM

ijungmann · 2021-06-14T17:36:54Z

...-mirror-importer/src/main/java/com/hedera/mirror/importer/config/MessagingConfiguration.java

                .get();
    }
+
+    private MessageChannel channel(ParserProperties properties) {
+        if (properties.getQueueCapacity() <= 0) {


If we have validation to ensure the minimum is 0, do we need to check for <? Could just be ==

Because we should write code defensively in case that other class changes in the future and it's the same number of characters either way.
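
The defensive argument can be illustrated with a plain-JDK sketch. `ChannelSelectionSketch` is hypothetical and uses `ArrayBlockingQueue` as a stand-in for a queue-backed channel; the point is only that `<= 0` routes an unexpected negative capacity to the unbuffered path instead of into a constructor that rejects it.

```java
import java.util.Optional;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

class ChannelSelectionSketch {
    // Returns a bounded buffer for positive capacities, or empty to signal a direct hand-off.
    static Optional<BlockingQueue<String>> bufferFor(int queueCapacity) {
        if (queueCapacity <= 0) {
            // Covers 0 and, defensively, any negative value that slips past validation.
            return Optional.empty();
        }
        return Optional.of(new ArrayBlockingQueue<>(queueCapacity));
    }
}
```

With `== 0` instead, a capacity of -1 would fall through to `new ArrayBlockingQueue<>(-1)`, which throws `IllegalArgumentException`.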

Nana-EC

LGTM

sonarcloud · 2021-06-14T21:02:52Z

SonarCloud Quality Gate failed.

0 Bugs
0 Vulnerabilities
0 Security Hotspots
0 Code Smells

0.0% Coverage
3.2% Duplication

Nana-EC

LGTM

- move keep stream file bytes logic to downloader

d804728

- clear bytes and items if stream file is skipped or successfully parsed Signed-off-by: Xin Li <xin.li@hedera.com>

xin-hedera added the bug (Type: Something isn't working), P2, performance, and parser (Area: File parsing) labels Jun 10, 2021

xin-hedera added this to the Mirror 0.36.0 milestone Jun 10, 2021

xin-hedera requested a review from a team June 10, 2021 01:40

xin-hedera self-assigned this Jun 10, 2021

clean up

23df1fb

Signed-off-by: Xin Li <xin.li@hedera.com>

code smell

35e1013

Signed-off-by: Xin Li <xin.li@hedera.com>

Nana-EC requested changes Jun 10, 2021

steven-sheehy requested changes Jun 10, 2021

steven-sheehy added the breaking label (Contains a breaking change that warrants mention in the release notes) Jun 10, 2021

xin-hedera added 2 commits June 10, 2021 21:09

use DirectChannel when queueCapacity is 0

6fe84a4

Signed-off-by: Xin Li <xin.li@hedera.com>

- move keepFiles to downloader

bbf6cf2

- refactor parser test classes Signed-off-by: Xin Li <xin.li@hedera.com>

xin-hedera commented Jun 11, 2021

code smell

b65792a

Signed-off-by: Xin Li <xin.li@hedera.com>

xin-hedera requested review from Nana-EC and steven-sheehy June 11, 2021 18:49

Nana-EC previously approved these changes Jun 12, 2021

steven-sheehy requested changes Jun 14, 2021

address feedback

06d2b09

Signed-off-by: Xin Li <xin.li@hedera.com>

xin-hedera dismissed Nana-EC’s stale review via 06d2b09 June 14, 2021 16:49

code smells

60ec580

Signed-off-by: Xin Li <xin.li@hedera.com>

adjust CompositeBalanceFileReader log message

b235f08

Signed-off-by: Xin Li <xin.li@hedera.com>

xin-hedera requested a review from steven-sheehy June 14, 2021 17:26

steven-sheehy previously approved these changes Jun 14, 2021

ijungmann previously approved these changes Jun 14, 2021

Nana-EC previously approved these changes Jun 14, 2021

fix test case failure

708b918

Signed-off-by: Xin Li <xin.li@hedera.com>

xin-hedera dismissed stale reviews from Nana-EC, ijungmann, and steven-sheehy via 708b918 June 14, 2021 20:59

xin-hedera requested review from steven-sheehy, ijungmann and Nana-EC June 14, 2021 21:08

steven-sheehy approved these changes Jun 14, 2021

ijungmann approved these changes Jun 14, 2021

Nana-EC approved these changes Jun 15, 2021

xin-hedera merged commit 8b0e277 into master Jun 15, 2021

xin-hedera deleted the gke-importer-oom branch June 15, 2021 18:15
