Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Dynamic Protocol State] TODOs and refactoring, part 2 #5080

Conversation

durkmurder
Copy link
Member

Context

In this PR I am aiming to resolve two mid priority TODOs from the list.

@codecov-commenter
Copy link

codecov-commenter commented Nov 29, 2023

Codecov Report

Attention: 16 lines in your changes are missing coverage. Please review.

Comparison is base (7587d3a) 56.37% compared to head (66b1bab) 57.57%.

Files Patch % Lines
storage/badger/protocol_state.go 56.25% 11 Missing and 3 partials ⚠️
cmd/scaffold.go 0.00% 2 Missing ⚠️
Additional details and impacted files
@@                        Coverage Diff                         @@
##           feature/dynamic-protocol-state    #5080      +/-   ##
==================================================================
+ Coverage                           56.37%   57.57%   +1.19%     
==================================================================
  Files                                 987      745     -242     
  Lines                               92682    72117   -20565     
==================================================================
- Hits                                52254    41519   -10735     
+ Misses                              36570    27436    -9134     
+ Partials                             3858     3162     -696     
Flag Coverage Δ
unittests 57.57% <55.55%> (+1.19%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

if err != nil {
return nil, fmt.Errorf("could not lookup identity table ID for block (%x): %w", blockID[:], err)
}
return cache.Get(protocolStateID)(tx)
Copy link
Member

@jordanschalm jordanschalm Nov 30, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a minor preference for moving this second cache Get call outside of the byBlockID cache retrieval function (and into ByBlockID). That way each cache remains conceptually only a wrapper around one database call, rather than containing glue logic linking different database calls together. It's also easier to add things like a public ProtocolStateIDByBlockID method (similar to how we have the BlockIDByHeight method that just reads the secondary index for finalized blocks).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean changing signature of secondary cache to Cache[flow.Identifier, flow.Identifier] which will map block_id -> protocol_state_id?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changing signature of secondary cache to Cache[flow.Identifier, flow.Identifier] which will map block_id -> protocol_state_id

Yeah, exactly

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

• addressed remaining TODOs for documentation and removed outdated TODOs
• consolidated and updated goDoc of `storage.ProtocolState` interface and implementation
• added logic for populating the `byBlockIdCache` cache, i.e. the cache for looking up a block's Protocol state
Copy link
Member

@AlexHentschel AlexHentschel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. A few minor comments.

Tried to help with consolidating the goDoc and with some minor polishing in my PR #5116 targeting your branch. The only comment that I did not already implement is this one: #5080 (comment)

defer tx.Discard()
return s.byBlockID(blockID)(tx)
}

// byID retrieves the identity table by its ID. Error returns:
// - storage.ErrNotFound if no identity table with the given ID exists
func (s *ProtocolState) byID(protocolStateID flow.Identifier) func(*badger.Txn) (*flow.RichProtocolStateEntry, error) {
return s.cache.Get(protocolStateID)
Copy link
Member

@AlexHentschel AlexHentschel Dec 6, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would suggest to just inline this one line in the two places, where byID is called.

👉 Implemented in my PR #5116

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I noticed that there are still some open and outdated TODOs here regarding documentation. Furthermore, the methods' goDoc differers quite a bit between the interface storage.ProtocolState and this implementation. Lastly, we are still using the term "Identity Table" despite the protocol state now being a lot more general.

I put up PR #5116 targeting your branch, which consolidates the goDoc, address the remaining open TODOs, and removes the outdated TODOs.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with not populating the cache which holds the RichProtocolStateEntrys on store. This is because (i) we don't have the RichProtocolStateEntry on store readily available and (ii) new RichProtocolStateEntry are really rare throughout an epoch, so the total cost of populating the cache becomes negligible over several views.

Side comment (outside the scope of this PR):

  • I think we could have the State Machine's Build method generate the RichProtocolStateEntry right away. I think it already has the needed Epoch Setup and Epoch Commit events, since it starts with a RichProtocolStateEntry for the parent state and consumes Epoch Setup and Epoch Commit events.
  • I think we might want to implement this, if we want to store more readily changing information in the protocol state, like the latest sealed block.

Though, I think for the scope of this PR, it would be beneficial to populate the byBlockIdCache on store, because here, we add a new entry every block! And we probably query for every block. So argument (ii) does not really apply here. Furthermore, argument (i) also does not apply, because we already have the Protocol State's ID on store, so we could populate the cache without much additional effort.

I implemented this in my PR #5116.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currenlty, we are using the same cache size for both caches

withLimit[flow.Identifier, *flow.RichProtocolStateEntry](cacheSize),

withLimit[flow.Identifier, flow.Identifier](cacheSize),

I don't think that is a good idea for the following reason:

  • byBlockIdCache will contain an entry for every block. We want to be able to cover a broad interval of views without cache misses, so I like the default setting of allowing up to 1000 entries.
  • However, cache only holds the distinct Protocol States. Minimally, we have something like 3 entries per epoch (one on epoch Switchover, one on receiving the Epoch Setup and one when seeing the Epoch Commit event). Lets be generous and assume we have 20 different Protocol States per epoch. Beyond that, we are certainly leaving the domain of normal operations that we optimize for. That would mean we are holding the protocol states for 1 year in the cache. That doesn't seem useful to me.

I would suggest to have a dedicated size parameter for each cache.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

entry.NextEpochCommit = nil
entry.NextEpoch.CommitID = flow.ZeroID
})

Copy link
Member

@AlexHentschel AlexHentschel Dec 7, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the following lines are sanity checks for the previously-constructed stateEntry, correct?

assert.Nil(t, stateEntry.PreviousEpoch)
assert.Nil(t, stateEntry.PreviousEpochSetup)
assert.Nil(t, stateEntry.PreviousEpochCommit)

Would suggest to include a comment:

Suggested change
// sanity check that previous epoch is not populated in `stateEntry`

👉 Implemented in my PR #5116

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, this is santity checks to ensure we are testing the correct thing

entry.PreviousEpochCommit = nil
entry.PreviousEpoch = nil
})

Copy link
Member

@AlexHentschel AlexHentschel Dec 7, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// sanity check that previous epoch is not populated in `stateEntry`

👉 Implemented in my PR #5116

@durkmurder durkmurder merged commit 7fdc7fb into feature/dynamic-protocol-state Dec 8, 2023
53 checks passed
@durkmurder durkmurder deleted the yurii/4649-todos-and-refactoring-part-2 branch December 8, 2023 18:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants