Move `AllPalletsWithSystem::decode_entire_state` to own runtime API #2263

liamaharon · 2023-11-10T07:39:46Z

Fixes broken gitlab-check-runtime-migration-asset-hub-westend and gitlab-check-runtime-migration-bridge-hub-rococo CI checks.

UPDATE Nov 13: Proposed an alternate approach for this PR here: #2263 (comment)

Old / original description

AllPalletsWithSystem::decode_entire_state is currently called in Executive::try_runtime_upgrade and Executive::try_execute_block if any checks are selected.

This is very inflexible and cripples the usefulness of existing runtime APIs if there are any issues with decoding the entire state, as we are currently seeing with Asset Hub Westend CI.

We need to make running these checks optional.

Rather than add complexity to the existing runtime APIs to make the checks optional, in the spirit of a suggestion from @xlc (#2108 (comment)), I've opted to create a new runtime api for these checks. This pushes the complexity out of the runtime and into the caller (such as try-runtime-cli).

This approach of moving complexity out of the runtime and into the caller seems sensible to me. It makes these try-runtime tools much more flexible in how they can be used, and less prone to needing breaking changes to accommodate every way people may want to use them. If in agreement, in the future we can deprecate existing try-runtime runtime APIs and replace them with simpler, minimal functions that perform atomic pieces of work and can be composed by the caller in whatever way they wish.

TODO

Probably makes sense for it not to panic but return a Result to be handled by the caller.
prdoc
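
To make the first TODO item concrete, here is a minimal sketch of a check that returns a `Result` instead of panicking, leaving the caller to decide whether a failure is fatal. Everything in it is an assumption for illustration: `DecodeError`, the key/value slice used as "state", and the rule that a value decodes only if it is exactly 4 bytes are stand-ins, not the real FRAME types or decoding logic.

```rust
/// Hypothetical error type: records which storage key failed to decode.
#[derive(Debug, PartialEq)]
pub struct DecodeError {
    pub key: Vec<u8>,
}

/// Stand-in for the real check. A value "decodes" only if it is exactly
/// 4 bytes long (as a fixed-width u32 would be). Instead of panicking on the
/// first failure, every failing key is collected and returned to the caller.
pub fn try_decode_entire_state(
    state: &[(Vec<u8>, Vec<u8>)],
) -> Result<usize, Vec<DecodeError>> {
    let mut decoded = 0usize;
    let mut errors = Vec::new();

    for (key, value) in state {
        match <[u8; 4]>::try_from(value.as_slice()) {
            Ok(_) => decoded += 1,
            Err(_) => errors.push(DecodeError { key: key.clone() }),
        }
    }

    if errors.is_empty() {
        Ok(decoded)
    } else {
        Err(errors)
    }
}
```

A caller such as a CLI could then log every failing key and exit non-zero, or treat failures as warnings, without the runtime making that decision for it.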

bkchr · 2023-11-10T15:37:56Z

substrate/bin/node/runtime/src/lib.rs

+		fn decode_entire_state() {
+			// NOTE: intentional unwrap: we don't want to propagate the error backwards, and want to
+			// have a backtrace here.
+			Executive::try_decode_entire_state().unwrap()


Why not just panic inside try_decode_entire_state instead of copying this comment everywhere? 🙈

bkchr · 2023-11-10T15:40:21Z

substrate/frame/executive/src/lib.rs

-		if checks.any() {
-			let res = AllPalletsWithSystem::try_decode_entire_state();
-			Self::log_decode_result(res)?;
-		}


Why do we have the checks parameter? I mean we could also just add a new variant to check? So no new runtime api function.

liamaharon · 2023-11-13

Let me answer this question in 2 parts:

1. How we could add it to the checks parameter, and why I don't think that is the best solution.
2. A suggested alternative.

1. How we could add it to checks

UpgradeCheckSelect is currently an enum:

    pub enum UpgradeCheckSelect {
        None,
        All,
        PreAndPost,
        TryState,
    }

To keep the enum, we'd need to add a new variant for every combination of checks the user may want to run, which is obviously neither scalable nor a good solution.

Instead, I think we'd want to change it to a struct something like this:

    struct UpgradeCheckSelect {
        pre_and_post: bool,
        try_state: Select,
        decode_entire_state: bool,
    }

However, I don't think this is optimal for 3 reasons:

It's a breaking change every time we want to add a new try-runtime task

It's inflexible in the sense that we can only run checks in combination with an on_runtime_upgrade check or an execute_block check. What if we just want to decode entire state? or run try-state checks?

We need to clutter the configuration to both try_on_runtime_upgrade and try_execute_block with the check options
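
As a rough illustration of the second objection, the sketch below models the struct approach and lists the steps a single call would perform. The `Select` enum is reduced to two variants, and `planned_steps` and its ordering are hypothetical, written only to show that every selected check stays attached to a runtime upgrade.

```rust
/// Simplified stand-in for the try-state selector (the real one has more options).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Select {
    None,
    All,
}

/// The struct form of the check selection described above.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct UpgradeCheckSelect {
    pub pre_and_post: bool,
    pub try_state: Select,
    pub decode_entire_state: bool,
}

impl UpgradeCheckSelect {
    /// True if at least one check is selected.
    pub fn any(&self) -> bool {
        self.pre_and_post || self.try_state != Select::None || self.decode_entire_state
    }
}

/// Hypothetical helper: the steps one upgrade call would perform, in order.
/// The upgrade itself is unconditional; the checks only decorate it.
pub fn planned_steps(checks: UpgradeCheckSelect) -> Vec<&'static str> {
    let mut steps = Vec::new();
    if checks.pre_and_post {
        steps.push("pre_upgrade");
    }
    steps.push("on_runtime_upgrade");
    if checks.pre_and_post {
        steps.push("post_upgrade");
    }
    if checks.try_state != Select::None {
        steps.push("try_state");
    }
    if checks.decode_entire_state {
        steps.push("decode_entire_state");
    }
    steps
}
```

Even with only `decode_entire_state` set, the plan still begins with `on_runtime_upgrade`, which is the coupling the second reason describes.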

2. Proposed alternative

Add a new enum describing all the atomic try-runtime related tasks, something like

    pub enum TryRuntimeTask {
        /// Field: pre_and_post
        OnRuntimeUpgrade(bool),
        /// Fields: block, state_root_check, signature_check
        ExecuteBlock(Block, bool, bool),
        TryState(TryStateSelect),
        TryDecodeEntireState,
    }

Remove all try-state and try-decode-entire-state logic from try_on_runtime_upgrade and try_execute_block entirely; instead, create dedicated methods for running those checks: try_state and try_decode_entire_state.

Create a new Executive method and runtime API that accepts a Vec of TryRuntimeTasks and executes them in order:

    // Runtime API
    fn execute_try_runtime_tasks(tasks: Vec<TryRuntimeTask>) -> (Weight, Weight) {
        Executive::execute_try_runtime_tasks(tasks)
    }

    // Executive method
    pub fn execute_try_runtime_tasks(tasks: Vec<TryRuntimeTask>) -> (Weight, Weight) {
        // Mutable, since the weight of each task is accumulated into it.
        let mut agg_weight = Weight::zero();
        for task in tasks {
            match task {
                TryRuntimeTask::OnRuntimeUpgrade(pre_and_post) => {
                    let weight = Self::try_on_runtime_upgrade(pre_and_post);
                    agg_weight.saturating_accrue(weight);
                }
                TryRuntimeTask::ExecuteBlock(block, state_root_check, signature_check) => {
                    // No try-state selector is passed here: try-state runs as its own task.
                    let weight =
                        Self::try_execute_block(block, state_root_check, signature_check).unwrap();
                    agg_weight.saturating_accrue(weight);
                }
                TryRuntimeTask::TryState(try_state_select) => {
                    Self::try_state(try_state_select);
                }
                TryRuntimeTask::TryDecodeEntireState => {
                    Self::try_decode_entire_state();
                }
            }
        }
        // Return the aggregated weight, not the weight of the last task.
        (agg_weight, BlockWeights::get().max_block)
    }

This allows try_on_runtime_upgrade and try_execute_block to not be concerned about what checks to run or when (before or after) to run them or how (configuration) to run them.

The developer instead, with ultimate flexibility, just specifies a Vec like vec![OnRuntimeUpgrade(OnRuntimeUpgradeConfig), TryState(TryStateConfig), TryDecodeEntireState] describing what they want to run in what order.
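
For anyone who wants to experiment with the shape of this proposal outside a runtime, here is a self-contained model of it. It is only a sketch under assumptions: `Weight` is a plain wrapper around `u64`, a block is just a `u32` number, `MockExecutive`, `MAX_BLOCK`, and the fixed weights are invented for the example, and each task only records its name in a log rather than doing real work.

```rust
/// Stand-in for the runtime's weight type.
#[derive(Clone, Copy, Debug, Default, PartialEq, Eq)]
pub struct Weight(pub u64);

impl Weight {
    /// Add `other` to `self`, saturating at the numeric bound.
    pub fn saturating_accrue(&mut self, other: Weight) {
        self.0 = self.0.saturating_add(other.0);
    }
}

/// Hypothetical maximum block weight.
pub const MAX_BLOCK: Weight = Weight(1_000);

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum TryStateSelect {
    None,
    All,
}

#[derive(Clone, Debug, PartialEq, Eq)]
pub enum TryRuntimeTask {
    /// Field: pre_and_post
    OnRuntimeUpgrade(bool),
    /// Fields: block (a bare number in this model), state_root_check, signature_check
    ExecuteBlock(u32, bool, bool),
    TryState(TryStateSelect),
    TryDecodeEntireState,
}

/// Mock executive that records which task ran, in order.
#[derive(Default)]
pub struct MockExecutive {
    pub log: Vec<&'static str>,
}

impl MockExecutive {
    fn try_on_runtime_upgrade(&mut self, _pre_and_post: bool) -> Weight {
        self.log.push("on_runtime_upgrade");
        Weight(10)
    }

    fn try_execute_block(
        &mut self,
        _block: u32,
        _state_root_check: bool,
        _signature_check: bool,
    ) -> Result<Weight, &'static str> {
        self.log.push("execute_block");
        Ok(Weight(25))
    }

    fn try_state(&mut self, _select: TryStateSelect) {
        self.log.push("try_state");
    }

    fn try_decode_entire_state(&mut self) {
        self.log.push("try_decode_entire_state");
    }

    /// Run the tasks in the order given and return (consumed weight, max block weight).
    pub fn execute_try_runtime_tasks(&mut self, tasks: Vec<TryRuntimeTask>) -> (Weight, Weight) {
        let mut agg_weight = Weight::default();
        for task in tasks {
            match task {
                TryRuntimeTask::OnRuntimeUpgrade(pre_and_post) => {
                    let weight = self.try_on_runtime_upgrade(pre_and_post);
                    agg_weight.saturating_accrue(weight);
                }
                TryRuntimeTask::ExecuteBlock(block, state_root_check, signature_check) => {
                    let weight = self
                        .try_execute_block(block, state_root_check, signature_check)
                        .expect("mock block execution always succeeds");
                    agg_weight.saturating_accrue(weight);
                }
                TryRuntimeTask::TryState(select) => self.try_state(select),
                TryRuntimeTask::TryDecodeEntireState => self.try_decode_entire_state(),
            }
        }
        (agg_weight, MAX_BLOCK)
    }
}
```

Passing `vec![OnRuntimeUpgrade(true), TryState(TryStateSelect::All), TryDecodeEntireState]` to the mock runs exactly those three steps in that order, and only the upgrade contributes weight.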

Once this is implemented, we can add a deprecation notice to the current try-runtime runtime APIs and eventually remove them.

bkchr · 2023-11-13

Okay, ty for the writeup.

> It's a breaking change every time we want to add a new try-runtime task

Your second proposal has the same issues ;) I don't think in general that this is such a problem, because these are development APIs that don't need to stay stable, nor do we need to support them for X years.

> It's inflexible in the sense that we can only run checks in combination with an on_runtime_upgrade check or an execute_block check. What if we just want to decode entire state? or run try-state checks?

Not sure why it also requires the on_runtime_upgrade check. However, it will always require that we run the runtime upgrades. You always need to run the upgrades, as the new code you are running could be incompatible with the old runtime that created the state. This actually shows a "flaw" in your current PR here, because you don't run the upgrades before.

liamaharon · 2023-11-13

> Your second proposal has the same issues ;) I don't think in general that this is such a problem, because these are development APIs that don't need to stay stable, nor do we need to support them for X years.

If it's an enum then I think we can add new variants without it being a breaking change to the caller (e.g. try-runtime-cli)? Since try-runtime-cli will just construct the Vec with enums it knows about.
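
The compatibility argument can be sketched with a toy encoding. SCALE encodes an enum variant as a one-byte index followed by its fields, so appending a variant leaves the indices of the existing ones unchanged. The code below is a simplified stand-in for that, not the real codec: the variants carry no fields, the Vec length prefix is omitted, and `OldTask`, `NewTask`, `encode_old`, and `decode_new` are hypothetical names.

```rust
/// Tasks known to an older caller (for example, an older CLI release).
#[derive(Debug, PartialEq)]
pub enum OldTask {
    OnRuntimeUpgrade,
    TryState,
}

/// Tasks known to a newer runtime: one variant appended at the end.
#[derive(Debug, PartialEq)]
pub enum NewTask {
    OnRuntimeUpgrade,
    TryState,
    TryDecodeEntireState,
}

/// Encode each task as its variant index, one byte per task.
pub fn encode_old(tasks: &[OldTask]) -> Vec<u8> {
    tasks
        .iter()
        .map(|task| match task {
            OldTask::OnRuntimeUpgrade => 0u8,
            OldTask::TryState => 1u8,
        })
        .collect()
}

/// Decode on the newer side; returns `None` on an unknown variant index.
pub fn decode_new(bytes: &[u8]) -> Option<Vec<NewTask>> {
    bytes
        .iter()
        .map(|byte| match *byte {
            0 => Some(NewTask::OnRuntimeUpgrade),
            1 => Some(NewTask::TryState),
            2 => Some(NewTask::TryDecodeEntireState),
            _ => None,
        })
        .collect()
}
```

Bytes produced by the older caller still decode on the newer side because the first two indices are unchanged. The reverse direction does not hold: an older decoder would reject index 2.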

You're right though not a huge deal, and we likely won't change it frequently.

Do you feel I should proceed with the struct approach (described in 1.) then?

bkchr · 2023-11-13

> If it's an enum then I think we can add new variants without it being a breaking change to the caller (e.g. try-runtime-cli)? Since try-runtime-cli will just construct the Vec with enums it knows about.

Yeah that is a good point.

Maybe we could also just introduce a bitflag, assuming we don't need to pass some data. However, if we need to pass some data, I would probably use your enum approach given your reasoning above.
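
A bitflag along these lines might look like the sketch below. It is hand-rolled rather than built on any particular crate, and the `Checks` type, the flag names, and the bit assignments are assumptions made for illustration only.

```rust
/// Hypothetical set of try-runtime checks packed into a single byte.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Checks(u8);

impl Checks {
    pub const NONE: Checks = Checks(0);
    pub const PRE_AND_POST: Checks = Checks(1 << 0);
    pub const TRY_STATE: Checks = Checks(1 << 1);
    pub const DECODE_ENTIRE_STATE: Checks = Checks(1 << 2);

    /// Combine two sets of checks.
    pub const fn union(self, other: Checks) -> Checks {
        Checks(self.0 | other.0)
    }

    /// True if every flag set in `other` is also set in `self`.
    pub const fn contains(self, other: Checks) -> bool {
        (self.0 & other.0) == other.0
    }

    /// True if at least one check is selected.
    pub const fn any(self) -> bool {
        self.0 != 0
    }
}
```

Any combination of checks fits in one value, and a new check only needs an unused bit. As noted above, this works only while no check needs extra data such as a try-state selector.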

move decode_entire_state to own runtime api

e28510c

liamaharon added labels T1-FRAME (This PR/Issue is related to core FRAME, the framework.) and T4-runtime_API (This PR/Issue is related to runtime APIs.) Nov 10, 2023

liamaharon requested review from a team November 10, 2023 07:39

paritytech-review-bot bot requested review from a team November 10, 2023 07:40

bkchr reviewed Nov 10, 2023


paritytech-review-bot bot requested review from a team November 10, 2023 15:41

liamaharon marked this pull request as draft November 13, 2023 07:00

liamaharon mentioned this pull request Nov 13, 2023

Disable try_decode_entire_state try-runtime checks #2283

Closed
