Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Per-pallet child tries #363

Open
gavofyork opened this issue Jan 13, 2020 · 8 comments
Open

Per-pallet child tries #363

gavofyork opened this issue Jan 13, 2020 · 8 comments
Labels
D3-involved Can be fixed by an expert coder with good knowledge of the codebase. I4-refactor Code needs refactoring. T1-FRAME This PR/Issue is related to core FRAME, the framework.

Comments

@gavofyork
Copy link
Member

gavofyork commented Jan 13, 2020

There should be a variant of decl_storage that stores all items in a child-trie rather than the main trie.

Storage format/hashing should be equivalent, except that there should be no module prefix.

Additionally, there should be migration code allowing for modules using the existing system (e.g. Kusama) to migrate their modules' storage over to per-pallet child tries during an upgrade.

Change tries should be made to work on child tries as per the main trie.

@gui1117
Copy link
Contributor

gui1117 commented Jan 22, 2020

What should be the childinfo used here ? does the module should be the childtrie at the path twox_64("ModulePrefix") ? Or is this too weak.
Because this path must not be the start of any other childtrie. as its deletion will delete all database key starting with this prefix, no ? (isn't it @cheme ?)

@cheme
Copy link
Contributor

cheme commented Jan 22, 2020

Yes, child trie prefix in addition to being unique need to avoid starting with another existing child prefix (to avoid making assumption on the keys used by the module).

So the child trie prefix got the collision resistance of the smaller prefix use, at this point contract allows 256 bit prefix.

So having twox_64("ModulePrefix") lowers the collision resistance of other child trie prefix to 64bit. (considering random key usage or batch deletion by child trie prefix).
Seems a bit small (there may be a middle ground with 256, but 64 definitely seems small).

So I would keep something that is at least 128bit (not sure if enough), twox would be fine, no need for a crypto resistant hash here as we use some static input in this case. But this prefix should be a constant, so using a crypto resistant hash function got no disadvantage, therefore I would probably use the same hash method as for contract.

But if we consider that pallet modules using those macros cannot use random key, IIRC they are already using some twox hashes over field id, if smaller field prefix is a twox_64 then we can use a hash that is "targetted_collision_resistance_size - 64".

If we don't implement batch deletion and consider pallet module cannot write random key, then it should be fine to just use twox_64, it relies on the the same assumption as for module prefix definition (key used internally in rocksdb will be fairly similar).
So maybe it could be more reasonable to forget this deletion implementation for current child trie implementation and target batch child trie deletion for a possible future implementation of child trie with managed unique id.

@gui1117
Copy link
Contributor

gui1117 commented Jan 22, 2020

also we could do more precise stuff such as module could be at 00++twox_64("ModulePrefix") and otherhash that needs more bytes at 01++hash of size 128, 10++ hash of size 256.

@cheme
Copy link
Contributor

cheme commented Jan 22, 2020

yes, that seems like a possibility, that could even be different child_trie kinds.

@gui1117
Copy link
Contributor

gui1117 commented Jan 23, 2020

maybe the simpliest way to have this unique_id not colliding with anything is to make its length to be 256, thus we could use a unique id like twox_64("module_name")++000...0000

@gui1117
Copy link
Contributor

gui1117 commented Feb 4, 2020

I unassigned for two weeks but here is my current progress:

First idea:

I wanted to introduce a new structure: StorageSpace, this storage space should be an isolated space in the storage used by the modules storages.
it would implement the following trait:

/// An abstract storage. Can be used to generalize over a prefixed space in the top trie, or in a
/// child trie.
pub trait StorageSpace {
	type StorageSpaceIterator: Iterator<Item=(Vec<u8>, Vec<u8>)>;

	/// Get a Vec of bytes from storage.
	fn get(&self, key: &[u8]) -> Option<Vec<u8>>;

	/// Put a raw byte slice into storage.
	fn put(&self, key: &[u8], value: &[u8]);

	/// Check to see if `key` has an explicit entry in storage.
	fn exists(&self, key: &[u8]) -> bool;

	/// Ensure `key` has no explicit entry in storage.
	fn kill(&self, key: &[u8]);

	/// Ensure keys with the given `prefix` have no entries in storage.
	fn kill_prefix(&self, prefix: &[u8]);

	/// Iter over all (key, value) of the storage after the given `prefix`
	fn iter_prefix(&self, prefix: &[u8]) -> Self::StorageSpaceIterator;
}

This trait would be implemented for three structure:

  • Structure 1: in main trie after a module prefix and a storage prefix
  • Structure 2: in a module childtrie after a storage prefix
  • Structure 3: in a module_storage childtrie after no prefix.

I would imagine this space structure to implement migration from one to one another, just be iterating on them while removing their value and adding value to new space.
Also we could support lazy_migration by just having a partial migration at every block plus when getting a value (or iterating or else..) we would get from both space.
This is obviously costy but meant to be temporar.

Second idea:

(in this part we talk about StorageMap but this is generalizable to all 4 storage kind).

Note: such StorageSpace an ok idea to me, though we should notice that this doesn't cover everything. For example a migration between StorageMap<Hasher=twox_concat, Trie=MainTriePrefixed> to StorageMap<Hasher=blake2_256_concat, Trie=ChildTriePrefixed> is in principle doable but no implementable by this storage space structure as it doesn't have hash informations.

Considering this limitation maybe it is better not to introduce this intermediate abstraction and just create a new associated type to our StorageMap which would be:

trait StorageMap {
  type Trie: Trie;
  fn hashed_module_prefix() -> &'static [u8]; // make it empty if you are in a module unique child trie
  fn hashed_storage_prefix() -> &'static [u8]; // make it empty if you are in a module_storage unique child_trie
}

having these module_prefix hashed at compile time is doable, you can't import sp_core at compile time because it seems features are mixed by cargo but you can import twox-hash crate and reimplement twox_128 yourself. (though we might want to create a sp_core_compile_time crate or such thing)

So implementing all kind of migration from one StorageMap to one another should be possible.
Like we could do:

migration_same_hasher<Hasher, StorageMap1<Hasher=Hasher>, StorageMap2<Hasher=Hasher>>(..)

migration_hasher_concat<
  Hasher1: HasherConcat,
  Hasher2: HasherConcat,
  StorageMap1<Hasher1>,
  StorageMap2<Hasher2>
>(..)

But when it come to lazy migration, it is more difficult, the only thing I see it to crate a new trait StorageMapLazyMigration and which would have two associated type StorageMap1 and StorageMap2 and do a lazy migration between both.

Note that maybe lazy migration is not very interesting as is or should be done in some other way.

Current wip branch is gui-module-in-childtries: https://github.com/paritytech/substrate/compare/gui-module-in-childtries?diff=unified&expand=1

@seunlanlege
Copy link
Contributor

Yeah i figured i wasn't going to be the only one who's thought of this, I wonder if there's a tracking issue for custom keys for storage items 🤔

@Kaktus-green
Copy link

@thiolliere any updates on this one?

@juangirini juangirini transferred this issue from paritytech/substrate Aug 24, 2023
@the-right-joyce the-right-joyce added I4-refactor Code needs refactoring. T1-FRAME This PR/Issue is related to core FRAME, the framework. D3-involved Can be fixed by an expert coder with good knowledge of the codebase. and removed I7-refactor labels Aug 25, 2023
claravanstaden pushed a commit to Snowfork/polkadot-sdk that referenced this issue Dec 8, 2023
helin6 pushed a commit to boolnetwork/polkadot-sdk that referenced this issue Feb 5, 2024
* Return error when OutOfFund

* Fix estimate_gas early return

* Update comments

* Update changelog

* Update message info
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
D3-involved Can be fixed by an expert coder with good knowledge of the codebase. I4-refactor Code needs refactoring. T1-FRAME This PR/Issue is related to core FRAME, the framework.
Projects
Status: Backlog
Development

No branches or pull requests

7 participants