
chainHead based backend implementation #1161

Merged (24 commits) · Sep 26, 2023

Conversation

Collaborator

@jsdw jsdw commented Sep 12, 2023

This PR introduces an UnstableBackend, which implements the Backend trait and can be used in place of the LegacyBackend to drive Subxt with the new (and currently unstable) RPC APIs.

The files are structured roughly as follows:

  • subxt/src/backend/unstable/follow_stream.rs: A Stream whose goal is to remain subscribed to chainHead_follow. It will re-subscribe if the subscription is ended for any reason, and it will return the current subscription_id as an event, along with follow events.
  • subxt/src/backend/unstable/follow_stream_unpin.rs: A Stream which builds on the above, and handles pinning. It replaces any block hash seen in the follow events with a BlockRef which, when all clones are dropped, will lead to an "unpin" call for that block hash being queued. It will also automatically unpin any blocks that exceed a given max age, to try and prevent the underlying stream from ending (and all blocks from being unpinned as a result). Put simply, it tries to keep every block pinned as long as possible until it's no longer used anywhere.
  • subxt/src/backend/unstable/follow_stream_driver.rs: A Stream which builds on the above, and allows multiple subscribers to obtain events from the single underlying subscription (each being provided an Initialised message and all new blocks since then, as if they were each creating a unique chainHead_follow subscription). This is the "top" layer and the one that's interacted with elsewhere.

Each of these layers is independently unit tested to check that they are doing what we expect. A key consideration is that we want all of this to compile to WASM, and so we avoid using tokio and such anywhere.

We also have subxt/src/backend/unstable/storage_items.rs: A wrapper around a stream of storage events which will handle continuing and stopping correctly, and stream each StorageResult back one at a time (rather than in groups).

From these, we then implement the actual Backend trait in subxt/src/backend/unstable/mod.rs to pull everything together.

To test this all, we add an "unstable-backend-client" feature flag to our integration-tests crate. When this is enabled, all of the integration tests are run using the new UnstableBackend instead of the default LegacyBackend. All tests should pass with both backends (barring CI issues at the time of writing). We tweak CI to run these tests, too.

Other misc changes were made to be compatible with the newest substrate-node binary (locally, at least), namely transfer => transfer_allow_death.

@jsdw jsdw marked this pull request as ready for review September 22, 2023 10:25
@jsdw jsdw requested review from a team as code owners September 22, 2023 10:25
Contributor

@tadeohepperle tadeohepperle left a comment


Great work, thanks a lot for your explanation last week, without it I would have had a hard time seeing what is going on. I just have a few comments.

@@ -12,7 +12,7 @@ async fn main() -> Result<(), Box<dyn std::error::Error>> {

// Build a balance transfer extrinsic.
let dest = dev::bob().public_key().into();
- let balance_transfer_tx = polkadot::tx().balances().transfer(dest, 10_000);
+ let balance_transfer_tx = polkadot::tx().balances().transfer_allow_death(dest, 10_000);
Contributor


Question: What is the advantage of transfer_allow_death over the normal transfer here?

Collaborator Author

@jsdw jsdw Sep 25, 2023


transfer actually disappeared from the recent version of the balances pallet, and was replaced with two functions: transfer_keep_alive and transfer_allow_death.

transfer_keep_alive will reject a transfer that would bring your balance below the existential deposit (ED) amount (which, if it happened, would allow the account to be reaped, i.e. disappear entirely with any remaining balance going away)

transfer_allow_death is the same as the old transfer call afaiu, and will allow the transfer to occur even if it leads to the account being reaped :)

And so, I just renamed all uses of transfer that I found to transfer_allow_death, keeping the same logic as before to make CI happy. (Well, happier.)

/// We're fetching the inner subscription. Move to Ready when we have one.
Initializing(FollowEventStreamFut<Hash>),
/// Report back the subscription ID here, and then start ReceivingEvents.
Ready(Option<(FollowEventStream<Hash>, String)>),
Contributor


I think we could remove the Option here in favor of Ready(FollowEventStream<Hash>, String), or is it used somewhere?

Collaborator Author


Hehe, I sat pondering this for a bit myself!

The reason for the option is just because, below, we want to take ownership of those two values to put them in other places, but we only have a &mut reference to them. So the question is, if I want to take ownership of them, I have to replace the originals with something.. but what?! With an Option I can replace it with None using option.take() :)

I'll look at this again though because I can maybe do better here and it'd be lovely to remove the option!

Contributor


Ahh, nice trick!

current_init_message: None,
current_subscription_id: None,
seen_runtime_events: HashMap::new(),
block_events_from_last_finalized: VecDeque::new(),
Contributor


Nit: Why do we set next_id: 1 instead of having it initialized to 0?
I think we can also derive the Default trait for SharedState, to use SharedState { next_id: 1, ..Default::default() } or Default::default() here.

Collaborator Author


For no particular reason other than to avoid 0 being used, which can sometimes be seen as a special or meaningful value. There would be no harm in initializing it to 0 either, really; it's just a habit I've adopted!

+ Encode
+ PartialEq
+ std::cmp::Eq
+ std::cmp::PartialEq
Contributor


I think this trait bound might have PartialEq twice?

Collaborator Author


Oh yeah, so it does! Good spot!

+ Encode
+ PartialEq
+ std::cmp::Eq
+ std::cmp::PartialEq
Contributor


Same as above, probably one of the PartialEqs can go.

// - find the submitAndWatchExtrinsic call in the WS connection to get these bytes:
let expected_tx_bytes = hex::decode(
- "b004060700d43593c715fdd31c61141abd04a99fd6822c8558854ccde39a5684e7a56da27d0f0090c04bb6db2b"
+ "b004060000d43593c715fdd31c61141abd04a99fd6822c8558854ccde39a5684e7a56da27d0f0090c04bb6db2b"
Contributor


Question: could we also arrive at these bytes constructing a call via subxt-signer and the static interface?

Collaborator Author

@jsdw jsdw Sep 25, 2023


We wouldn't even need to sign anything :)

This call is really just checking that, for a given random unsigned extrinsic, Subxt (via the static interface above) produces the same bytes that something like Polkadot.js produces. i.e. we want to compare the Subxt bytes with some non-Subxt bytes, and if they align, we gain some confidence that Subxt constructed them properly. We could also construct the bytes by hand in the test, but it's easier to trust something like Polkadot.js.

Contributor


Ah okay cool! I didn't mean to use subxt-signer for signing but to get the public key of a dev account into the call payload, for the case that we could construct these tx bytes ourselves. But I guess it makes sense like this if we want to check against polkadot.js

Collaborator Author


Aah I see, yup we could have done that too :) But by checking the whole payload against a different client, we get a bit of extra reassurance that everything Subxt does is in line with the other client; any part produced using Subxt code itself would no longer be "compared".

@@ -356,6 +352,7 @@ impl<T: Config> Stream for StorageFetchDescendantKeysStream<T> {

// We have some keys to hand back already, so do that.
if let Some(key) = this.keys.pop_front() {
println!("Storage entry: {key:?}");
Collaborator


nit: Maybe log::debug! although I feel they are left-overs from debugging :D

Collaborator Author


Oooh yup, whoops!!

@@ -279,20 +273,26 @@ async fn transaction_unstable_submit_and_watch() {
.transaction_unstable_submit_and_watch(&tx_bytes)
.await
.unwrap();

println!("about to wait for tx progress");
Collaborator


nit: we could remove these 2 extra printlns

Collaborator Author


Yup; I completely forgot to go clean up my debug logs by the looks of it! :)

@@ -63,7 +63,7 @@ async fn run() {

// Save metadata to a file:
let out_dir = env::var_os("OUT_DIR").unwrap();
- let metadata_path = Path::new(&out_dir).join("metadata.scale");
+ let metadata_path = Path::new(&out_dir).join("test_node_runtime_metadata.scale");
Collaborator


dq: Does this fix a conflict with rust-analyzer that constantly spawns cargo check processes?

Collaborator Author


Honestly, I wasn't sure; I was just running into weird issues and decided to rename it as a bit of a last-ditch thing, and it seemed to work afterwards. Probably it was something local to my system getting confused, but there's no harm in making the name harder to collide with anything else :)

let block_ref = match block_ref {
Some(r) => r,
- None => client.backend().latest_best_block_ref().await?,
+ None => client.backend().latest_finalized_block_ref().await?,
Collaborator


Did we make the change from using the best block to using the finalized block to offer a stronger guarantee? Since the best reported block might actually be pruned in the near future?

Collaborator Author

@jsdw jsdw Sep 25, 2023


Yeah exactly; when I went over the code, it felt like using the current best block for this stuff by default was weird for that exact reason that it may well be pruned. So yeah, this will hopefully be a little more reliable, and of course the user can still provide a block hash that they prefer if they want to YOLO it :)

.pinned
.get(&details.parent_block_hash)
.map(|p| p.rel_block_num)
.unwrap_or(this.rel_block_num);
Collaborator


This might be a good place to catch rpc-v2 implementation inconsistencies in the wild, but I realize that we might not want to pollute users with this.
What would you think about having a log::warn! about the RPC not reporting parent hashes that we expect to exist?

Collaborator Author


I like this idea, but if the parent block is already unpinned then it won't exist when we look it up here, so I don't think I can easily add a log that won't lead to spurious warnings.

Collaborator

lexnv commented Sep 25, 2023

Lovely PR and as always amazing work 👍

@jsdw jsdw merged commit cf7e2db into master Sep 26, 2023
10 checks passed
@jsdw jsdw deleted the jsdw-unstable-backend branch September 26, 2023 15:58
@jsdw jsdw mentioned this pull request Sep 27, 2023