This repository has been archived by the owner on Nov 15, 2023. It is now read-only.

Upgrade to libp2p-0.31. #7606

Merged
4 commits merged into from
Nov 27, 2020

romanb commented Nov 26, 2020

This PR upgrades substrate to libp2p-0.31. Highlights:

romanb added labels on Nov 26, 2020: A0-please_review (Pull request needs code review.), B0-silent (Changes should not be mentioned in any release notes), C3-medium (PR touches the given topic and has a medium impact on builders.)
romanb requested a review from mxinden as a code owner November 26, 2020 11:11

romanb commented Nov 26, 2020

I'm wondering if we can already remove support for the fallback noise ix config with this PR? It would take out one negotiation round-trip from the connection setup.

mxinden commented Nov 26, 2020

> I'm wondering if we can already remove support for the fallback noise ix config with this PR? It would take out one negotiation round-trip from the connection setup.

As far as I can tell, the spec compliant handshake was introduced back in May with #6064 and should thus be supported by the majority of nodes. With that in mind, I don't see why not to remove the legacy fallback. Is there something I am missing?

romanb commented Nov 26, 2020

> As far as I can tell, the spec compliant handshake was introduced back in May with #6064 and should thus be supported by the majority of nodes. With that in mind, I don't see why not to remove the legacy fallback. Is there something I am missing?

Maybe not. I'm just never quite up-to-date w.r.t. which versions are running where, and I think it is not uncommon for a few months to pass between a merged substrate PR and a polkadot deployment that includes the changes, so I'd better ask twice. If @tomaka also thinks we can remove it, I will go ahead and do it!

@@ -368,7 +397,7 @@ impl<B: BlockT + 'static, H: ExHashT> NetworkWorker<B, H> {

         // Add external addresses.
         for addr in &params.network_config.public_addresses {
-            Swarm::<B, H>::add_external_address(&mut swarm, addr.clone());
+            Swarm::<B, H>::add_external_address(&mut swarm, addr.clone(), AddressScore::Infinite);

Review comment by romanb (Contributor Author):

This is the patch for #7518.
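
For context, a rough sketch of why an infinite score would keep a configured address from being purged. This is a toy model and not libp2p's implementation: the `Score` and `AddressBook` types, the bounded report history, and the decay rule are all assumptions made for illustration. Only the name `AddressScore::Infinite` comes from the diff above.

```rust
use std::collections::{HashMap, VecDeque};

/// Hypothetical stand-in for an address score; not libp2p's `AddressScore` type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Score {
    /// Decays as newer reports push older ones out of the history.
    Finite(u32),
    /// Never decays, so the address is never purged.
    Infinite,
}

/// Toy address book with a bounded report history (an assumption for illustration).
struct AddressBook {
    scores: HashMap<String, Score>,
    history: VecDeque<String>,
    limit: usize,
}

impl AddressBook {
    fn new(limit: usize) -> Self {
        AddressBook { scores: HashMap::new(), history: VecDeque::new(), limit }
    }

    /// Pin an address permanently, as a configured public address would be.
    fn add_infinite(&mut self, addr: &str) {
        self.scores.insert(addr.to_string(), Score::Infinite);
    }

    /// Record one observation of `addr`, retiring the oldest report if the history is full.
    fn report(&mut self, addr: &str) {
        if self.history.len() == self.limit {
            if let Some(old) = self.history.pop_front() {
                self.decay(&old);
            }
        }
        self.history.push_back(addr.to_string());
        let entry = self.scores.entry(addr.to_string()).or_insert(Score::Finite(0));
        if let Score::Finite(n) = entry {
            *n += 1;
        }
    }

    /// Lower a finite score by one and drop the address once it reaches zero.
    fn decay(&mut self, addr: &str) {
        let remove = match self.scores.get_mut(addr) {
            Some(Score::Finite(n)) => {
                *n = n.saturating_sub(1);
                *n == 0
            }
            // Infinite scores (and unknown addresses) are left untouched.
            _ => false,
        };
        if remove {
            self.scores.remove(addr);
        }
    }

    fn contains(&self, addr: &str) -> bool {
        self.scores.contains_key(addr)
    }
}
```

In this model an address that is only reported a few times eventually drops out as other reports arrive, while an address added with an infinite score stays regardless of how many other addresses are observed.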

tomaka commented Nov 26, 2020

According to this comment, we have already broken compatibility with nodes that don't have #6064, and so we can remove the legacy noise handshake.

tomaka commented Nov 26, 2020

Could you also update the burn-in PR?

romanb commented Nov 26, 2020

> Could you also update the burn-in PR?

Of course, done in paritytech/polkadot@4020626.

tomaka commented Nov 27, 2020

image

The "request answer time", in other words the time before receiving a response to a request that the node has sent out, has been divided by two, as predicted.

The "request serving time", which is the other way around, has also been divided by two, and I don't understand why. Looking at the source code, this measures the time elapsed between upgrade_inbound being called and the response being written on the substream. If this is indeed what is being measured, then the times being shown (100ms to 300ms) don't make sense, and we should investigate this out of the scope of this PR.

Since nodes only perform one concurrent request at a time, a consequence of the serving time being reduced is that more requests are being served per second, and the bandwidth usage increased.
To me this is caused by the fact that the node running this burn-in is suddenly twice as good as all the other nodes on the network. Once this PR has been deployed, the bandwidth should be back to previous levels.
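
As a back-of-the-envelope illustration of that reasoning (the function names and the latency figures are made up, not measurements from the burn-in): if a peer issues its requests strictly one after another, the number of requests it completes per second is the inverse of the time each request takes. Halving that time therefore doubles the request rate and, for equally sized responses, the bandwidth.

```rust
/// Requests per second completed by a peer that sends its requests strictly
/// one at a time, where each request takes `latency_ms` from start to finish.
fn requests_per_second(latency_ms: f64) -> f64 {
    1000.0 / latency_ms
}

/// Bytes per second served to that peer if every response is `response_bytes` long.
fn bytes_per_second(latency_ms: f64, response_bytes: f64) -> f64 {
    requests_per_second(latency_ms) * response_bytes
}
```

For example, going from a hypothetical 200 ms to 100 ms per request moves a single peer from 5 to 10 requests per second.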


image

The number of failed connection attempts has increased, maybe because of the legacy Noise handshake. It is unclear whether this is caused by older nodes.

mxinden commented Nov 27, 2020

> The "request serving time", which is the other way around, has also been divided by two, and I don't understand why. Looking at the source code, this measures the time elapsed between upgrade_inbound being called and the response being written on the substream. If this is indeed what is being measured, then the times being shown (100ms to 300ms) don't make sense, and we should investigate this out of the scope of this PR.

This is indeed surprising. I compared the kusama-sentry-ew4-1 running this pull request with canary-sentry-ew4-0 running recent master.

First off, to show that the above phenomenon is indeed related to this pull request and not to an unrelated change on master, the graph below compares the median request handling rate:

image

When comparing the 95th percentile instead of the median the difference becomes less obvious:

image

With that in mind, a wild guess from me would be: Given that the handshake time decreased, peers can ask for blocks more frequently. Asking for blocks more frequently implies smaller replies which in turn implies shorter request handling time.

tomaka commented Nov 27, 2020

> Given that the handshake time decreased, peers can ask for blocks more frequently.

But the handshake time hasn't decreased for requests emitted by remotes.

romanb commented Nov 27, 2020

> But the handshake time hasn't decreased for requests emitted by remotes.

Maybe I'm missing something, but if we have the inbound request handling like so

< /my/protocol
< [request-data]
> /my/protocol
> [response-data]

i.e. a single round-trip, isn't it expected that this is quicker than

< /my/protocol
> /my/protocol
< [request-data]
> [response-data]

?
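
A rough way to count the waiting involved in the two exchanges above, under simplifying assumptions that are mine and not part of the PR: consecutive messages in the same direction travel together, processing time is ignored, and a side waits one network round-trip whenever a message it receives directly follows one it sent. The `Dir` type, the two constants, and `round_trips_waited` are hypothetical names.

```rust
/// Direction of a message as seen by the node serving the request.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Dir {
    /// `<` in the traces above: received from the remote.
    In,
    /// `>` in the traces above: sent to the remote.
    Out,
}

/// First trace: protocol name and request data arrive back to back.
const SINGLE_FLIGHT: [Dir; 4] = [Dir::In, Dir::In, Dir::Out, Dir::Out];

/// Second trace: the protocol name is confirmed before the request data is sent.
const NEGOTIATED: [Dir; 4] = [Dir::In, Dir::Out, Dir::In, Dir::Out];

/// Counts how often the side that receives `receives` messages has to wait for
/// the other side, i.e. how many received messages directly follow a sent one.
fn round_trips_waited(trace: &[Dir], receives: Dir) -> usize {
    trace
        .windows(2)
        .filter(|w| w[0] != receives && w[1] == receives)
        .count()
}
```

Under this model the requester (which receives the `>` messages) waits one round-trip instead of two, and the serving side (which receives the `<` messages) waits none instead of one once the first message has arrived.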

romanb commented Nov 27, 2020

Oh, I guess it is because other nodes don't use this branch yet that this is unexpected.

mxinden commented Nov 27, 2020

> The number of failed connection attempts has increased, maybe because of the legacy Noise handshake. It is unclear whether this is caused by older nodes.

Following up on the increased dialing attempt errors, I am not sure this is related to this pull request.

Again comparing kusama-sentry-ew4-1 (running this pull request) with canary-sentry-ew4-0 (running recent master), both show an increase in dialing attempt errors.

image

Query: sum(rate(polkadot_sub_libp2p_pending_connections_errors_total{instance=~"kusama-sentry-ew4-1.*|canary-sentry-ew4-0.*"}[$__interval])) by (instance)

Looking at the dialing attempt errors of the canary node over the last 30 days, this issue seems to have been introduced earlier this month around the 19th of November:

image

Query: sum(rate(polkadot_sub_libp2p_pending_connections_errors_total{instance=~"canary-sentry-ew4-0.*"}[$__interval])) by (instance)

tomaka commented Nov 27, 2020

Let's merge this PR, then?

tomaka commented Nov 27, 2020

bot merge

ghost commented Nov 27, 2020

Trying merge.

ghost merged commit 6c0cd2a into master Nov 27, 2020
ghost deleted the libp2p-0.31 branch November 27, 2020 14:29
clearloop added a commit to patractlabs/substrate that referenced this pull request Dec 1, 2020
* Upgrade to libp2p-0.31. (paritytech#7606)

* Upgrade to libp2p-0.31.

* Address line width.

* Add generous incoming connection limit.

* Remove old noise configuration.

darkfriend77 pushed a commit to mogwaicoin/substrate that referenced this pull request Jan 11, 2021
* Upgrade to libp2p-0.31.

* Address line width.

* Add generous incoming connection limit.

* Remove old noise configuration.
This pull request was closed.
Linked issue that merging this pull request may close: Addresses passed with --public-addr get purged after a while
4 participants