Skip to content
This repository has been archived by the owner on Nov 15, 2023. It is now read-only.

Batch signature verification #5023

Merged
merged 90 commits into from
Apr 16, 2020
Merged

Batch signature verification #5023

merged 90 commits into from
Apr 16, 2020

Conversation

NikVolf
Copy link
Contributor

@NikVolf NikVolf commented Feb 22, 2020

polkadot companion: paritytech/polkadot#939

@NikVolf NikVolf added the A3-in_progress Pull request is in progress. No review needed at this stage. label Feb 22, 2020
@NikVolf
Copy link
Contributor Author

NikVolf commented Feb 22, 2020

/bench

@parity-benchapp
Copy link

parity-benchapp bot commented Feb 22, 2020

Finished benchmark for branch: nv-dynamic-extensions

MASTER RESULT

import block/Wasm time:
[58.580 ms 58.687 ms 58.795 ms]

import block/Native time:
[20.870 ms 20.958 ms 21.057 ms]

BRANCH RESULT

import block/Wasm time:
[54.505 ms 54.596 ms 54.683 ms]
change:
[-7.3724% -7.0200% -6.6699%] (p = 0.00 < 0.05)
Performance has improved.

import block/Native time:
[18.150 ms 18.226 ms 18.298 ms]
change:
[-13.665% -12.992% -12.310%] (p = 0.00 < 0.05)
Performance has improved.

@burdges
Copy link

burdges commented Feb 23, 2020

We should put some benchmarks in schnorrkel for this so that we have a comparison. Can I ask @demimarie-parity to fix the existing ones?

@burdges
Copy link

burdges commented Feb 23, 2020

As discussed in dalek-cryptography/ed25519-dalek#117 we seemingly need deterministic batch verification for ed25519 since you can craft invalid blocks that's invalid for only some validators using its current form. We avoid this problem in schnorrkel by avoiding the cofactor. Yay Decaf!

@paritytech paritytech deleted a comment from parity-benchapp bot Feb 23, 2020
@parity-benchapp
Copy link

parity-benchapp bot commented Feb 23, 2020

Finished benchmark for branch: nv-dynamic-extensions

MASTER RESULT

import block/Wasm time:
[58.193 ms 58.269 ms 58.339 ms]

import block/Native time:
[20.671 ms 20.762 ms 20.845 ms]

BRANCH RESULT

import block/Wasm time:
[54.525 ms 54.595 ms 54.668 ms]
change:
[-6.4276% -6.1534% -5.8591%] (p = 0.00 < 0.05)
Performance has improved.

import block/Native time:
[18.018 ms 18.118 ms 18.227 ms]
change:
[-14.066% -13.136% -12.297%] (p = 0.00 < 0.05)
Performance has improved.

@NikVolf
Copy link
Contributor Author

NikVolf commented Feb 23, 2020

NikVolf deleted a comment from parity-benchapp

that was bench before ab94831 where batching was effectively activated

@bkchr
Copy link
Member

bkchr commented Feb 25, 2020

Just for testing, I think it would be nice to see a version that "dispatches" the verification to a background thread. The background thread could just verify signature by signature. However, as this is done in parallel to the execution of the transaction. We may can get even a better performance than blocking at the end for batch verify.

@NikVolf
Copy link
Contributor Author

NikVolf commented Feb 25, 2020

Just for testing, I think it would be nice to see a version that "dispatches" the verification to a background thread. The background thread could just verify signature by signature. However, as this is done in parallel to the execution of the transaction. We may can get even a better performance than blocking at the end for batch verify.

For sr25519/ed25519 part, I'm thinking of maybe even better approach - figure out optimal batch size (it is variable afaik, so 100 tx can be more cheap to batch-verify per-tx than 200, for example - @burdges ?), and then send batch of optimal size to the background thread once it is reached. I think it will be more robust on low-tier machines and will cost less computing power.

For ECDSA yes, I will add what you are proposing

@bkchr
Copy link
Member

bkchr commented Feb 25, 2020

What happens when we don't reach this batch size? I think the batch size should be much smaller to make sure we really dispatch it multiple times.

@NikVolf
Copy link
Contributor Author

NikVolf commented Feb 25, 2020

What happens when we don't reach this batch size? I think the batch size should be much smaller to make sure we really dispatch it multiple times.

It will just block-verify with suboptimal batch size at the end. It will be suboptimal in terms of time than doing everything at the background immediately, but it means that block are very far from full, and we don't need to waste computing power to try to import them as fast as possible?

@bkchr
Copy link
Member

bkchr commented Feb 25, 2020

The whole work isn't required when we talk about the current, almost empty blocks :D This optimizations will just be really useful for full blocks.

In general I just wanted to know what would be the difference when we import in the background while we execute the transactions. Maybe it is total useless and don't bring that much.

@NikVolf
Copy link
Contributor Author

NikVolf commented Feb 25, 2020

The whole work isn't required when we talk about the current, almost empty blocks :D This optimizations will just be really useful for full blocks.

This is my point also, yes :)

In general I just wanted to know what would be the difference when we import in the background while we execute the transactions. Maybe it is total useless and don't bring that much.

I will use your approach for ed25519 (it's batching is buggy anyway) and make a benchmarking run just for that. I don't think it is useless, I think theoretical answer to that that it will effectively remove signature verification from the main pipeline and give time performance boost of ~20% instead of current 13%.

@Demi-Marie
Copy link
Contributor

I filed w3f/schnorrkel#54 to fix the benchmarks @burdges.

// execute extrinsics
let (header, extrinsics) = block.deconstruct();
Self::execute_extrinsics_with_book_keeping(extrinsics, *header.number());

if !sp_io::crypto::batch_verify() {
panic!("Signature verification failed.")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This really needs better justification. Obviously, this panic is reachable, so there needs to be an explanation for why panicking here is justified.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should panic in runtime AFAIK

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code looks very much strange comparing to the rest of the runtime, but if I have understood correctly ant this is only being used for block execution (not validation and production) then if we are trying to import a block with even a single invalid signature, then yes panic it is.

@@ -106,10 +117,24 @@ impl Extensions {
self.extensions.insert(ext.type_id(), Box::new(ext));
}

/// Register extension `ext`.
pub fn register_with_type_id(&mut self, type_id: TypeId, extension: Box<dyn Extension>) -> Result<(), Error> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should not have a separate type_id parameter. Instead, we should make Extension a subtrait of Any.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you provide an example?

primitives/io/src/batch_verifier.rs Outdated Show resolved Hide resolved
primitives/io/src/lib.rs Outdated Show resolved Hide resolved
@NikVolf NikVolf force-pushed the nv-dynamic-extensions branch 3 times, most recently from 78525b3 to 10f7356 Compare February 27, 2020 13:11
@paritytech paritytech deleted a comment from parity-benchapp bot Feb 27, 2020
@paritytech paritytech deleted a comment from parity-benchapp bot Feb 27, 2020
@burdges
Copy link

burdges commented Feb 28, 2020

An auditor noticed nodes gossiping bad transactions consumes too much CPU. Among the solutions is rate limiting gossiped transactions per gossip source. If we know the gossip source, and can assume the source checked all signatures, then we can batch verify transaction signatures gossiped from the same source. We require the Ed25519 batch_deterministic feature for this too.

We could benefit even more with other gossip messages signed by different nodes. We might want a more general infrastructure for this, ala https://www.zfnd.org/blog/futures-batch-verification/

@bkchr
Copy link
Member

bkchr commented Feb 28, 2020

An auditor noticed nodes gossiping bad transactions consumes too much CPU. Among the solutions is rate limiting gossiped transactions per gossip source. If we know the gossip source, and can assume the source checked all signatures, then we can batch verify transaction signatures gossiped from the same source. We require the Ed25519 batch_deterministic feature for this too.

We could benefit even more with other gossip messages signed by different nodes. We might want a more general infrastructure for this, ala https://www.zfnd.org/blog/futures-batch-verification/

I'm not sure about the current implementation, but we should use the same reputation system as it is used for sync. If a node sends a transaction with an invalid signature, it should be instantly disconnected.

@NikVolf
Copy link
Contributor Author

NikVolf commented Feb 28, 2020

An auditor noticed nodes gossiping bad transactions consumes too much CPU.

Can I read this auditor report?

@NikVolf
Copy link
Contributor Author

NikVolf commented Feb 28, 2020

/bench ed25519

@parity-benchapp
Copy link

parity-benchapp bot commented Feb 28, 2020

Finished benchmark for branch: nv-dynamic-extensions

Master: Wasm: 62.26 ms (+/- 0.14 ms)
Master: Native: 22.85 ms (+/- 0.11 ms)
Branch: Wasm: 56.60 ms (+/- 0.13 ms)
Branch: Native: 18.22 ms (+/- 0.08 ms)
Change: Wasm: -9.09%
Change: Native: -20.28%

@NikVolf
Copy link
Contributor Author

NikVolf commented Feb 28, 2020

Finished benchmark for branch: nv-dynamic-extensions

Master: Wasm: 62.26 ms (+/- 0.14 ms)
Master: Native: 22.85 ms (+/- 0.11 ms)
Branch: Wasm: 56.60 ms (+/- 0.13 ms)
Branch: Native: 18.22 ms (+/- 0.08 ms)
Change: Wasm: -9.09%
Change: Native: -20.28%

as predicted @bkchr 😅

@NikVolf
Copy link
Contributor Author

NikVolf commented Feb 28, 2020

/bench import

@parity-benchapp
Copy link

parity-benchapp bot commented Feb 28, 2020

Finished benchmark for branch: nv-dynamic-extensions

Benchmark: Import Benchmark (random transfers)

Master: Wasm: 61.76 ms (+/- 0.07 ms)
Master: Native: 23.48 ms (+/- 0.08 ms)
Branch: Wasm: 57.09 ms (+/- 0.17 ms)
Branch: Native: 20.08 ms (+/- 0.08 ms)
Change: Wasm: -7.56%
Change: Native: -14.48%

@gnunicorn gnunicorn assigned bkchr and NikVolf and unassigned bkchr Apr 14, 2020
Copy link
Member

@bkchr bkchr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some last nitpicks and than we are good to go ;)

primitives/externalities/src/extensions.rs Outdated Show resolved Hide resolved
primitives/io/src/batch_verifier.rs Outdated Show resolved Hide resolved
primitives/io/src/lib.rs Outdated Show resolved Hide resolved
primitives/io/src/lib.rs Outdated Show resolved Hide resolved
/// Returns `true` if all signatures are correct, `false` otherwise.
#[cfg(feature = "std")]
pub fn verify_batch(
messages: Vec<&[u8]>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we make these 3 parameters impl Iterator<&[u8]> etc, we could prevent some allocations.

Copy link
Contributor Author

@NikVolf NikVolf Apr 15, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but schnorrkel expects slice and IntoIterator :(

https://docs.rs/schnorrkel/0.9.1/schnorrkel/fn.verify_batch.html

NikVolf and others added 3 commits April 15, 2020 09:25
Co-Authored-By: Bastian Köcher <bkchr@users.noreply.github.com>
Co-Authored-By: Bastian Köcher <bkchr@users.noreply.github.com>
primitives/io/src/lib.rs Outdated Show resolved Hide resolved
primitives/io/src/lib.rs Outdated Show resolved Hide resolved
NikVolf and others added 4 commits April 15, 2020 13:14
Co-Authored-By: Bastian Köcher <bkchr@users.noreply.github.com>
Co-Authored-By: Bastian Köcher <bkchr@users.noreply.github.com>
@tomusdrw tomusdrw removed their request for review April 15, 2020 10:35
# Conflicts:
#	bin/node/runtime/src/lib.rs
#	frame/executive/Cargo.toml
@bkchr bkchr merged commit d615043 into master Apr 16, 2020
@bkchr bkchr deleted the nv-dynamic-extensions branch April 16, 2020 19:40
NikVolf added a commit that referenced this pull request Apr 17, 2020
* create parallel tasks extension

* make type system happy

* basic externalities

* test for dynamic extensions

* batching test

* remove premature verify_batch

* shnschnorrkel batch

* alter test

* shnschnorrkel test

* executive batching

* some docs

* also multi/any signatgures

* error propagation

* styling

* make verification extension optional

* experimental ed25519 parallelization

* some merge fallout

* utilize task executor

* merge fallout

* utilize task executor more

* another merge fallout

* feature-gate sp-io

* arrange toml

* fix no-std

* sr25519 batching and refactoring

* add docs

* fix name

* add newline

* fix block import test

* long sr25519 test

* blocking instead of parking

* move everything in crypto

* return batch_verify to check :)

* use condvars

* use multi-threaded executor for benches

* don't call via host interface

* try no spawning

* add true

* cleanup

* straighten batching

* remove signature check from this test (?)

* remove now pointless test

* remove another now useless test

* fix warnings

* Revert "remove another now useless test"

This reverts commit bbdec24.

* rethink the sp-io-part

* Revert "remove now pointless test"

This reverts commit 4d55306.

* fix wording

* add  wording

* add todo and fix

* return check and fix

* add logging in sp-io

* Update primitives/io/src/batch_verifier.rs

Co-Authored-By: cheme <emericchevalier.pro@gmail.com>

* address review and use std condvar

* account for early exit

* address reivew

* address review

* more suggestions

* add docs for batch verification

* remove unused

* more review suggestions

* move to sp-runtime

* add expects

* remove blocks

* use entry

* Update primitives/io/src/batch_verifier.rs

Co-Authored-By: Bastian Köcher <bkchr@users.noreply.github.com>

* Update primitives/externalities/src/extensions.rs

Co-Authored-By: Bastian Köcher <bkchr@users.noreply.github.com>

* update overlooked note

* remove stupid return

* Update primitives/io/src/lib.rs

Co-Authored-By: Bastian Köcher <bkchr@users.noreply.github.com>

* Update primitives/io/src/lib.rs

Co-Authored-By: Bastian Köcher <bkchr@users.noreply.github.com>

* fix wording

* bump spec_version

Co-authored-by: cheme <emericchevalier.pro@gmail.com>
Co-authored-by: Bastian Köcher <bkchr@users.noreply.github.com>
# Conflicts:
#	bin/node/runtime/src/lib.rs
#	frame/executive/Cargo.toml
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
A0-please_review Pull request needs code review.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants