Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rework decoding of Boxes, Rcs, Arcs, arrays and enums (stack overflow fix) #426

Merged
merged 38 commits into from
Jun 15, 2023

Conversation

koute
Copy link
Contributor

@koute koute commented Apr 20, 2023

This PR reworks how we decode Boxes etc. and arrays; in particular it fixes three issues:

  • Trying to deserialize a Box which contains a big array doesn't overflow the stack anymore. This is fixed by first preallocating an empty Box and then directly deserializing into it, instead of first deserializing the value on the stack and moving it into the Box after it's deserialized.
  • Trying to deserialize big nested enums doesn't overflow the stack anymore. AFAIK this is due to an upstream bug in rustc (Match expressions use O(n) stack space with n branches in debug mode rust-lang/rust#34283) and one workaround which seems to work is to just wrap the body of each match with a lambda and immediately call it and return the value. I've also added a test for this issue.
  • Elements of partially read arrays are now properly dropped if the whole array can't be decoded.

I've also updated Cargo.lock and updated to criterion 0.4 while I was at it.

Fixes #419

Fixes #425

@gilescope gilescope requested a review from jsdw April 20, 2023 08:36
src/codec.rs Outdated Show resolved Hide resolved
Co-authored-by: Squirrel <giles@parity.io>
tests/mod.rs Show resolved Hide resolved
src/codec.rs Outdated Show resolved Hide resolved
src/codec.rs Outdated Show resolved Hide resolved
src/codec.rs Show resolved Hide resolved
@michalkucharczyk michalkucharczyk dismissed their stale review April 21, 2023 11:11

I'd prefer someone else to approve.

@koute
Copy link
Contributor Author

koute commented May 8, 2023

I've resolved conflicts, bumped version from 3.5.0 to 3.6.0 (since 3.5.0 was released in the meantime) and cleaned up the README.

I've also cleaned up our feature flags semantics:

  • The max-encoded-len now doesn't automatically enable the derive macro reexports nor pull in the parity-scale-codec-derive dependency.
  • Enabling the parity-scale-codec-derive without enabling the derive feature doesn't reexport the macros anymore; derive is now necessary.

These two issues are why @ggwpez made #429; now with these changes that PR should be not necessary anymore. (: (The issue was that the derive macros were accidentally imported twice, not that the derive macro and the trait were imported with the same name - that compiles just fine, so importing the derive macros and the traits under different names is not necessary.)

These are technically breaking changes, but max-encoded-len is explicitly documented an unstable, and I'd argue the derive macro reexports working without the derive feature enabled was a bug (everyone should have been enabling the derive feature), so I didn't bump the major version.

Should be good to go.

Copy link
Member

@ggwpez ggwpez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These two issues are why @ggwpez made #429; now with these changes that PR should be not necessary anymore. (: (The issue was that the derive macros were accidentally imported twice, not that the derive macro and the trait were imported with the same name - that compiles just fine, so importing the derive macros and the traits under different names is not necessary.)

Could you point to the line of how this is fixed now?
AFAIK it was importing scale::Decode (macro + trait) and scale_derive::Decode (macro).
So yes, the macro was imported twice, but without the derive feature the macro was not available for import from scale.

derive/src/decode.rs Show resolved Hide resolved
@koute
Copy link
Contributor Author

koute commented May 8, 2023

Could you point to the line of how this is fixed now?

  • The use parity_scale_codec_derive::{Encode, Decode}; was feature gated on whether the derive feature is enabled.
  • The reexport of Encode and Decode was was feature gated on whether the parity-scale-codec-derive feature is enabled.
  • Enabling the max-encoded-len feature also implicitly enabled parity-scale-codec-derive.

So basically the use parity_scale_codec_derive::{Encode, Decode}; was enabled because it thought that the derive reexports were not enabled (because the derive feature was not enabled), but the reexports weren't actually controlled by the derive feature but by the parity-scale-codec-derive feature, and that was transitively enabled by the max-encoded-len feature, so we ended up with the derive macro imported twice: once explicitly with use and once through the parity-scale-codec reexport (where it overlaps with the traits, so importing the trait also automatically imports the macro).

src/codec.rs Outdated Show resolved Hide resolved
src/codec.rs Show resolved Hide resolved
src/codec.rs Outdated Show resolved Hide resolved
src/codec.rs Outdated Show resolved Hide resolved
let ptr: *mut u8 = ptr.cast();
let bytesize = core::mem::size_of::<T>().checked_mul(N).expect("array lengths are sane");

// TODO: This is potentially slow; it'd be better if `Input` supported
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would love to see specialization here for primitive types using [0; N] on the stack:

use num::{Zero, zero};

fn x<T: Zero + Copy, const N: usize>() -> [T; N] {
    let z = zero::<T>();
    [z; N]
}

So you don't have to init MaybeUninit::<u8; N> with zeroes manually.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, sorry, I'm not sure I understand what you're suggesting here. What exactly would this change? (: They'd have to be zeroed anyway as long as Input accepts a &[u8] instead of &mut [MaybeUninit<u8>].

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They'd have to be zeroed anyway

Yes, but they will be zeroed by compiler and there will be no need for an unsafe code to memzero them.

The previous implementation used that trick for i8/u8, see let mut array: [u8; N] = [0; N];. Still this comment is not a blocker, you can ignore it/implement in another PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I see what you mean now!

Hmm... it which case it'd probably be better to just make it possible to read into the array without initializing it. (:

src/codec.rs Show resolved Hide resolved
src/codec.rs Show resolved Hide resolved
// and we only drop at most `count` elements here,
// so all of the elements we drop here are valid.
unsafe {
item.assume_init_drop();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's not stable, but I'll add a TODO comment.

Copy link
Member

@bkchr bkchr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice implementation, I like :D

Left some minor comments on things that should be improved. Especially the story around DecodeContext and decode_into is not really well explained. Currrently this isn't callable by someone outside of the crate, which is fine, but should be left some reason why.

It would also be nice to create some issue for the assume_init stuff and the best would be to directly link to the Rust issues.

Cargo.toml Outdated Show resolved Hide resolved
Comment on lines +623 to +624
// TODO: This is inefficient; use `Rc::new_uninit` once that's stable.
Box::<T>::decode(input).map(|output| output.into())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we not just put the code for the Box implementation into some macro and reuse it here? 🙈

Copy link
Contributor Author

@koute koute Jun 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we not just put the code for the Box implementation into some macro and reuse it here? see_no_evil

That'd be tricky because without ::new_uninit we have no good way to preallocate the uninitialized memory for the Rc/Arc.

In case of a Box<T> the only thing that the memory holds is a T, so we can get its layout and manually allocate the memory.

In case of Rc/Arc there are also the refcounts allocated in the same chunk of memory, and AFAIK there's no good way to get a Layout that could be used to manually preallocate that memory and pass it to Rc::from_raw. One way we could do it would be to just hardcode it (Rc<T> is basically equivalent Box<(usize, usize, T)>), and then verify in a build script that it's correct (in case it suddenly changes, since it's an internal implementation detail), and if it is then do it the fast way, and if it isn't then emit a compile-time warning and do it the slow way. So for all practical purposes that'd work and be safe, but obviously it is very hacky and I don't really like it. And I'm not convinced it's worth it. (As opposed to just waiting waiting until the new_uninit methods are stabilized.)

src/codec.rs Outdated Show resolved Hide resolved
src/codec.rs Outdated
///
/// This is enforced by requiring the implementation to return a [`DecodeFinished`]
/// which can only be created by calling [`DecodeContext::assert_decoding_finished`] which is `unsafe`.
fn decode_into<I: Input>(ctx: DecodeContext, input: &mut I, dst: &mut MaybeUninit<Self>) -> Result<DecodeFinished, Error> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this really some public api? DecodeContext can not be constructed outside of this crate. If this should not be used by someone external, we should put a doc(hide) above it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh, you may want people that they can implement it, but not call it themselves?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh, you may want people that they can implement it, but not call it themselves?

It wasn't entirely a deliberate "let's block it", but with no immediate use case in mind I just didn't make it publicly constructible.

Anyhow, it's a moot point now though as I nuked the DecodeContext type anyway. :P

src/codec.rs Outdated Show resolved Hide resolved
src/codec.rs Show resolved Hide resolved
src/codec.rs Outdated Show resolved Hide resolved
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: done
Development

Successfully merging this pull request may close these issues.

Stack overflow issue in scalecodec while decoding inputs Stack overflow when decoding large boxed arrays
8 participants