Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

replace RawVec with Boxed slice #94421

Closed
wants to merge 14 commits into from

Conversation

conradludgate
Copy link
Contributor

@conradludgate conradludgate commented Feb 27, 2022

Issue: #54470

On pause while we figure out aliasing rules for Box. Currently Box is noalias which holds for the contents. This breaks some of the unsafe tests we held on Vec (although we technically haven't promised them to always be upheld)

@rust-log-analyzer

This comment has been minimized.

@rust-log-analyzer

This comment has been minimized.

@rust-log-analyzer

This comment has been minimized.

@rust-log-analyzer

This comment has been minimized.

@rust-log-analyzer

This comment has been minimized.

@rust-log-analyzer

This comment has been minimized.

@rust-log-analyzer

This comment has been minimized.

@bors
Copy link
Contributor

bors commented Apr 6, 2022

☔ The latest upstream changes (presumably #95711) made this pull request unmergeable. Please resolve the merge conflicts.

@bors bors added the S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. label Apr 6, 2022
@conradludgate conradludgate force-pushed the vec-box branch 2 times, most recently from 0962b9c to d01e58a Compare April 9, 2022 07:29
@rust-log-analyzer

This comment has been minimized.

@rust-log-analyzer

This comment has been minimized.

@rust-log-analyzer

This comment has been minimized.

@rust-log-analyzer

This comment has been minimized.

@rust-log-analyzer

This comment has been minimized.

@rust-log-analyzer

This comment has been minimized.

@rust-log-analyzer

This comment has been minimized.

@rust-log-analyzer

This comment has been minimized.

@conradludgate conradludgate marked this pull request as ready for review April 11, 2022 08:10
@bors
Copy link
Contributor

bors commented Apr 11, 2022

☔ The latest upstream changes (presumably #95125) made this pull request unmergeable. Please resolve the merge conflicts.

@rustbot rustbot added the T-libs Relevant to the library team, which will review and decide on the PR/issue. label Apr 30, 2022
@conradludgate conradludgate changed the title [wip] replace RawVec with Boxed slice replace RawVec with Boxed slice Apr 30, 2022
@5225225
Copy link
Contributor

5225225 commented Apr 30, 2022

Looks like miri complains about stacked borrows stuff when running https://github.com/rust-lang/miri-test-libstd on this.

I guess using a box here allows miri to see more about what the code is doing?

Error is:

# MIRI_LIB_SRC=~/src/rust/library ./run-test.sh alloc --all-targets
A libstd for Miri is now available in `/home/jess/.cache/miri/HOST`.
error: Undefined Behavior: trying to reborrow <4385> for Unique permission at alloc6094[0x0], but that tag does not exist in the borrow stack for this location
    --> /home/jess/src/rust/library/alloc/src/vec/mod.rs:2649:24
     |
2649 |             IntoIter { buf, phantom: PhantomData, ptr: begin, end }
     |                        ^^^
     |                        |
     |                        trying to reborrow <4385> for Unique permission at alloc6094[0x0], but that tag does not exist in the borrow stack for this location
     |                        this error occurs as part of a reborrow at alloc6094[0x0..0x18]
     |
     = help: this indicates a potential bug in the program: it performed an invalid operation, but the rules it violated are still experimental
     = help: see https://github.com/rust-lang/unsafe-code-guidelines/blob/master/wip/stacked-borrows.md for further information
             
     = note: inside `<std::vec::Vec<std::ffi::OsString> as core::iter::IntoIterator>::into_iter` at /home/jess/src/rust/library/alloc/src/vec/mod.rs:2649:24
     = note: inside `std::sys::unix::args::imp::args` at /home/jess/src/rust/library/std/src/sys/unix/args.rs:128:22
     = note: inside `std::sys::unix::args::args` at /home/jess/src/rust/library/std/src/sys/unix/args.rs:19:5
     = note: inside `std::env::args_os` at /home/jess/src/rust/library/std/src/env.rs:802:21
     = note: inside `std::env::args` at /home/jess/src/rust/library/std/src/env.rs:767:19
     = note: inside `test::test_main_static` at /home/jess/src/rust/library/test/src/lib.rs:131:16
     = note: inside `main`

I can't immediately see why that's considered to be wrong. And, if it's wrong in a way we care about in the original code, and using Box here just makes miri see what's up.

Going to ping @ RalfJung here Didn't realise you were on vacation, sorry!

@RalfJung
Copy link
Member

Miri treats Box special, but not Unique. So yeah if there are latent aliasing violations in RawVec, this PR would uncover them.

Whether this should be a bug or should be allowed, I don't know -- that would require investigating in depth what the code is doing here that leads to an error.

@RalfJung
Copy link
Member

Also see rust-lang/unsafe-code-guidelines#326

@saethlin
Copy link
Member

I originally posted this on the UCG issue, but it probably belongs here.

This PR needs a lot of careful attention, and probably a better way to run Miri on it (running miri-test-libstd on a local checkout and manually dumping the cache is the best I could come up with, and that's quite an arduous procedure). For example, this diff trips right over the issue that the comment above it says to avoid:

    pub fn as_mut_ptr(&mut self) -> *mut T {
        // We shadow the slice method of the same name to avoid going through
        // `deref_mut`, which creates an intermediate reference.
-        let ptr = self.buf.ptr();
+        let ptr = self.buf.as_mut_ptr().cast::<T>();
        unsafe {
            assume(!ptr.is_null());
        }

There's also an issue where the extend impl when passed a slice produces an uninitialized Box. I started looking into that issue, but it's specialization soup down there and I don't have patience to dig through that at the moment.

But I agree broadly with Ralf's assessment that this probably reveals some pretty deep aliasing problems, though getting to them might require a lot of patient iteration. The issues I've mentioned above are detected inside the Rust runtime, which is not a great place for debugging because Miri doesn't have spans.

@conradludgate
Copy link
Contributor Author

For the record, this PR is more to scratch a curiosity of mine, rather than actually being implemented.

Although I would definitely prefer this code over RawVec. IMO the abstraction makes more sense. I would prefer explaining that Vec<T> is a boxed slice allocation tracking how many values are initialised, vs Vec<T> is just a RawVec<T> (which no one outside of alloc would really be expected to know about). The box solution makes vec feel less magic.

It also helps explain away how vec![] works by doing (box [..]).into_vec(). Again, mental jumps to make

@bors
Copy link
Contributor

bors commented May 4, 2022

☔ The latest upstream changes (presumably #94775) made this pull request unmergeable. Please resolve the merge conflicts.

@eddyb
Copy link
Member

eddyb commented May 10, 2022

This PR would fix #54470, can you add that to the PR description? Thanks for working on this!

@scottmcm
Copy link
Member

FWIW, I'd also really like to see this. Using Box<[MaybeUninit<T>], A> really makes it clear what's responsible for what. (As just came up on URLO recently.) And having things like Vec::deref use just MaybeUninit::slice_assum_init_ref also keeps it from needing to talk about things that are irrelevant to it -- right now it's using from_raw_parts (without a SAFETY comment) which has way more preconditions than Vec is really responsible for maintaining.

Vec::deref getting to be just slice_assum_init_ref(self.buffer.get_unchecked(..len)) is beautiful -- one call for each of the two things that are Vec's responsibility. (And nothing about "is the pointer valid" part.)

@saethlin
Copy link
Member

I'm not sure @conradludgate is interested in driving this over the finish line, but I might be. I'm soft-blocked on the aforementioned ICE but I might be able to keep making progress by just removing non-default allocator tests for now. I just don't want to accumulate too much discussion here or apply pressure to someone who isn't really invested in this work.

@conradludgate
Copy link
Contributor Author

I'm happy to help drive it! But the soundness issues with stacked borrows and aliasing are very much out of my area of expertise

@saethlin
Copy link
Member

saethlin commented May 11, 2022

My branch of this work now passes Miri with -Zmiri-tag-raw-pointers. I fixed about 3 soundness issues in the new implementation, and ~12 separate pieces of code in the standard library that this turns into UB. I had to comment out 2 tests which use Box with a custom allocator.

As far as I can tell the test which asserts that Vec provides pointer stability is incompatible with this implementation. It's unclear to me if this PR is supposed to remove pointer stability from Vec or not.

Now I think I'm going to try running some of the most-downloaded crates with this standard library to see how much ecosystem code turns into UB.

@scottmcm
Copy link
Member

As far as I can tell the test which asserts that Vec provides pointer stability is incompatible with this implementation. It's unclear to me if this PR is supposed to remove pointer stability from Vec or not.

Can you elaborate what "pointer stability" means here? Maybe it would make sense to open a PR (draft, if you want) with your changes so it could be discussed there?

@saethlin
Copy link
Member

Draft PR: #96607

Can you elaborate what "pointer stability" means here?

This program becomes UB:

fn main() {
    let mut v = Vec::with_capacity(2);
    v.push(0);
    let ptr = &v[0] as *const i32;
    v.push(0);
    unsafe { dbg!(*ptr) };
}

Miri reports:

$ MIRIFLAGS=-Zmiri-tag-raw-pointers MIRI_LIB_SRC=/home/ben/rust/library cargo +miri miri run
    Finished dev [unoptimized + debuginfo] target(s) in 0.01s
     Running `/home/ben/.cargo/bin/cargo-miri target/miri/x86_64-unknown-linux-gnu/debug/scratch`
error: Undefined Behavior: attempting a read access using <3294> at alloc1659[0x0], but that tag does not exist in the borrow stack for this location
 --> src/main.rs:6:14
  |
6 |     unsafe { dbg!(*ptr) };
  |              ^^^^^^^^^^
  |              |
  |              attempting a read access using <3294> at alloc1659[0x0], but that tag does not exist in the borrow stack for this location
  |              this error occurs as part of an access at alloc1659[0x0..0x4]
  |
  = help: this indicates a potential bug in the program: it performed an invalid operation, but the rules it violated are still experimental
  = help: see https://github.com/rust-lang/unsafe-code-guidelines/blob/master/wip/stacked-borrows.md for further information
help: <3294> was created by a retag at offsets [0x0..0x4]
 --> src/main.rs:4:15
  |
4 |     let ptr = &v[0] as *const i32;
  |               ^^^^^
help: <3294> was later invalidated at offsets [0x0..0x8]
 --> src/main.rs:5:5
  |
5 |     v.push(0);
  |     ^^^^^^^^^
  = note: inside `main` at /home/ben/rust/library/std/src/macros.rs:311:13
  = note: this error originates in the macro `dbg` (in Nightly builds, run with -Z macro-backtrace for more info)

note: some details are omitted, run with `MIRIFLAGS=-Zmiri-backtrace=full` for a verbose backtrace

error: aborting due to previous error

@koehlma
Copy link
Contributor

koehlma commented May 18, 2022

Sorry for chiming in. I certainly lack the expertise to understand the intricacies of the ongoing debate but @SabrinaJewson just opened an issue for one of my crates which would be affected by this PR. I just wanted to let you know that I was under the impression that Vec explicitly guaranteed stable references. In particular, the statement

push and insert will never (re)allocate if the reported capacity is sufficient.

of the Guarantees section with the previous description of the memory layout makes it seem as if references are guaranteed to be stable when pushing onto a Vec whose length is less than its capacity. At least, this has been my reasoning at the time of implementing the crate. There is also another statement about converting the Vec into a Box without reallocating and “moving” the elements. After reading this PR, it is unclear to me whether these guarantees become void with this PR or whether my understanding of them has been wrong all the time. In the latter case, maybe the documentation could be improved to clarify that stability is explicitly not a guarantee but rather is/was an implementation detail?

@RalfJung
Copy link
Member

@koehlma thanks for your comment! This is valuable feedback. :)
Could I ask you to move this discussion to #54470 instead? PR discussions tend to get mixed with implementation details and are easily lost; your concern is more general than that though.

@JohnCSimon
Copy link
Member

ping from triage:
@conradludgate
can you please address the merge issues and post your status on this PR?

FYI: when a PR is ready for review, send a message containing
@rustbot ready to switch to S-waiting-on-review so the PR is in the reviewer's backlog.

@saethlin
Copy link
Member

The community Discord found a rather diabolical solution for the pointer invalidation issue:

    pub fn as_mut_ptr(&mut self) -> *mut T {
        // We shadow the slice method of the same name to avoid going through
        // `deref_mut`, which creates an intermediate reference.
        unsafe {
            let ptr: &*mut [T] = core::mem::transmute(&self.buf);
            let ptr = *ptr as *mut T;
            assume(!ptr.is_null());
            ptr
        }
    }

Using this implementation technique, I can make the test_stable_pointers test I mentioned above pass Miri.

@saethlin
Copy link
Member

Predictably, this just gets Miri past one test and runs into at least one more issue. Don't get too excited just yet...

@JohnCSimon
Copy link
Member

@conradludgate
ping from triage - can you post your status on this PR? Thanks

@JohnCSimon
Copy link
Member

@conradludgate
Ping from triage: I'm closing this due to inactivity, Please reopen when you are ready to continue with this.
Note: if you do please open the PR BEFORE you push to it, else you won't be able to reopen - this is a quirk of github.
Thanks for your contribution.

@rustbot label: +S-inactive

@JohnCSimon JohnCSimon closed this Nov 6, 2022
@rustbot rustbot added the S-inactive Status: Inactive and waiting on the author. This is often applied to closed PRs. label Nov 6, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-inactive Status: Inactive and waiting on the author. This is often applied to closed PRs. S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. T-libs Relevant to the library team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.