Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CTFE/Miri engine Pointer type overhaul #87123

Merged
merged 16 commits into from
Jul 17, 2021

Conversation

RalfJung
Copy link
Member

@RalfJung RalfJung commented Jul 14, 2021

This fixes the long-standing problem that we are using Scalar as a type to represent pointers that might be integer values (since they point to a ZST). The main problem is that with int-to-ptr casts, there are multiple ways to represent the same pointer as a Scalar and it is unclear if "normalization" (i.e., the cast) already happened or not. This leads to ugly methods like force_mplace_ptr and force_op_ptr.
Another problem this solves is that in Miri, it would make a lot more sense to have the Pointer::offset field represent the full absolute address (instead of being relative to the AllocId). This means we can do ptr-to-int casts without access to any machine state, and it means that the overflow checks on pointer arithmetic are (finally!) accurate.

To solve this, the Pointer type is made entirely parametric over the provenance, so that we can use Pointer<AllocId> inside Scalar but use Pointer<Option<AllocId>> when accessing memory (where None represents the case that we could not figure out an AllocId; in that case the offset is an absolute address). Moreover, the Provenance trait determines if a pointer with a given provenance can be cast to an integer by simply dropping the provenance.

I hope this can be read commit-by-commit, but the first commit does the bulk of the work. It introduces some FIXMEs that are resolved later.
Fixes rust-lang/miri#841
Miri PR: rust-lang/miri#1851
r? @oli-obk

@rust-highfive
Copy link
Collaborator

Some changes occured to the CTFE / Miri engine

cc @rust-lang/miri

Some changes occured to rustc_codegen_cranelift

cc @bjorn3

Some changes occured to the CTFE / Miri engine

cc @rust-lang/miri

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jul 14, 2021
@rust-log-analyzer

This comment has been minimized.

@rust-log-analyzer

This comment has been minimized.

Comment on lines +168 to +169
// We know `offset` is relative to the allocation, so we can use `into_parts`.
let (data, start) = match ecx.scalar_to_ptr(a.check_init().unwrap()).into_parts() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could into_parts assert this flag internally? Or does miri also have a need for it?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah Miri needs to access the offset to compare between Pointer<AllocId> (address is relative) and Pointer<Tag> (address is absolute).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, I actually did think of a way to make this more type-safe: the Provenance trait could have an associated type Offset, and a function wrap_offset(Size) -> Offset. Then into_parts would return Offset, so code generic in Tag would actually not know the type Offset and this could not use it wrong.

Then we might go even further and make the Pointer::offset field be of type Offset... though at that point we cannot define generic pointer arithmetic methods any more so we'd need some more methods in the Provenance trait.

But I am not sure if that's worth it. I had one bug caused by mis-interpreting a relative offset as absolute, and this would not have caught it (since the bug was about pointers stored inside an Allocation, where the offset is part of the byte array anyway).

// so ptr-to-int casts are not possible (since we do not know the global physical offset).
const OFFSET_IS_ADDR: bool = false;

fn fmt(ptr: &Pointer<Self>, f: &mut fmt::Formatter<'_>) -> fmt::Result {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is there a custom formatting function? Couldn't this be an impl on Pointer<AllocId> directly, allowing all aggregates containing Pointer to derive their Debug impls?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see how. The Tag type affects how the Pointer itself is rendered:

  • Pointer<AllocId>: alloc123+0x13
  • Pointer<Tag>: 0x12defg60[alloc123]<Stacked Borrows info>

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm mainly unhappy about the roundabout formatting stuff that requires one to essentially write the derived impl by hand.

Yeah I am unhappy about that as well.^^ Open to suggestions for how to do this better.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can miri implement Display for Pointer since Tag is a miri-internal type?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't that only work for "fundamental" types?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh also that would not help -- the generic code in rustc_mir::interpret needs to know for sure that it can debug-print pointers. So we need something generic.

Is it possible to add Pointer<Self::PointerTag>: Debug as a requirement to the Machine trait?^^

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to add Pointer<Self::PointerTag>: Debug as a requirement to the Machine trait?^^

Not in a useful way (you'd have to repeat the bound at every usage site of Machine).

but you could keep your fmt function and only implement Debug manually on Pointer<Tag: Provenance> and then you should be able to derive Debug everywhere else, as long as you add the Tag: Provenance bound to all the types instead of only on the impls. It's not nice, but nicer than doing manual trivial impls everywhere I think?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thinking about this again, it's not too bad, so we can merge and figure out improvements later

@oli-obk
Copy link
Contributor

oli-obk commented Jul 15, 2021

The design lgtm in general, I'm mainly unhappy about the roundabout formatting stuff that requires one to essentially write the derived impl by hand.

@rust-log-analyzer

This comment has been minimized.

@RalfJung
Copy link
Member Author

I pushed some more commits, mostly because changes were needed to get Miri to work again.

One change is rather unfortunate: I had to give the init_allocation_extra machine hook access to &Memory (and not just &MemoryExtra, and this necessitated a less optimal implementation of Memory::get_mut. With a better borrow checker we could recover the original cost for the happy case (where the allocation is in the local memory), but there'd still be extra cost for when we have to reach for the global memory. I also hat to make MonoHashMap::get in Miri less efficient to avoid RefCell panics.
The reason Miri needs &Memory is that it needs to query the size and alignment of allocations. So one possible alternative would be to make dead_alloc_map actually contain all allocations, not just the dead ones; and to give the init_allocation_extra machine hook access to that map as well. This would be a rather gross hack since Rust does not have good support for representing "a struct with some of its fields borrowed away".

An extremely coarse benchmark (time ./miri test pass) showed no regressions in Miri performance; let's make sure there is no rustc regression either:
@bors try
@rust-timer queue

@rust-log-analyzer

This comment has been minimized.

@oli-obk
Copy link
Contributor

oli-obk commented Jul 16, 2021

r=me if you want, unless you are already working on the things we discussed and want to get them into this PR

@RalfJung
Copy link
Member Author

Yeah I am already experimenting now.^^

@RalfJung
Copy link
Member Author

RalfJung commented Jul 16, 2021

This leads to an unnecessary Tag: Debug bound (because we can't control the where clause if the generated Debug, which is the entire problem of course), but I still think it's a win overall.

I did not yet do this for Scalar and ScalarMaybeUninit though, just to the higher-level types. Those two have custom Debug impls anyway.

@RalfJung
Copy link
Member Author

Ah, with that last commit this PR actually has a net negative line count. :)

I wish the derive macro would support adding extra where clauses...
@oli-obk
Copy link
Contributor

oli-obk commented Jul 17, 2021

@bors r+

@bors
Copy link
Contributor

bors commented Jul 17, 2021

📌 Commit efbee50 has been approved by oli-obk

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jul 17, 2021
@bors
Copy link
Contributor

bors commented Jul 17, 2021

⌛ Testing commit efbee50 with merge c78ebb7...

@bors
Copy link
Contributor

bors commented Jul 17, 2021

☀️ Test successful - checks-actions
Approved by: oli-obk
Pushing c78ebb7 to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Jul 17, 2021
@bors bors merged commit c78ebb7 into rust-lang:master Jul 17, 2021
@rustbot rustbot added this to the 1.55.0 milestone Jul 17, 2021
@rust-highfive
Copy link
Collaborator

📣 Toolstate changed by #87123!

Tested on commit c78ebb7.
Direct link to PR: #87123

💔 miri on windows: test-pass → build-fail (cc @eddyb @RalfJung @oli-obk).
💔 miri on linux: test-pass → build-fail (cc @eddyb @RalfJung @oli-obk).

rust-highfive added a commit to rust-lang-nursery/rust-toolstate that referenced this pull request Jul 17, 2021
Tested on commit rust-lang/rust@c78ebb7.
Direct link to PR: <rust-lang/rust#87123>

💔 miri on windows: test-pass → build-fail (cc @eddyb @RalfJung @oli-obk).
💔 miri on linux: test-pass → build-fail (cc @eddyb @RalfJung @oli-obk).
@RalfJung RalfJung deleted the miri-provenance-overhaul branch July 17, 2021 18:06
bors added a commit to rust-lang/miri that referenced this pull request Jul 17, 2021
adjust Miri to Pointer type overhaul

This is the Miri side of rust-lang/rust#87123.

This was a lot more work than I expected... lucky enough it is also (hopefully) the last large-scale refactoring I will do.^^

Fixes #224
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
merged-by-bors This PR was explicitly merged by bors. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Make force_ptr a total operation, and ptr-to-int-casts not require extra machine state
8 participants