Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Can references to uninhabited types ever be valid? #413

Open
RalfJung opened this issue Jun 6, 2023 · 59 comments
Open

Can references to uninhabited types ever be valid? #413

RalfJung opened this issue Jun 6, 2023 · 59 comments

Comments

@RalfJung
Copy link
Member

RalfJung commented Jun 6, 2023

This discussion presupposes that "Do we have full recursive validity for references" (#412) is answered "no".

Is the following UB?

let x: &! = transmute(&());

The reason we could consider making this UB is that all of the arguments in #412 do not apply: to determine whether an &! is valid, we do not have to check memory, do any potentially racy reads, or anything like that. We can just answer the question with "no".

Basically, we can think of ! having an impossible-to-satisfy alignment requirement, and therefore &! is invalid since it is never properly aligned. (I don't really want to fold this into the alignment check, but the point is that we already have the validity of &T depend on T, namely via alignment -- I see no fundamental reason why we couldn't do the same with inhabitedness.)

@digama0
Copy link

digama0 commented Jun 6, 2023

Does this issue also cover &Void and &(!, T)? The latter is problematic since I think there are situations where you can have partially uninit references (and this also partially overlaps with #346). But maybe it suffices to say that an initialized &T (i.e. a value returned from a function, like your transmute example) requires that T be inhabited, even though uninitialized or partially initialized places of type &! or &(!, T) can exist.

@RalfJung
Copy link
Member Author

RalfJung commented Jun 6, 2023

Does this issue also cover &Void and &(!, T)?

Yes.

While you can have partially initialized locals of type (!, T), I don't think you can have references to them. Do you have an example for that?

@chorman0773
Copy link
Contributor

Is there a reason why they should be invalid unless &T is invalid if the T is?

@RalfJung
Copy link
Member Author

RalfJung commented Jun 6, 2023

One motivation is being able to replace the entire body of an &self function where self is uninhabited with just unreachable_unchecked(). Back at the last Rust All hands (2019 I think?), someone expressed strong interest in that (code size) optimization.

@digama0
Copy link

digama0 commented Jun 6, 2023

One motivation is being able to replace the entire body of an &self function where self is uninhabited with just unreachable_unchecked(). Back at the last Rust All hands (2019 I think?), someone expressed strong interest in that (code size) optimization.

This has been mentioned elsewhere and I think you agreed to it, but that is not a very compelling motivation since you can just put match *self {} in the body to get the same optimization. (If I had my druthers match self {} would also work; this does not strictly speaking require a "no" answer to the OP question.)

@digama0
Copy link

digama0 commented Jun 6, 2023

While you can have partially initialized locals of type (!, T), I don't think you can have references to them. Do you have an example for that?

I don't, at least not without writing an unsafe block that would be UB under the OP question (more or less by definition). A variation that may or may not count is &Box<!>; I'm not sure whether this counts as inhabited or not, and it might actually thread the needle between competing notions of inhabitedness since you can empty-match on it and move out a ! (although maybe you want to call this a safety violation), even though by structural analysis of Box<!> you get a Unique which could just be a dangling pointer.

My main motivation for considering that case specially is that the borrow checker does allow for types like &(T, U) to have more complex initialization states, and future lang proposals might reify these as types (e.g. view types), in which case you might have a function that takes an argument of type &{only T is init} (!, T) and you would not be able to assume that this type is uninhabited.

@Diggsey
Copy link

Diggsey commented Jun 6, 2023

One motivation is being able to replace the entire body of an &self function where self is uninhabited with just unreachable_unchecked(). Back at the last Rust All hands (2019 I think?), someone expressed strong interest in that (code size) optimization.

Wouldn't it be almost always trivial to prove that such a function is unreachable anyway? Assuming reachability checks occur post-monomorphisation.

@saethlin
Copy link
Member

saethlin commented Jun 6, 2023

One motivation is being able to replace the entire body of an &self function where self is uninhabited with just unreachable_unchecked(). Back at the last Rust All hands (2019 I think?), someone expressed strong interest in that (code size) optimization.

I feel like we could do an experiment here to figure out if this is actually worth anything?

@digama0
Copy link

digama0 commented Jun 6, 2023

If it's unreachable, then you wouldn't even need an unreachable_unchecked, the whole function could be deleted. But it could be a field in a trait object, for example, in which case you do need the function (but it could be a dangling pointer? 🤔 )

@RalfJung
Copy link
Member Author

RalfJung commented Jun 7, 2023

This has been mentioned elsewhere and I think you agreed to it, but that is not a very compelling motivation since you can just put match *self {} in the body to get the same optimization. (If I had my druthers match self {} would also work; this does not strictly speaking require a "no" answer to the OP question.)

I'm trying to remember the old discussion. The concern might have been that these are trait impls generated automatically / via proc macros for tons of types. So we can't just add custom code into the body.

I think it was @cramertj who raised this wish back then, not sure if they still remember any details though.

Wouldn't it be almost always trivial to prove that such a function is unreachable anyway? Assuming reachability checks occur post-monomorphisation.

If it's a public function in a lib crate, how would that be trivial?

@saethlin
Copy link
Member

saethlin commented Jun 7, 2023

The perf run for this optimization implemented as a MIR pass indicates no benefit in terms of code size: https://perf.rust-lang.org/compare.html?start=afab3662eb066f05fcdb43c421b72dd19472e752&end=0f4641ea16e650c77a03aa649330702722a5f66f&stat=size%3Alinked_artifact

@RalfJung
Copy link
Member Author

RalfJung commented Jun 8, 2023

I guess the benchmarks don't contain auto-derived code for uninhabited types. Also as a MIR opt this is less effective than it could be, if generic types and their impls are monomorphized for an uninhabited type.

@RalfJung
Copy link
Member Author

RalfJung commented Jun 8, 2023

My main motivation for considering that case specially is that the borrow checker does allow for types like &(T, U) to have more complex initialization states

I have never seen this happen, so this is news to me. What is an example of that?

@digama0
Copy link

digama0 commented Jun 8, 2023

I take it back, the borrow checker will reason about partially initialized Box<(T, U)> but not &(T, U). However I can come up with examples that would plausibly benefit from it:

#[derive(Default)]
struct Foo {
  a: String,
  ...
  z: String,
}
let mut x = Foo::default();
drop(x.z);
|| println!("{}, ..., {}", x.a, ..., x.y);

Currently, the returned closure captures &x.a, ..., &x.y as references and hence would be 25 pointers large, but a future optimization could capture &x instead with a borrow state indicating that x.z is uninit.

@RalfJung
Copy link
Member Author

RalfJung commented Jun 8, 2023

the borrow checker will reason about partially initialized Box<(T, U)> but not &(T, U).

Ah, yes that makes sense -- Box<T> is treated like T in many regards. I don't see us doing that for references though since it is valid in much fewer cases. And if we ever want to do that, we can just take back the UB on &Uninhabited.

Currently, the returned closure captures &x.a, ..., &x.y as references and hence would be 25 pointers large, but a future optimization could capture &x instead with a borrow state indicating that x.z is uninit.

With per-field closure capture, don't we actually do capture just a single reference and use it for all fields? That was considered as an implementation strategy at some point.

But anyway if we do that, we just have to pick a suitable type for the closure capture environment, one that is not uninhabited. Arguably actually using &(!, i32) for that capture would be a blatant lie and could confuse various parts of the compiler.

@chorman0773
Copy link
Contributor

The concern might have been that these are trait impls generated automatically / via proc macros for tons of types.

I'd expect that a lot of those trait impls will ultimately contain match *self{} as the first item.

use serde::Serialize;
#[derive(Serialize)]
pub enum Void{}

expands to

pub enum Void {}
#[doc(hidden)]
#[allow(non_upper_case_globals, unused_attributes, unused_qualifications)]
const _: () =
    {
        #[allow(unused_extern_crates, clippy :: useless_attribute)]
        extern crate serde as _serde;
        #[allow(unused_macros)]
        macro_rules! try {
            ($__expr : expr) =>
            {
                match $__expr
                {
                    _serde :: __private :: Ok(__val) => __val, _serde ::
                    __private :: Err(__err) =>
                    { return _serde :: __private :: Err(__err) ; }
                }
            }
        }
        #[automatically_derived]
        impl _serde::Serialize for Void {
            fn serialize<__S>(&self, __serializer: __S)
                -> _serde::__private::Result<__S::Ok, __S::Error> where
                __S: _serde::Serializer {
                match *self {}
            }
        }
    };

(I tried with builtin derives, and it just generates core::intrinsics::unreachable(), so apparently we can insert custom code into them in particular.

@RalfJung
Copy link
Member Author

RalfJung commented Jun 9, 2023 via email

@saethlin
Copy link
Member

saethlin commented Jun 9, 2023

I guess the benchmarks don't contain auto-derived code for uninhabited types.

No, they do.

Also as a MIR opt this is less effective than it could be, if generic types and their impls are monomorphized for an uninhabited type.

I moved the transform to codegen where we have monomoprhized MIR. It still has no effect, and in a stage2 build of rustc it only fires one additional time.
https://perf.rust-lang.org/compare.html?start=8b35c0bb0f833c0077dc57006eb317edde2a2d1e&end=b7380c57d311609c0fd4f128fbe8adbbe0f96a24&stat=size%3Alinked_artifact

Many of the times this optimization fires, it transforms this:

bb0 {
    _0 = const _;
    return;
}

into

bb0 {
    unreachable;
}

Hardly anything to celebrate.

What we don't have is a highly artificial crate that just derives a bunch of traits on a bunch of types which are all uninhabited. Maybe then this would show up. But that form of argument can justify any optimization so I don't think it is worth considering.

I do not think an argument about the value of this optimization holds up. But the try toolchain(s) are up there, so if someone has a codebase where they can demonstrate it helping, please do so.

@RalfJung
Copy link
Member Author

RalfJung commented Jun 10, 2023

Given a large tuple type with an uninhabited field, we currently do generate a whole bunch of MIR / LLVM IR that is all completely pointless, right? I was told that in the Fuchsia codebase, being able to optimize this is relevant. That's all I know, and it seems plausible. You are saying in our benchmarks we have no such types. Fair, but that doesn't really say much about the code in the wild.

I'm not trusting our ability to come up with real-world benchmarks and predict future coding patterns enough to conclusively say "this optimization is useless and will also never be useful in the future". (I will also note that this kind of argument hasn't been used in other UB discussions before, despite other forms of UB being a lot more subtle, having non-local effects, and having no real-world evidence of being helpful either... are we going to demand real-world optimization evidence for every single bit of UB, and make everything else defined? I'm fine with "here is an optimization that demonstrates conclusively that his UB is useful" as an argument, but I find the negative version of this a lot less convincing. It's hard to come up with good evidence for the absence of a usecase.)

It seems to me to be cheap and easy to do this (have this UB), since there really isn't any reason to allow such references to exist and the extra complexity incurred on the spec is minimal. It's also easy to take back the UB and make it a mere safety invariant, if this ever causes a problem somewhere. It's impossible to go in the other direction, if we discover in the future that we really want this optimization.

@saethlin
Copy link
Member

I don't mean to argue that "this optimization is useless and will also never be useful in the future". But I do think that if we are going to claim that some optimization is valuable currently, we should have data to back up that claim. I thought that claim was being made up-thread (by you someone at an all-hands you are echoing).

I'll see if I can take a look at Fuchsia later, if I can figure out how to interact with it.

@RalfJung
Copy link
Member Author

RalfJung commented Jun 10, 2023 via email

@RalfJung
Copy link
Member Author

One other reason it might make sense to consider &! and &Void validity-uninhabited is coherence with match rules. I assume with x: &Void we want to accept match *x {}, and in fact we do accept it today. However, if validity for references allows &Void to exist, then there exists code where match *x { _ => () } is actually well-defined (i.e., the arm is reachable), and yet removing the match arm gets accepted! That seems somewhat incoherent.

Arguably, fixing this by making more code UB is a bad fix, but the alternative of rejecting the empty match does not seem appealing either.

@digama0
Copy link

digama0 commented Oct 25, 2023

However, if validity for references allows &Void to exist, then there exists code where match *x { _ => () } is actually well-defined (i.e., the arm is reachable), and yet removing the match arm gets accepted! That seems somewhat incoherent.

Just as a single data point, this behavior doesn't seem that weird to me. I am used to pattern matching that goes looking for exhaustive matches, and using _ means that it doesn't do that. I would rather optimize for the case of safe code here, because there is nothing about this example that involves unsafe code and x: &Void is already outside its safety invariant. This example is no more concerning to me than the fact that writing let foo = *x; is UB if x: &u32 is a dangling pointer (substitute &&u32, &Vec<u32> or similar depending on where we land on shallow validity). You shouldn't allow these time bombs in safe code in the first place.

compiler-errors added a commit to compiler-errors/rust that referenced this issue Jan 26, 2024
… r=compiler-errors

Add the `min_exhaustive_patterns` feature gate

## Motivation

Pattern-matching on empty types is tricky around unsafe code. For that reason, current stable rust conservatively requires arms for empty types in all but the simplest case. It has long been the intention to allow omitting empty arms when it's safe to do so. The [`exhaustive_patterns`](rust-lang#51085) feature allows the omission of all empty arms, but hasn't been stabilized because that was deemed dangerous around unsafe code.

## Proposal

This feature aims to stabilize an uncontroversial subset of exhaustive_patterns. Namely: when `min_exhaustive_patterns` is enabled and the data we're matching on is guaranteed to be valid by rust's operational semantics, then we allow empty arms to be omitted. E.g.:

```rust
let x: Result<T, !> = foo();
match x { // ok
    Ok(y) => ...,
}
let Ok(y) = x; // ok
```

If the place is not guaranteed to hold valid data (namely ptr dereferences, ref dereferences (conservatively) and union field accesses), then we keep stable behavior i.e. we (usually) require arms for the empty cases.

```rust
unsafe {
    let ptr: *const Result<u32, !> = ...;
    match *ptr {
        Ok(x) => { ... }
        Err(_) => { ... } // still required
    }
}
let foo: Result<u32, &!> = ...;
match foo {
    Ok(x) => { ... }
    Err(&_) => { ... } // still required because of the dereference
}
unsafe {
    let ptr: *const ! = ...;
    match *ptr {} // already allowed on stable
}
```

Note that we conservatively consider that a valid reference can point to invalid data, hence we don't allow arms of type `&!` and similar cases to be omitted. This could eventually change depending on [opsem decisions](rust-lang/unsafe-code-guidelines#413). Whenever opsem is undecided on a case, we conservatively keep today's stable behavior.

I proposed this behavior in the [`never_patterns`](rust-lang#118155) feature gate but it makes sense on its own and could be stabilized more quickly. The two proposals nicely complement each other.

## Unresolved Questions

Part of the question is whether this requires an RFC. I'd argue this doesn't need one since there is no design question beyond the intent to omit unreachable patterns, but I'm aware the problem can be framed in ways that require design (I'm thinking of the [original never patterns proposal](https://smallcultfollowing.com/babysteps/blog/2018/08/13/never-patterns-exhaustive-matching-and-uninhabited-types-oh-my/), which would frame this behavior as "auto-nevering" happening).

EDIT: I initially proposed a future-compatibility lint as part of this feature, I don't anymore.
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Jan 26, 2024
… r=compiler-errors

Add the `min_exhaustive_patterns` feature gate

## Motivation

Pattern-matching on empty types is tricky around unsafe code. For that reason, current stable rust conservatively requires arms for empty types in all but the simplest case. It has long been the intention to allow omitting empty arms when it's safe to do so. The [`exhaustive_patterns`](rust-lang#51085) feature allows the omission of all empty arms, but hasn't been stabilized because that was deemed dangerous around unsafe code.

## Proposal

This feature aims to stabilize an uncontroversial subset of exhaustive_patterns. Namely: when `min_exhaustive_patterns` is enabled and the data we're matching on is guaranteed to be valid by rust's operational semantics, then we allow empty arms to be omitted. E.g.:

```rust
let x: Result<T, !> = foo();
match x { // ok
    Ok(y) => ...,
}
let Ok(y) = x; // ok
```

If the place is not guaranteed to hold valid data (namely ptr dereferences, ref dereferences (conservatively) and union field accesses), then we keep stable behavior i.e. we (usually) require arms for the empty cases.

```rust
unsafe {
    let ptr: *const Result<u32, !> = ...;
    match *ptr {
        Ok(x) => { ... }
        Err(_) => { ... } // still required
    }
}
let foo: Result<u32, &!> = ...;
match foo {
    Ok(x) => { ... }
    Err(&_) => { ... } // still required because of the dereference
}
unsafe {
    let ptr: *const ! = ...;
    match *ptr {} // already allowed on stable
}
```

Note that we conservatively consider that a valid reference can point to invalid data, hence we don't allow arms of type `&!` and similar cases to be omitted. This could eventually change depending on [opsem decisions](rust-lang/unsafe-code-guidelines#413). Whenever opsem is undecided on a case, we conservatively keep today's stable behavior.

I proposed this behavior in the [`never_patterns`](rust-lang#118155) feature gate but it makes sense on its own and could be stabilized more quickly. The two proposals nicely complement each other.

## Unresolved Questions

Part of the question is whether this requires an RFC. I'd argue this doesn't need one since there is no design question beyond the intent to omit unreachable patterns, but I'm aware the problem can be framed in ways that require design (I'm thinking of the [original never patterns proposal](https://smallcultfollowing.com/babysteps/blog/2018/08/13/never-patterns-exhaustive-matching-and-uninhabited-types-oh-my/), which would frame this behavior as "auto-nevering" happening).

EDIT: I initially proposed a future-compatibility lint as part of this feature, I don't anymore.
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Jan 26, 2024
… r=compiler-errors

Add the `min_exhaustive_patterns` feature gate

## Motivation

Pattern-matching on empty types is tricky around unsafe code. For that reason, current stable rust conservatively requires arms for empty types in all but the simplest case. It has long been the intention to allow omitting empty arms when it's safe to do so. The [`exhaustive_patterns`](rust-lang#51085) feature allows the omission of all empty arms, but hasn't been stabilized because that was deemed dangerous around unsafe code.

## Proposal

This feature aims to stabilize an uncontroversial subset of exhaustive_patterns. Namely: when `min_exhaustive_patterns` is enabled and the data we're matching on is guaranteed to be valid by rust's operational semantics, then we allow empty arms to be omitted. E.g.:

```rust
let x: Result<T, !> = foo();
match x { // ok
    Ok(y) => ...,
}
let Ok(y) = x; // ok
```

If the place is not guaranteed to hold valid data (namely ptr dereferences, ref dereferences (conservatively) and union field accesses), then we keep stable behavior i.e. we (usually) require arms for the empty cases.

```rust
unsafe {
    let ptr: *const Result<u32, !> = ...;
    match *ptr {
        Ok(x) => { ... }
        Err(_) => { ... } // still required
    }
}
let foo: Result<u32, &!> = ...;
match foo {
    Ok(x) => { ... }
    Err(&_) => { ... } // still required because of the dereference
}
unsafe {
    let ptr: *const ! = ...;
    match *ptr {} // already allowed on stable
}
```

Note that we conservatively consider that a valid reference can point to invalid data, hence we don't allow arms of type `&!` and similar cases to be omitted. This could eventually change depending on [opsem decisions](rust-lang/unsafe-code-guidelines#413). Whenever opsem is undecided on a case, we conservatively keep today's stable behavior.

I proposed this behavior in the [`never_patterns`](rust-lang#118155) feature gate but it makes sense on its own and could be stabilized more quickly. The two proposals nicely complement each other.

## Unresolved Questions

Part of the question is whether this requires an RFC. I'd argue this doesn't need one since there is no design question beyond the intent to omit unreachable patterns, but I'm aware the problem can be framed in ways that require design (I'm thinking of the [original never patterns proposal](https://smallcultfollowing.com/babysteps/blog/2018/08/13/never-patterns-exhaustive-matching-and-uninhabited-types-oh-my/), which would frame this behavior as "auto-nevering" happening).

EDIT: I initially proposed a future-compatibility lint as part of this feature, I don't anymore.
rust-timer added a commit to rust-lang-ci/rust that referenced this issue Jan 26, 2024
Rollup merge of rust-lang#118803 - Nadrieril:min-exhaustive-patterns, r=compiler-errors

Add the `min_exhaustive_patterns` feature gate

## Motivation

Pattern-matching on empty types is tricky around unsafe code. For that reason, current stable rust conservatively requires arms for empty types in all but the simplest case. It has long been the intention to allow omitting empty arms when it's safe to do so. The [`exhaustive_patterns`](rust-lang#51085) feature allows the omission of all empty arms, but hasn't been stabilized because that was deemed dangerous around unsafe code.

## Proposal

This feature aims to stabilize an uncontroversial subset of exhaustive_patterns. Namely: when `min_exhaustive_patterns` is enabled and the data we're matching on is guaranteed to be valid by rust's operational semantics, then we allow empty arms to be omitted. E.g.:

```rust
let x: Result<T, !> = foo();
match x { // ok
    Ok(y) => ...,
}
let Ok(y) = x; // ok
```

If the place is not guaranteed to hold valid data (namely ptr dereferences, ref dereferences (conservatively) and union field accesses), then we keep stable behavior i.e. we (usually) require arms for the empty cases.

```rust
unsafe {
    let ptr: *const Result<u32, !> = ...;
    match *ptr {
        Ok(x) => { ... }
        Err(_) => { ... } // still required
    }
}
let foo: Result<u32, &!> = ...;
match foo {
    Ok(x) => { ... }
    Err(&_) => { ... } // still required because of the dereference
}
unsafe {
    let ptr: *const ! = ...;
    match *ptr {} // already allowed on stable
}
```

Note that we conservatively consider that a valid reference can point to invalid data, hence we don't allow arms of type `&!` and similar cases to be omitted. This could eventually change depending on [opsem decisions](rust-lang/unsafe-code-guidelines#413). Whenever opsem is undecided on a case, we conservatively keep today's stable behavior.

I proposed this behavior in the [`never_patterns`](rust-lang#118155) feature gate but it makes sense on its own and could be stabilized more quickly. The two proposals nicely complement each other.

## Unresolved Questions

Part of the question is whether this requires an RFC. I'd argue this doesn't need one since there is no design question beyond the intent to omit unreachable patterns, but I'm aware the problem can be framed in ways that require design (I'm thinking of the [original never patterns proposal](https://smallcultfollowing.com/babysteps/blog/2018/08/13/never-patterns-exhaustive-matching-and-uninhabited-types-oh-my/), which would frame this behavior as "auto-nevering" happening).

EDIT: I initially proposed a future-compatibility lint as part of this feature, I don't anymore.
@ijackson
Copy link

Currently, with nightly, this program:

#![feature(never_type)]
use std::mem::transmute;

fn main() {
    let x = &();
    let x: &! = unsafe { transmute(x) };
    match *x { _ => {} }
}

https://play.rust-lang.org/?version=nightly&mode=debug&edition=2021&gist=8cfc2b548c44d8f0b00b1dca258f7e43

runs, but complains about an unreachable match arm.

If you remove the supposedy unreachable arm,

#![feature(never_type)]
use std::mem::transmute;

fn main() {
    let x = &();
    let x: &! = unsafe { transmute(x) };
    match *x { }
}

it dies with SIGILL.

I think this is rather strange, unless both programs have UB.

@ijackson
Copy link

As a possible argument in favour of &! being UB to construct:

In derive-adhoc I have a large enum whose variants contain "markers" that are sometimes units and sometimes uninhabited: See https://gitlab.torproject.org/Diziet/rust-derive-adhoc/-/blob/main/macros/framework.rs?ref_type=heads#L115 and https://gitlab.torproject.org/Diziet/rust-derive-adhoc/-/blob/main/macros/syntax.rs?ref_type=heads#L67

This allows the reuse of the same enum (source code) for similar value sets that occur in different contexts. Call sites that receive one of these values can call trait methods like this: https://gitlab.torproject.org/Diziet/rust-derive-adhoc/-/blob/main/macros/paste.rs?ref_type=heads#L816

This allows me as the author to statically demonstrate that the situation is impossible, and therefore I don't have to write code to handle it.

It would be nice if the compiler could remove as much of this as possible. Ideally, the uninhabited enum variants, and all the code that handles uninhabited values, would completely vanish as soon as possible after monomorphisation.

@Nadrieril
Copy link
Member

Your example is an instance of rust-lang/rust#117288, which the lang team has agreed was something they wanted to fix (the arm is deemed unreachable because the *x expression is incorrectly believed to diverge).

@ijackson
Copy link

Here's another example where disallowing &! improves things:

enum MockableRef<'m> {
    Production,
    Mocked(&'m Mocks),
}

#[cfg(not(test))]
enum Mocks {}

We hope MockableRef is a unit in production code, since only one of its variants ought ever to exist. But it's not: https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=cb3a9289904acb3112d76b7a9284ea9f

@ijackson
Copy link

(Maybe bare &! could be ok, if we could say that the constructor MockableRef::Mocked counts as a function and therefore passing it &! is UB and therefore ::Mocked cannot exist.)

@RalfJung
Copy link
Member Author

runs, but complains about an unreachable match arm.

Yeah that should be fixed... I thought @Nadrieril fixed this recently but I guess that was another match lint issue that got fixed?

(Maybe bare &! could be ok, if we could say that the constructor MockableRef::Mocked counts as a function and therefore passing it &! is UB and therefore ::Mocked cannot exist.)

I wouldn't want to make function calls special this way. I'd prefer making &! validity-uninhabited. (It obviously is safety-uninhabited so the validity invariant is the only open question.)

@Nadrieril
Copy link
Member

Nadrieril commented Feb 13, 2024

Yeah what I fixed is the exhaustiveness code; that example triggers a warning because of divergence tracking instead (i.e. it triggers the unreachable_code lint, and I only fixed the unreachable_pattern lint)

@dhedey

This comment has been minimized.

@dhedey
Copy link

dhedey commented Oct 8, 2024

And then, some further musing about why I feel constructing (or at least matching on) a &! should be UB:

I don't quite understand why logically match &empty_enum requires a blanket _ => unreachable!(..) but match &non_empty_enum can quite happily match on just its variants, and doesn't need such a blanket wildcard?

My mathematical intuition feels unsatisfied with this... To try to be more concrete about my abstract intuition - if it's not UB to create &! then it seems reasonable to me that a non-existing variant of a non-empty enum would also be !, and so I could create a &non_existing_variant_of_non_empty_enum, at which point the fact that match &non_empty_enum doesn't require a blanket match would be incorrect.

I'm not suggesting we require a blanket statement when matching on &non_empty_enum here (this would be such a major code break) - but that the idea of needing such a term matching &empty_enum is inconsistent.

@chorman0773
Copy link
Contributor

I agree with that point (I believe I stated it differently above). Deduction shows that matching on a reference to a zero-variant enum having zero arms should be trivially valid, because matching on a reference to an n-variant (n>0) enum having n arms is valid.

@dhedey

This comment has been minimized.

@Diggsey
Copy link

Diggsey commented Oct 8, 2024

@dhedey I think the difference is that when a match statement has arms that pattern match, then a dereference must occur, and that dereference is what's UB. However, if you have an empty match, or a match arm that matches everything, eg.

match invalid_ref {
    _ => println!("No dereference!")
}

Then no dereference occurs, and so there is no UB.

@dhedey
Copy link

dhedey commented Oct 8, 2024

Thanks @Diggsey that makes sense 👍.

And so the edge case is more specifically "what about if there are zero arms?". In the dereference model you explain, clearly zero arms don't require a deference either... but also the compiler can only permit zero arms it if the type is uninhabited (which - for my notes - appears stricter than invalid. e.g. uninit might be invalid but not uninhabited).

And so that's why the question becomes "Can references to uninhabited types ever be valid?".

It seems like most people on this thread agree (and indeed Miri prevents let x: &EmptyEnum = unsafe { transmute(x) };) - so I'll stop talking at this point.

And I look forward to a time where, if indeed it's decided that "we can say that &! is uninhabited", then we can then fix the empty match issue, potentially independent of a resolution on #412

@chorman0773
Copy link
Contributor

chorman0773 commented Oct 8, 2024

@dhedey I think the difference is that when a match statement has arms that pattern match, then a dereference must occur, and that dereference is what's UB. However, if you have an empty match, or a match arm that matches everything, eg.

match invalid_ref {
_ => println!("No dereference!")
}
Then no dereference occurs, and so there is no UB.

This is true of n-variant enums as well, if we use the correct generalization: Replacing every arm in the match of a reference to an enum with a single _ pattern, there is less UB. The fact that n=0 doesn't make a difference, it just makes the inference rule trivial. as adding the _ pattern is the same as replacing every other arm (all 0 of them) with an _ pattern.

@Nadrieril
Copy link
Member

Do you want full recursive inhabitedness? E.g. should &&! and &(T, &!) be uninhabited too?

@RalfJung
Copy link
Member Author

RalfJung commented Oct 9, 2024

// This gives an E0004 error

This isn't even an opsem question though.

That's a language design question of whether and when we want to implicitly dereference things. IMO the proper fix here is to systematically write the same thing in both cases where you actually say what you want:

impl MyEnum {
    fn with_self(self) {
        // This compiles fine: the `!` pattern says "I claim there's no possible case here".
        // The compiler checks that that is true.
        match self { ! }
    }

    fn with_ref(&self) {
        // This also compiles fine
        match self { ! }
    }
}

https://play.rust-lang.org/?version=nightly&mode=debug&edition=2021&gist=1da7610869a4bdf886b5c7702fa3d154

Please keep this thread on topic. This thread is not about match. It is about things like let x: &EmptyEnum = unsafe { transmute(&0u128) }; and whether they should be different from let x: &NonEmptyEnum = unsafe { transmute(&0u128) };

So please open a new issue (in the rustc repo, not here) about the macro concern. That's a valid point, just off-topic here. Whether we accept that match code is essentially completely independent of what we decide here.

@RalfJung
Copy link
Member Author

RalfJung commented Oct 9, 2024

then this appears to be permitted:

No, you just didn't properly understand all the rules that play together here. Please go to Zulip when you are confused about things like this, so that we can keep this thread for discussions about the actual subject matter, rather than discussions about what the question even means. :)

@dhedey
Copy link

dhedey commented Oct 9, 2024

Thanks @RalfJung - apologies for being accidentally off-topic. I appreciate bystander intuition and confusion isn't very helpful with your in-depth mission to untangle unsafe semantics.

I have raised rust-lang/rust#131452 to track this unexpected match behaviour. If you have time, I'd appreciate you verifying that I have correctly captured the intention of your comment - and in particular, the fact that the "&void being inhabited" is in your mind not an objection/blocker to this behaviour being changed, as it was deemed to be in the previous such issue ( rust-lang/rust#78123 ).

On the off-topic note, I was attempting to provide a concrete example in reference to your comment further up the thread:

I'm trying to remember the old discussion. The concern might have been that these are trait impls generated automatically / via proc macros for tons of types. So we can't just add custom code into the body.

Which, if it's still relevant and I haven't got the wrong end of the stick, I believe is provided by the motivating example in rust-lang/rust#131452

Thanks again for your help, and apologies for any derailment!

@RalfJung
Copy link
Member Author

RalfJung commented Oct 9, 2024

No worries, and thanks for opening that issue with a very clear motivating use-case. :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

10 participants