Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add clarifying context to the most confusing pointer APIs #95851

Closed
wants to merge 1 commit into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
37 changes: 35 additions & 2 deletions library/core/src/ptr/const_ptr.rs
Original file line number Diff line number Diff line change
Expand Up @@ -520,9 +520,16 @@ impl<T: ?Sized> *const T {
/// Calculates the distance between two pointers. The returned value is in
/// units of T: the distance in bytes is divided by `mem::size_of::<T>()`.
///
/// This function is the inverse of [`offset`].
/// This is equivalent to `(self as isize - other as isize) / (mem::size_of::<T>() as isize)`,
/// except that it has a lot more opportunities for UB, in exchange for the compiler
/// understanding what you're doing better.
///
/// [`offset`]: #method.offset
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NB: I dropped the note that it was the "inverse" of offset because this is a genuinely confusing statement.

offset is an asymmetric binary operation, it's like talking about the "inverse" of xy. There are two answers! (yth root and logx)

/// The primary motivation of this method is for computing the `len` of an array/slice
/// of `T` that you are currently representing as a "start" and "end" pointer
/// (and "end" is "one past the end" of the array).
/// In that case, `end.offset_from(start)` gets you the length of the array.
///
/// All of the following safety requirements are trivially satisfied for this usecase.
///
/// # Safety
///
Expand Down Expand Up @@ -560,6 +567,7 @@ impl<T: ?Sized> *const T {
/// such large allocations either.)
///
/// [`add`]: #method.add
/// [`offset`]: #method.offset
/// [allocated object]: crate::ptr#allocated-object
///
/// # Panics
Expand Down Expand Up @@ -1021,6 +1029,31 @@ impl<T: ?Sized> *const T {
/// Computes the offset that needs to be applied to the pointer in order to make it aligned to
/// `align`.
///
/// Before getting into the precise semantics, it helps to have some context for why this
/// operation seems so weird on the surface. This is for two reasons:
///
/// * It's for SIMD, specifically [`slice::align_to`]
/// * It wants code that uses SIMD operations to work in a `const fn`
///
/// In particular, this operation is intended for breaking up a loop into its component
/// unaligned and aligned parts, so that the aligned parts can be processed using SIMD.
/// For that kind of pattern it's always ok to "fail" to align the pointer, because the
/// unaligned path can always handle all the data.
///
/// Lots of really basic operations want to use SIMD under the hood, so it would be very
/// frustrating if using this pattern made it impossible for an operation to work in
/// const contexts. Unfortunately, observing any properties of a pointer's address
/// is a very dangerous and problematic operation in const contexts, because it's a huge
/// reproducibility issue (which has actual soundness implications with compilation units).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's also, like, just impossible since we don't know the address at which allocations are placed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think "it's bad for at least two simple reasons" is good enough, unless you think it's more important to focus on that detail over the others?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(my concern is that if you get too deep into the details people will get mired in insisting it can work/makes sense/can be adjusted to make sense, and I think "nondeterminism bad" is the simplest way to dismiss that all.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find "dangerous and problematic" a bit vague and not as convincing as "it's simply not possible". But, no strong opinion.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my eyes nondeterminism is kind of a poor argument here, considering that in general this function is nondeterministic at runtime in the sense that your allocation/stack whatever could be aligned to different things from one execution to the next, right? I think many folks would probably also respond to this with "ok, sure, but I'm fine with it".

It also seems to me that if we really wanted to, we probably could do "better" (e.g., if we guaranteed that all linker-level allocations were 8-byte-aligned, and then propagated that information down, creating something like monomorphized constants by the "parent" alignment). It's just actually doing that is very hard and not particularly worthwhile (you'd end up with a lot of duplicate work I'd guess).

Is there a reason we need this paragraph, instead of moving directly to the next one, which makes no real argument for why this is the case, aside from justifying it being OK to not worry about it since "we're not doing const SIMD anyway"?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, we can constrain the non-determinism, and then some more things become possible. But others remain impossible (e.g. in your case, asking about any alignment >8).

The problem is not that allocation non-determinism makes the behavior of this function non-deterministic. The problem is that, as a const fn, this function is executed before allocation non-determinism has been resolved. Like, we'd literally have to predict the future to be able to always implement this as a const fn.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that makes sense. I think we could probably tackle all realistic programs in a similar way, but it's a pile of worms and a bunch of complexity. In any case, I'm not strictly opposed to the wording here, but it also feels like it's an unnecessary diversion -- what do you think about just skipping this middle paragraph, and moving directly to the lack of need for it, i.e., const not supporting SIMD etc anyway?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I personally would leave a short note saying that const-eval runs before addresses are picked, so the final alignment if unknoweable, and that's why this operation always "fails" during const.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason we need this paragraph, instead of moving directly to the next one, which makes no real argument for why this is the case, aside from justifying it being OK to not worry about it since "we're not doing const SIMD anyway"?

puppy-dog eyes b-b-b-but

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my experience, SIMD programmers don't actually care about Miri's needs that much. It's not like they hate Miri, it just doesn't help them understand why they have to handle this. They want to instead hear something like,

"Yes, the promises are technically very weak, but this function will exercise a good faith effort to work as you would expect it to when your code actually runs on a hard processor. But if your code is executed on Miri's virtual machine, Miri is allowed to do really really weird things, like put all the data of your program in virtual registers instead of truly addressable virtual memory, and then give you pointers into the registers, and then alignment doesn't truly exist? It's... it's weird.

Just write code that is sound and this method will give it approximately the performance characteristics you would want at runtime. But if it is executed during CTFE, then the performance characteristics don't really matter, because the actual runtime cost becomes zero, and no matter how much SIMD acceleration you bring, you can't beat zero time."

///
/// Thankfully, because the precise SIMD pattern we're interested in *already* has a fallback
/// mode where the alignment completely fails, the compiler can just make this operation always
/// fail to align the pointer when doing const evaluation, and then everything is
/// always deterministic and reproducible (and the const evaluator is an interpreter anyway,
/// so you weren't actually going to get amazing SIMD speedups).
Comment on lines +1049 to +1053
Copy link
Member

@workingjubilee workingjubilee Aug 2, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More directly: given a slice with a random position in memory, we can't know the head of the slice is SIMD-aligned, and given a slice, we can have any number of values. Thus we must have a head and tail handling routine, because asserting on the size is not good enough (may be off-alignment) and asserting on the alignment is not good enough (may be an uneven number of elements, thus not trivially vectorized without tail-handling), and asserting on both just gives you a panic generator where a useful program should be, because you usually don't really have any way to pre-establish those guarantees in a useful manner based on the type information currently present in the system. And if you actually do, you have much bigger tricks to deploy on your code than this.

Thus, "you better believe you are going to be writing code to handle this correctly, Miri's semantics be damned."

///
/// Alright, now on to the actual semantics:
///
/// If it is not possible to align the pointer, the implementation returns
/// `usize::MAX`. It is permissible for the implementation to *always*
/// return `usize::MAX`. Only your algorithm's performance can depend
Expand Down
37 changes: 35 additions & 2 deletions library/core/src/ptr/mut_ptr.rs
Original file line number Diff line number Diff line change
Expand Up @@ -698,9 +698,16 @@ impl<T: ?Sized> *mut T {
/// Calculates the distance between two pointers. The returned value is in
/// units of T: the distance in bytes is divided by `mem::size_of::<T>()`.
///
/// This function is the inverse of [`offset`].
/// This is equivalent to `(self as isize - other as isize) / (mem::size_of::<T>() as isize)`,
/// except that it has a lot more opportunities for UB, in exchange for the compiler
/// understanding what you're doing better.
///
/// [`offset`]: #method.offset-1
/// The primary motivation of this method is for computing the `len` of an array/slice
/// of `T` that you are currently representing as a "start" and "end" pointer
/// (and "end" is "one past the end" of the array).
/// In that case, `end.offset_from(start)` gets you the length of the array.
///
/// All of the following safety requirements are trivially satisfied for this usecase.
///
/// # Safety
///
Expand Down Expand Up @@ -738,6 +745,7 @@ impl<T: ?Sized> *mut T {
/// such large allocations either.)
///
/// [`add`]: #method.add
/// [`offset`]: #method.offset
/// [allocated object]: crate::ptr#allocated-object
///
/// # Panics
Expand Down Expand Up @@ -1292,6 +1300,31 @@ impl<T: ?Sized> *mut T {
/// Computes the offset that needs to be applied to the pointer in order to make it aligned to
/// `align`.
///
/// Before getting into the precise semantics, it helps to have some context for why this
/// operation seems so weird on the surface. This is for two reasons:
///
/// * It's for SIMD, specifically [`slice::align_to`]
/// * It wants code that uses SIMD operations to work in a `const fn`
///
/// In particular, this operation is intended for breaking up a loop into its component
/// unaligned and aligned parts, so that the aligned parts can be processed using SIMD.
/// For that kind of pattern it's always ok to "fail" to align the pointer, because the
/// unaligned path can always handle all the data.
///
/// Lots of really basic operations want to use SIMD under the hood, so it would be very
/// frustrating if using this pattern made it impossible for an operation to work in
/// const contexts. Unfortunately, observing any properties of a pointer's address
/// is a very dangerous and problematic operation in const contexts, because it's a huge
/// reproducibility issue (which has actual soundness implications with compilation units).
///
/// Thankfully, because the precise SIMD pattern we're interested in *already* has a fallback
/// mode where the alignment completely fails, the compiler can just make this operation always
/// fail to align the pointer when doing const evaluation, and then everything is
/// always deterministic and reproducible (and the const evaluator is an interpreter anyway,
/// so you weren't actually going to get amazing SIMD speedups).
///
/// Alright, now on to the actual semantics:
///
/// If it is not possible to align the pointer, the implementation returns
/// `usize::MAX`. It is permissible for the implementation to *always*
/// return `usize::MAX`. Only your algorithm's performance can depend
Expand Down
44 changes: 44 additions & 0 deletions library/core/src/slice/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -3406,6 +3406,28 @@ impl<T> [T] {
/// This method has no purpose when either input element `T` or output element `U` are
/// zero-sized and will return the original slice without splitting anything.
///
/// The reason this operation is slightly vague on its guarantees is because:
///
/// * It's for SIMD, specifically [`slice::as_simd`]
/// * It wants code that uses SIMD operations to work in a `const fn`
///
/// In particular, this operation is intended for breaking up a loop into its component
/// unaligned and aligned parts, so that the aligned parts can be processed using SIMD.
/// For that kind of pattern it's always ok to "fail" to align the slice, because the
/// unaligned path can always handle all the data.
///
/// Lots of really basic operations want to use SIMD under the hood, so it would be very
/// frustrating if using this pattern made it impossible for an operation to work in
/// const contexts. Unfortunately, observing any properties of a pointer's address
/// is a very dangerous and problematic operation in const contexts, because it's a huge
/// reproducibility issue (which has actual soundness implications with compilation units).
///
/// Thankfully, because the precise SIMD pattern we're interested in *already* has a fallback
/// mode where the alignment completely fails, the compiler can just make this operation always
/// fail to align the slice when doing const evaluation, and then everything is
/// always deterministic and reproducible (and the const evaluator is an interpreter anyway,
/// so you weren't actually going to get amazing SIMD speedups).
///
/// # Safety
///
/// This method is essentially a `transmute` with respect to the elements in the returned
Expand Down Expand Up @@ -3467,6 +3489,28 @@ impl<T> [T] {
/// This method has no purpose when either input element `T` or output element `U` are
/// zero-sized and will return the original slice without splitting anything.
///
/// The reason this operation is slightly vague on its guarantees is because:
///
/// * It's for SIMD, specifically [`slice::as_simd_mut`]
/// * It wants code that uses SIMD operations to work in a `const fn`
///
/// In particular, this operation is intended for breaking up a loop into its component
/// unaligned and aligned parts, so that the aligned parts can be processed using SIMD.
/// For that kind of pattern it's always ok to "fail" to align the slice, because the
/// unaligned path can always handle all the data.
///
/// Lots of really basic operations want to use SIMD under the hood, so it would be very
/// frustrating if using this pattern made it impossible for an operation to work in
/// const contexts. Unfortunately, observing any properties of a pointer's address
/// is a very dangerous and problematic operation in const contexts, because it's a huge
/// reproducibility issue (which has actual soundness implications with compilation units).
///
/// Thankfully, because the precise SIMD pattern we're interested in *already* has a fallback
/// mode where the alignment completely fails, the compiler can just make this operation always
/// fail to align the slice when doing const evaluation, and then everything is
/// always deterministic and reproducible (and the const evaluator is an interpreter anyway,
/// so you weren't actually going to get amazing SIMD speedups).
///
/// # Safety
///
/// This method is essentially a `transmute` with respect to the elements in the returned
Expand Down