Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unsafe Pointer Methods #1966

Merged
merged 5 commits into from
Aug 17, 2017
Merged

Unsafe Pointer Methods #1966

merged 5 commits into from
Aug 17, 2017

Conversation

Gankra
Copy link
Contributor

@Gankra Gankra commented Apr 4, 2017

Copy most of the static ptr:: functions to methods on unsafe pointers themselves.
Also add a few conveniences for ptr.offset with unsigned integers.

// So this:
ptr::read(self.ptr.offset(idx as isize))

// Becomes this:
self.ptr.add(idx).read()

More conveniences should probably be added to unsafe pointers, but this proposal is basically the "minimally controversial" conveniences.

Rendered

@nrc nrc added the T-libs-api Relevant to the library API team, which will review and decide on the RFC. label Apr 4, 2017
@Manishearth
Copy link
Member

There's one issue this rfc doesn't quite address.

We are wary of methods on box/&/Rc/Arc because they interfere with deref. This isn't directly an issue here, because raw pointers aren't Deref. However, borrows do coerce to raw pointers, and this means that these methods will become available on borrows. I think. I'm not quite sure how coercion works on raw pointers but this seems like a footgun. We should carefully specify what should be done both in case of a conflict between deref and raw ptr coercion (especially when the deref must go through multiple layers of deref coercions to get there!) and in case a borrowed reference has these methods called on it when the methods are not defined on the pointed-to data (do we call ptr::offset? or just fail to resolve?)

Probably is a straightforward way of speccing this out, but I think it should be explicitly mentioned.

@Gankra
Copy link
Contributor Author

Gankra commented Apr 4, 2017

@Manishearth iirc only deref coercion triggers on ., so this isn't an issue. Note that ptr.offset(idx) already works today.

@Manishearth
Copy link
Member

That's autoderef, deref coercions apply to &, and we have other coercions which happen without any syntax

@seanmonstar
Copy link
Contributor

Seems like it wouldn't be too hard to have some compile-fail tests to ensure that coercion is never accidentally introduced slash happens already.

@Gankra
Copy link
Contributor Author

Gankra commented Apr 4, 2017

@Manishearth I don't know what specific code you're possibly imagining as ambiguous?

@Manishearth
Copy link
Member

I don't know what specific code you're possibly imagining as ambiguous?

x.offset() where x is &T, not *const T, but it gets coerced.

Now, I tried this with traits, and it seems like we currently do not coerce &T to *const T in method receiver position, probably because it isn't a coercion site (I'm surprised!). I guess as long as this behavior is preserved we're fine?

@Gankra
Copy link
Contributor Author

Gankra commented Apr 4, 2017

Yes, that's what I said. Only deref is done with ..

@bluss
Copy link
Member

bluss commented Apr 4, 2017

👍 for the introduction of ptr.add(i).read(), even though I disagree that signed offsets are uncommon (but for that we have offset, or as linked below, stride_offset).

On my wishlist I have far more opinionated methods for raw pointers in the PointerExt trait; these have been useful in implementing matrixmultiply, ndarray, low level iterators: raw pointer funhouses.

@mbrubeck
Copy link
Contributor

mbrubeck commented Apr 4, 2017

post_inc and pre_dec are also used in (but not exposed by) the standard library.

@Gankra
Copy link
Contributor Author

Gankra commented Apr 4, 2017

@bluss @mbrubeck I would love to address those in a followup proposal, but I'd like to keep this one simple.

@mark-i-m
Copy link
Member

mark-i-m commented Apr 4, 2017

@gankro Thanks! I have personally experienced this pain a lot, since I write a lot of low-level code...

I feel like the names of copy and copy_nonoverlapping should be changed to indicate the direction of the copy (i.e. are we copying to or from ptr?). I suppose it would be clear from the documentation, but it would still be really easy to write bugs with this. One could argue that you have the same problem with the static functions, but isn't that all the more reason to fix it? Perhaps copy_to would be more useful?

Rust doesn't support "unsafe operators", and offset is an unsafe function...

You could make offset generic so it accepts usize and isize.

I get your points, but I still think add and sub are a bit weird. It feels like I should be able to do ptr + idx, but I can't 😖

I would rather introduce PtrAdd and PtrSub traits for Deref types which would allow overloading + and - for pointers, though I admittedly haven't thought about the complexities involved or whether such a construct would be useful for other pointer types.

@pcwalton
Copy link
Contributor

pcwalton commented Apr 4, 2017

What's the use case for pointer arithmetic that can't be solved by taking the address of an index operation?

@mark-i-m
Copy link
Member

mark-i-m commented Apr 4, 2017

@pcwalton Do you mean that we should impl Index<isize> for *const T, for example? Wouldn't this suffer from the same problem as overloading + and -?

@Gankra
Copy link
Contributor Author

Gankra commented Apr 4, 2017

I feel like the names of copy and copy_nonoverlapping should be changed to indicate the direction of the copy

Yeah I briefly considered proposing copy_to, but I kinda want to hold out for Rust ever getting function argument labels, so it can be copy(to: x). I've completely lost track of if that was ever a real proposal or not though, so copy_to is probably just the right call.

@Manishearth
Copy link
Member

Manishearth commented Apr 4, 2017 via email

@strega-nil
Copy link

strega-nil commented Apr 5, 2017

We could finally fix the ptr::copy controversy by renaming it copy_from and correcting the order of the arguments. That'd be really nice.

Edit: for those not in the know:

std::ptr::copy and std::ptr::copy_nonoverlapping's parameter ordering were changed in the lead-up to 1.0, bringing them in line with fs::copy, but out of line with the other pointer functions, as well as with clone_from, slice::clone_from_slice, and, eventually, slice::copy_from_slice.

@mark-i-m
Copy link
Member

mark-i-m commented Apr 5, 2017

Yes, copy_from makes more sense as a method too, since it will change the receiver, rather than the argument...

@Gankra
Copy link
Contributor Author

Gankra commented Apr 5, 2017

copy_from is a bad idea, because it will just make people get confused about ptr::copy and mess things up. copy_to, on the other hand, does the opposite.

@Manishearth
Copy link
Member

Perhaps deprecate the ptr:: funcs so that it doesn't look like they provide something different from the new methods?

@Gankra
Copy link
Contributor Author

Gankra commented Apr 5, 2017

The RFC already explains why we shouldn't deprecate them.

@Manishearth
Copy link
Member

Oh, right. We could still deprecate the copy function though. Idk.

@bluss
Copy link
Member

bluss commented Apr 5, 2017

You can still recover coercion for references by using the explicit method call syntax: <*const T>::read(&x).

Personally I don't like the casual style of the RFC, for example that static functions would stink but still the author wants to keep them. In that sense the style just makes it harder for the RFC to convey its meaning.

@mark-i-m
Copy link
Member

mark-i-m commented Apr 5, 2017

@gankro May be we can silently change the order of parameters for ptr::copy 😝

Seriously, though, speaking for myself, knowing the order of parameters for copy_from doesn't seem to convey anything about the arguments for ptr::copy, IMHO. I'm not convinced this would actually cause any more confusion. People already have to look up the order of the arguments in the docs.

Also, after thinking some more, I agree with others that have said the static functions should be deprecated. The methods seem more elegant and "rustic" than the functions. I think that having duplicate functionality sets a bad precedent which might lead to bloat (unless you can point to other places where this is already standard practice).

@mark-i-m
Copy link
Member

mark-i-m commented Apr 5, 2017

Also, methods can be called with function syntax, though I must confess I don't know exactly what it would look like... maybe something like

<*const T>::copy_from(x, y, 23);

@SimonSapin
Copy link
Contributor

@mark-i-m Yes. I’ve just checked that this compiles:

fn main() {
    <*const ()>::method(std::ptr::null())
}

trait Trait {
    fn method(self);
}

impl<T: ?Sized> Trait for *const T {
    fn method(self) {}
}

@Gankra
Copy link
Contributor Author

Gankra commented Apr 5, 2017

@bluss the static functions are bad in most cases, but in rare (IME) cases can be a bit more ergonomic. Since this RFC is targeted at strictly improving ergonomics, it doesn't make sense to regress current ergonomics.

@mark-i-m <*mut _>::copy_from(dest, src, count) seems like an ergonomics regression to me. I'd rather write (dest as *mut _).copy_from(src, count), and I don't even want to write that.

@mark-i-m
Copy link
Member

I tend to agree with @ubsan, but I would rather this RFC pass than not pass if that's what it comes down to...

@aturon
Copy link
Member

aturon commented Jul 26, 2017

Thanks, @gankro, for taking this on, and apologies for my taking so long to review it!

It looks great, and I agree with essentially all of your reasoning. Regarding copy_from vs copy_to, I agree with @ubsan: we should add both.

I think we may want to consider deprecating some of the free functions eventually, but agree that we can treat that as a separate step after we have more experience with the methods.

@rfcbot rfcbot added final-comment-period Will be merged/postponed/closed in ~10 calendar days unless new substational objections are raised. and removed proposed-final-comment-period Currently awaiting signoff of all team members in order to enter the final comment period. labels Jul 26, 2017
@rfcbot
Copy link
Collaborator

rfcbot commented Jul 26, 2017

🔔 This is now entering its final comment period, as per the review above. 🔔

@Gankra
Copy link
Contributor Author

Gankra commented Jul 26, 2017

Ok I've come around to adding both, with maybe an intent of strategic deprecation in the longterm.

@ghost
Copy link

ghost commented Aug 3, 2017

I've tried porting pdqsort to use the proposed API and the result can be seen here.
Definitely looks a bit cleaner. Minor annoyances I've had while whipping this up:

  1. Avoiding verbose v.get_uncheckeds can be cumbersome.
  2. (&tmp).copy_to_nonoverlapping(right!(), 1) doesn't work because &tmp is not a pointer.
  3. copy_to_nonoverlapping is a frequently used method with a very long name.

Overall, I'm happy with the API.

By the way, a blog post was just published, complaining about unergonomic unchecked arithmetic:

non-checking arithmetic is a bit too inconvenient to write (by design, I suppose) but it may help to squeeze some performance

Also it’s obvious how Rust makes the code that sacrifices checks for speed uglier (compare a + b and a.wrapping_add(b) or arr[i] and *ptr.offset(i as isize)).

@mark-i-m
Copy link
Member

mark-i-m commented Aug 4, 2017

@stjepang Would copy_to_disjoint be better?

@ghost
Copy link

ghost commented Aug 4, 2017

@mark-i-m Yes - I like your suggestion!

IMO, that negation in nonoverlapping is pretty awkward, and disjoint feels like a more natural synonym. Consider this code:

v.add(1).copy_to_nonoverlapping(v, 1);

v.add(1).copy_to_disjoint(v, 1);

One more thing: This RFC doesn't mention the ptr::offset_to function.
We might want to add fn offset_to(self, other: *const T) -> Option<usize> for *mut T and *const T.

@Gankra
Copy link
Contributor Author

Gankra commented Aug 4, 2017

Heh, I had never heard of offset_to, it was added at the same time as this RFC!

I agree nonoverlapping is verbose, but I don't really have any inclination to fix that one. I've personally always had an affinity for its length as a "think twice" thing, since if you're not sure, you should be using plain copy.

@rfcbot
Copy link
Collaborator

rfcbot commented Aug 5, 2017

The final comment period is now complete.

@strega-nil
Copy link

@gankro I am not really sure that's true - especially in Rust, where aliasing is at least rare.

@Gankra
Copy link
Contributor Author

Gankra commented Aug 10, 2017

I have updated the text to add copy_from alongside copy_to.

@ubsan it's basically 50/50 in liballoc/collections stuff, because you can't use copy_nonoverlapping for shifting subslices.

@strega-nil
Copy link

@gankro hmm, it might be different in other codebases. I don't think I've ever personally used memmove, anyways.

@Amanieu
Copy link
Member

Amanieu commented Aug 10, 2017

This RFC is still missing a replacement for offset_to. offset_to returns a isize, and was designed to work well will the existing offset method. If you plan on deprecating offset then you should also provide a replacement for offset_to.

@Gankra
Copy link
Contributor Author

Gankra commented Aug 10, 2017

This RFC deprecates nothing.

@ghost
Copy link

ghost commented Aug 17, 2017

Can we merge this RFC?

@alexcrichton
Copy link
Member

Oops apologies! Merged with a tracking issue: rust-lang/rust#43941

kennytm added a commit to kennytm/rust that referenced this pull request May 14, 2018
…bnik

Rewrite docs for `std::ptr`

This PR attempts to resolve rust-lang#29371.

This is a fairly major rewrite of the `std::ptr` docs, and deserves a fair bit of scrutiny. It adds links to the GNU libc docs for various instrinsics, adds internal links to types and functions referenced in the docs, adds new, more complex examples for many functions, and introduces a common template for discussing unsafety of functions in `std::ptr`.

All functions in `std::ptr` (with the exception of `ptr::eq`) are unsafe because they either read from or write to a raw pointer. The "Safety" section now informs that the function is unsafe because it dereferences a raw pointer and requires that any pointer to be read by the function points to "a valid vaue of type `T`".

Additionally, each function imposes some subset of the following conditions on its arguments.

* The pointer points to valid memory.
* The pointer points to initialized memory.
* The pointer is properly aligned.

These requirements are discussed in the "Undefined Behavior" section along with the  consequences of using functions that perform bitwise copies without requiring `T: Copy`. I don't love my new descriptions of the consequences of making such copies. Perhaps the old ones were good enough?

Some issues which still need to be addressed before this is merged:
- [ ] The new docs assert that `drop_in_place` is equivalent to calling `read` and discarding the value. Is this correct?
- [ ] Do `write_bytes` and `swap_nonoverlapping` require properly aligned pointers?
- [ ] The new example for `drop_in_place` is a lackluster.
- [ ] Should these docs rigorously define what `valid` memory is? Or should is that the job of the reference? Should we link to the reference?
- [ ] Is it correct to require that pointers that will be read from refer to "valid values of type `T`"?
- [x] I can't imagine ever using `{read,write}_volatile` with non-`Copy` types.  Should I just link to {read,write} and say that the same issues with non-`Copy` types apply?
- [x] `write_volatile` doesn't link back to `read_volatile`.
- [ ] Update docs for the unstable [`swap_nonoverlapping`](rust-lang#42818)
- [ ] Update docs for the unstable [unsafe pointer methods RFC](rust-lang/rfcs#1966)

Looking forward to your feedback.

r? @steveklabnik
GuillaumeGomez added a commit to GuillaumeGomez/rust that referenced this pull request May 15, 2018
…bnik

Rewrite docs for `std::ptr`

This PR attempts to resolve rust-lang#29371.

This is a fairly major rewrite of the `std::ptr` docs, and deserves a fair bit of scrutiny. It adds links to the GNU libc docs for various instrinsics, adds internal links to types and functions referenced in the docs, adds new, more complex examples for many functions, and introduces a common template for discussing unsafety of functions in `std::ptr`.

All functions in `std::ptr` (with the exception of `ptr::eq`) are unsafe because they either read from or write to a raw pointer. The "Safety" section now informs that the function is unsafe because it dereferences a raw pointer and requires that any pointer to be read by the function points to "a valid vaue of type `T`".

Additionally, each function imposes some subset of the following conditions on its arguments.

* The pointer points to valid memory.
* The pointer points to initialized memory.
* The pointer is properly aligned.

These requirements are discussed in the "Undefined Behavior" section along with the  consequences of using functions that perform bitwise copies without requiring `T: Copy`. I don't love my new descriptions of the consequences of making such copies. Perhaps the old ones were good enough?

Some issues which still need to be addressed before this is merged:
- [ ] The new docs assert that `drop_in_place` is equivalent to calling `read` and discarding the value. Is this correct?
- [ ] Do `write_bytes` and `swap_nonoverlapping` require properly aligned pointers?
- [ ] The new example for `drop_in_place` is a lackluster.
- [ ] Should these docs rigorously define what `valid` memory is? Or should is that the job of the reference? Should we link to the reference?
- [ ] Is it correct to require that pointers that will be read from refer to "valid values of type `T`"?
- [x] I can't imagine ever using `{read,write}_volatile` with non-`Copy` types.  Should I just link to {read,write} and say that the same issues with non-`Copy` types apply?
- [x] `write_volatile` doesn't link back to `read_volatile`.
- [ ] Update docs for the unstable [`swap_nonoverlapping`](rust-lang#42818)
- [ ] Update docs for the unstable [unsafe pointer methods RFC](rust-lang/rfcs#1966)

Looking forward to your feedback.

r? @steveklabnik
@Centril Centril added A-unsafe Unsafe related proposals & ideas A-inherent-impl Proposals related to inherent implementations A-raw-pointers Proposals relating to raw pointers. labels Nov 23, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-inherent-impl Proposals related to inherent implementations A-raw-pointers Proposals relating to raw pointers. A-unsafe Unsafe related proposals & ideas final-comment-period Will be merged/postponed/closed in ~10 calendar days unless new substational objections are raised. T-libs-api Relevant to the library API team, which will review and decide on the RFC.
Projects
None yet
Development

Successfully merging this pull request may close these issues.