Handle out of memory errors in `io::Read::read_to_end` #116570

kornelski · 2023-10-09T14:44:17Z

I'd like to propose handling of out-of-memory errors in the default implementation of io::Read::read_to_end() and fs::read(). These methods create/grow a Vec with a size that is external to the program, and could be arbitrarily large.

Due to being I/O methods, they can already fail in a variety of ways, in theory even including ENOMEM from the OS too, so another failure case should not surprise anyone.

While this may not help much Linux with overcommit, it's useful for other platforms like WASM. Internals thread.

I haven't changed implementation of impl Read for &[u8] and VecDeque out of caution, because in these cases users could assume read can't fail.

This code uses try_reserve() + extend_from_slice() which is optimized since #117503.

rustbot · 2023-10-09T14:44:26Z

r? @joshtriplett

(rustbot has picked a reviewer for you, use r? to override)

joshtriplett · 2023-10-14T04:29:58Z

This does affect the API surface, so:

@rfcbot merge

rfcbot · 2023-10-14T04:30:00Z

Team member @joshtriplett has proposed to merge this. The next step is review by the rest of the tagged team members:

Concerns:

documented guarantee (Handle out of memory errors in io::Read::read_to_end #116570 (comment))

Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

BurntSushi · 2023-10-14T12:57:39Z

Is this error condition something we're going to guarantee? If so, it should probably be in the docs.

The litmus test here is this: if we merge this, and then some time later for some reason to decide to change it back, could we do that without it being considered a breaking change in behavior? I suspect not. And if not, I think that means we might as well document the behavior.

@rfcbot concern documented guarantee

the8472 · 2023-10-14T17:51:03Z

@BurntSushi that seems like circular reasoning to me. We can break that which we haven't guaranteed not to break.
If we only guarantee what's in the API/documentation then we can change implementation details however we like. If implementation details become part of the contract... then... uh, we never can change anything at all.
Anything between that is best-effort and not a guarantee.

BurntSushi · 2023-10-14T17:53:32Z

We can break that which we guaranteed not to break.

Not sure what you mean. Is this something we guarantee we won't break in the future?

If it is, then we ought to document it.

If it isn't, then this probably should be a libs fcp and not a libs-api fcp.

BurntSushi · 2023-10-14T17:55:28Z

To be clear, IMO, I believe this ought to be an API change and something we guarantee. And thus it should go in the docs and thus folks can rely on these routines returning an error in an out-of-memory condition.

the8472 · 2023-10-14T17:57:41Z

That's complicated by

I haven't changed implementation of impl Read for &[u8] and VecDeque out of caution, because in these cases users could assume read can't fail.

which means we'd make this guarantee conditionally, not for all R: Read, which would make it far less useful for things that take generic readers.

Not sure what you mean.

My bad, I made a critical typo, missing a negation.

kornelski · 2023-10-14T18:03:40Z

The difficulty here is that there likely are 3rd party implementations of read_to_end that don't handle OOM. It can't be enforced by Rust or libstd. So it's a quality of implementation issue rather than an API contract.

Of course it can be documented what libstd does, perhaps with a recommendation for Read trait implementers to follow the example and handle OOM too.

BurntSushi · 2023-10-14T18:07:12Z

@the8472 I feel like you're responding to my comments here very narrowly. Can we please take a more expansive view here? My understanding is that this PR changes the behavior of some code to presumably be more useful in some contexts than it was previously by returning an error in OOM instead of the usual OOM behavior. Is that correct?

If so, then we should make a choice about whether we want to guarantee this more useful behavior going forward, however limited it is. If we want to, then we ought to document it as a guarantee in whatever way makes sense.

If we don't want to, then arguably this is "just" an implementation detail and ought not be decided by libs-api.

I bring this up because @joshtriplett said:

This does affect the API surface, so

But the fact that this PR changes zero wording in the docs made me pause and wonder whether we actually ought to treat this as an implementation detail or whether we truly do want to treat this as an API change.

BurntSushi · 2023-10-14T18:12:41Z

The difficulty here is that there likely are 3rd party implementations of read_to_end that don't handle OOM. It can't be enforced by Rust or libstd. So it's a quality of implementation issue rather than an API contract.

Of course it can be documented what libstd does, perhaps with a recommendation for Read trait implementers to follow the example and handle OOM too.

Right... This is why I posed what I should have more explicitly labeled as a thought experiment:

Let's say we merge this PR as-is.
It gets release in a stable release.
People start to write code in a way that depends on the out-of-memory error happening.
Something comes up (I don't know what, this is why it's a thought experiment) that causes us to revert this PR.
The revert gets into a stable release.
People who previously wrote code relying on this behavior are upset.

How do we respond? "Sorry, it was an implementation detail and we were within our rights to change it." Or, "Oh shoot you're right, we didn't mean to change the behavior here and we regard this as an unintentional breaking change."

If it's the latter, then this is indeed an API change and we should figure out how to document it. If it's the former, then I don't think this is a libs-api matter.

My opinion is that it probably ought to be the latter, but I certainly do not feel strongly.

the8472 · 2023-10-14T18:25:27Z

My understanding is that this PR changes the behavior of some code to presumably be more useful in some contexts than it was previously by returning an error in OOM instead of the usual OOM behavior. Is that correct?

That is also my understanding, but I take it as a best-effort approach that'll mostly help code that already handles all kinds of errors in a useful way.

Things that unwrap() or exit(1) on any error will continue to do so.
Things that only handle specific errors and bail on other ones will continue to do so.

Without an API guarantee that we always return ErrorKind::OutOfMemory - which we won't, even with this change - it can still make some programs on some platforms more robust.

If so, then we should make a choice about whether we want to guarantee this more useful behavior going forward, however limited it is. If we want to, then we ought to document it as a guarantee in whatever way makes sense.

If we don't want to, then arguably this is "just" an implementation detail and ought not be decided by libs-api.

Agreed. Imo it's an implementation detail.

kornelski · 2023-10-18T11:34:17Z

@BurntSushi I think the hypothetical response would be "Sorry, it was an implementation detail, because the io::Read trait never guaranteed that."

BurntSushi · 2023-10-18T11:41:44Z

All righty, then I think this shouldn't be libs-api? Any I've disagree with that? @joshtriplett?

Hint optimizer about try-reserved capacity This is rust-lang#116568, but limited only to the less-common `try_reserve` functions to reduce bloat in debug binaries from debug info, while still addressing the main use-case rust-lang#116570

…ngjubilee Hint optimizer about try-reserved capacity This is rust-lang#116568, but limited only to the less-common `try_reserve` functions to reduce bloat in debug binaries from debug info, while still addressing the main use-case rust-lang#116570

kornelski · 2023-11-07T12:21:11Z

@BurntSushi you haven't removed your concern from rfcbot, so it's still blocking merge.

BurntSushi · 2023-11-07T12:45:41Z

@kornelski I don't think my concern has been resolved? Specifically:

Is this something we guarantee we won't break in the future?

If it is, then we ought to document it.

If it isn't, then this probably should be a libs fcp and not a libs-api fcp.

kornelski · 2023-11-07T12:48:27Z

It isn't a guarantee. I've updated the docs to explicitly state it's not guaranteed.

I don't have ability to change the FCP, so if it needs to be handed over, someone with access has to do it.

Handle out of memory errors in io:Read::read_to_end() rust-lang#116570 got stuck due to a [procedural confusion](rust-lang#116570 (comment)). Retrying so that it can get FCP with the proper team now. cc `@joshtriplett` `@BurntSushi` ---- I'd like to propose handling of out-of-memory errors in the default implementation of `io::Read::read_to_end()` and `fs::read()`. These methods create/grow a `Vec` with a size that is external to the program, and could be arbitrarily large. Due to being I/O methods, they can already fail in a variety of ways, in theory even including `ENOMEM` from the OS too, so another failure case should not surprise anyone. While this may not help much Linux with overcommit, it's useful for other platforms like WASM. [Internals thread](https://internals.rust-lang.org/t/io-read-read-to-end-should-handle-oom/19662). I've added documentation that makes it explicit that the OOM handling is a nice-to-have, and not a guarantee of the trait. I haven't changed the implementation of `impl Read for &[u8]` and `VecDeque` out of caution, because in these cases users could assume `read` can't fail. This code uses `try_reserve()` + `extend_from_slice()` which is optimized since rust-lang#117503.

rustbot assigned joshtriplett Oct 9, 2023

kornelski force-pushed the read-to-oom branch from 9c46b2e to 625e3bc Compare October 9, 2023 14:49

joshtriplett added the T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. label Oct 14, 2023

rfcbot added proposed-final-comment-period Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off. disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. labels Oct 14, 2023

workingjubilee changed the title ~~Handle out of memory errors in io:Read::read_to_end()~~ Handle out of memory errors in io::Read::read_to_end Oct 28, 2023

kornelski mentioned this pull request Nov 2, 2023

Hint optimizer about try-reserved capacity #117503

Merged

kornelski added 3 commits November 7, 2023 12:49

Handle out of memory errors in io:Read::read_to_end()

cf446cd

Handle out of memory errors in fs::read/read_to_string

75e1065

Document read_to_end memory handling non-guarantee

98f7d4c

kornelski force-pushed the read-to-oom branch from 625e3bc to 98f7d4c Compare November 7, 2023 13:09

kornelski closed this Nov 15, 2023

rfcbot removed proposed-final-comment-period Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off. disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. labels Nov 15, 2023

kornelski mentioned this pull request Nov 15, 2023

Handle out of memory errors in io:Read::read_to_end() #117925

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Handle out of memory errors in `io::Read::read_to_end` #116570

Handle out of memory errors in `io::Read::read_to_end` #116570

kornelski commented Oct 9, 2023 •

edited

Loading

rustbot commented Oct 9, 2023

joshtriplett commented Oct 14, 2023

rfcbot commented Oct 14, 2023 •

edited by m-ou-se

Loading

BurntSushi commented Oct 14, 2023

the8472 commented Oct 14, 2023 •

edited

Loading

BurntSushi commented Oct 14, 2023 •

edited

Loading

BurntSushi commented Oct 14, 2023

the8472 commented Oct 14, 2023 •

edited

Loading

kornelski commented Oct 14, 2023

BurntSushi commented Oct 14, 2023

BurntSushi commented Oct 14, 2023

the8472 commented Oct 14, 2023

kornelski commented Oct 18, 2023 •

edited

Loading

BurntSushi commented Oct 18, 2023

kornelski commented Nov 7, 2023

BurntSushi commented Nov 7, 2023

kornelski commented Nov 7, 2023 •

edited

Loading

Handle out of memory errors in io::Read::read_to_end #116570

Handle out of memory errors in io::Read::read_to_end #116570

Conversation

kornelski commented Oct 9, 2023 • edited Loading

rustbot commented Oct 9, 2023

joshtriplett commented Oct 14, 2023

rfcbot commented Oct 14, 2023 • edited by m-ou-se Loading

BurntSushi commented Oct 14, 2023

the8472 commented Oct 14, 2023 • edited Loading

BurntSushi commented Oct 14, 2023 • edited Loading

BurntSushi commented Oct 14, 2023

the8472 commented Oct 14, 2023 • edited Loading

kornelski commented Oct 14, 2023

BurntSushi commented Oct 14, 2023

BurntSushi commented Oct 14, 2023

the8472 commented Oct 14, 2023

kornelski commented Oct 18, 2023 • edited Loading

BurntSushi commented Oct 18, 2023

kornelski commented Nov 7, 2023

BurntSushi commented Nov 7, 2023

kornelski commented Nov 7, 2023 • edited Loading

Handle out of memory errors in `io::Read::read_to_end` #116570

Handle out of memory errors in `io::Read::read_to_end` #116570

kornelski commented Oct 9, 2023 •

edited

Loading

rfcbot commented Oct 14, 2023 •

edited by m-ou-se

Loading

the8472 commented Oct 14, 2023 •

edited

Loading

BurntSushi commented Oct 14, 2023 •

edited

Loading

the8472 commented Oct 14, 2023 •

edited

Loading

kornelski commented Oct 18, 2023 •

edited

Loading

kornelski commented Nov 7, 2023 •

edited

Loading