Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Relocate coroutine upvars into Unresumed state #120168

Closed

Conversation

dingxiangfei2009
Copy link
Contributor

@dingxiangfei2009 dingxiangfei2009 commented Jan 20, 2024

Related to #62958

This PR is an attempt to address the async/coroutine size issue by allowing independent def-use/liveness analysis on individual upvars in coroutines. It has appeared to address partially the size doubling issue introduced by the use of upvars.

However, there are caveats detailed in the following list that I would like to address before turning this draft in.

  • The treatment here towards the ty::Coroutine in MIR passes is unfortunately "messier" than my liking, which is something I definitely want to change. I propose to promote upvars into Body<'tcx> along with local_decls, so that we can safely handle them safely. I would happily open a new separate PR to improve the upvar management.
  • It is not a generic solution, yet. For instance, we are still doubling the size in the example of Async fn doubles argument size #62958. If we insert a pass before MIR type analysis to remove unnecessary drops, which we can, that particular size doubling will be solved. However, if a Future upvar is alive across more than one yield points, that upvar is still ineligible. It makes sense because we would like to minimize moving of variant fields. How to handle these upvars is not the focus of this PR for now.

Out of expectation of possible change in the high level plan, I am keeping this as a draft in hope of invoking conversations. 🙇

cc @pnkfelix for the context.

@rustbot
Copy link
Collaborator

rustbot commented Jan 20, 2024

r? @petrochenkov

(rustbot has picked a reviewer for you, use r? to override)

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Jan 20, 2024
@rust-log-analyzer

This comment has been minimized.

@petrochenkov
Copy link
Contributor

r? @pnkfelix

@rustbot rustbot assigned pnkfelix and unassigned petrochenkov Jan 21, 2024
@bors
Copy link
Contributor

bors commented Jan 23, 2024

☔ The latest upstream changes (presumably #116152) made this pull request unmergeable. Please resolve the merge conflicts.

@rust-log-analyzer

This comment has been minimized.

@pnkfelix
Copy link
Member

pnkfelix commented Feb 1, 2024

@dingxiangfei2009 am I right in assuming that the debugger test failure is expected and you plan to address it if we move forward with this approach?

@pnkfelix
Copy link
Member

pnkfelix commented Feb 2, 2024

In any case, this is interesting work, and I think its a step in the right direction.

I plan to build it locally and take it for a spin, to make sure I understand what its doing. @dingxiangfei2009, I may ping you on Zulip with follow-up questions.

@dingxiangfei2009
Copy link
Contributor Author

@pnkfelix Thank you for reviewing! Yes, the debuginfo test may very much depend on how the upvars are eventually represented. I figured that I shall fix them after we could settle with our approach first.

Please feel free to message on Zulip!

@bors
Copy link
Contributor

bors commented Feb 6, 2024

☔ The latest upstream changes (presumably #120361) made this pull request unmergeable. Please resolve the merge conflicts.

@@ -1105,11 +1168,25 @@ fn compute_layout<'tcx>(
// around inside coroutines, so it doesn't matter which variant
Copy link
Member

@pnkfelix pnkfelix Mar 1, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment was cut off in the patch, it starts off with:

            // Note that if a field is included in multiple variants, we will
            // just use the first one here. That's fine; fields do not move
            // around inside coroutines, so it doesn't matter which variant
            // index we access them by.

I had started writing this PR comment with a question asking if this code comment needs revision for consistency with this change. After reflecting on things, i think the whole picture here remains coherent, but now I want to double check my interpretation.

Before this patch, the upvars were disjoint from the fields carried across yields of a generator/coroutine, so the comment above simply didn't apply to upvars.

After this patch, the upvars are now mapped into the Unresumed variant of a generator/coroutine. But that isn't inconsistent with the comment above ... as long as the Unresumed variant is itself the first variant... right?

Copy link
Member

@pnkfelix pnkfelix Mar 1, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually the fact that the Unresumed variant is the first doesn't probably matter either.

The comment above is saying 'we pick the first', but it is also saying that the choice is irrelevant, since all variants that use the field should be using the same offset for that field.

So the crucial point is that the Unresumed variant should be adhering to that same invariant, that any uses of the upvar in other variants must likewise choose the same offset for that upvar in its respective variant.

@@ -1221,6 +1298,21 @@ fn elaborate_coroutine_drops<'tcx>(tcx: TyCtxt<'tcx>, body: &mut Body<'tcx>) {
);
}
elaborator.patch.apply(body);
for block_data in body.basic_blocks_mut() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there some code up above this (or called by one of the methods above) that was in principle responsible for dropping the upvars, that is now made obsolete under the new regime you have introduced here?

@pnkfelix
Copy link
Member

pnkfelix commented Mar 1, 2024

Thanks for your patience @dingxiangfei2009 . This all looks good to me; I wouldn't mind getting an answer to my question about the new handling of dropping upvar state, but I also think this is ready to land, assuming it doesn't conflict with @compiler-errors 's ongoing work on async closures etc.

@pnkfelix
Copy link
Member

pnkfelix commented Mar 1, 2024

@rustbot author

@rustbot rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Mar 1, 2024
@pnkfelix
Copy link
Member

pnkfelix commented Mar 1, 2024

@bors try @rust-timer queue

@bors
Copy link
Contributor

bors commented Mar 1, 2024

🔒 Merge conflict

This pull request and the master branch diverged in a way that cannot be automatically merged. Please rebase on top of the latest master branch, and let the reviewer approve again.

How do I rebase?

Assuming self is your fork and upstream is this repository, you can resolve the conflict following these steps:

  1. git checkout coroutine-upvar (switch to your branch)
  2. git fetch upstream master (retrieve the latest master)
  3. git rebase upstream/master -p (rebase on top of it)
  4. Follow the on-screen instruction to resolve conflicts (check git status if you got lost).
  5. git push self coroutine-upvar --force-with-lease (update this PR)

You may also read Git Rebasing to Resolve Conflicts by Drew Blessing for a short tutorial.

Please avoid the "Resolve conflicts" button on GitHub. It uses git merge instead of git rebase which makes the PR commit history more difficult to read.

Sometimes step 4 will complete without asking for resolution. This is usually due to difference between how Cargo.lock conflict is handled during merge and rebase. This is normal, and you should still perform step 5 to update this PR.

Error message
Auto-merging tests/ui/coroutine/clone-impl.stderr
Auto-merging tests/ui/async-await/issue-70935-complex-spans.stderr
Auto-merging tests/ui/async-await/future-sizes/future-as-arg.rs
Auto-merging tests/ui/async-await/async-await-let-else.stderr
Auto-merging compiler/rustc_ty_utils/src/layout.rs
Auto-merging compiler/rustc_mir_transform/src/dest_prop.rs
Auto-merging compiler/rustc_mir_transform/src/coroutine.rs
Auto-merging compiler/rustc_mir_dataflow/src/rustc_peek.rs
Auto-merging compiler/rustc_middle/src/ty/sty.rs
Auto-merging compiler/rustc_middle/src/ty/layout.rs
Auto-merging compiler/rustc_middle/src/mir/tcx.rs
Auto-merging compiler/rustc_const_eval/src/transform/validate.rs
Auto-merging compiler/rustc_const_eval/src/interpret/step.rs
Auto-merging compiler/rustc_codegen_ssa/src/mir/rvalue.rs
Auto-merging compiler/rustc_codegen_llvm/src/debuginfo/metadata/enums/mod.rs
Auto-merging compiler/rustc_codegen_llvm/src/debuginfo/metadata.rs
Auto-merging compiler/rustc_codegen_cranelift/src/base.rs
Auto-merging compiler/rustc_borrowck/src/type_check/mod.rs
CONFLICT (content): Merge conflict in compiler/rustc_borrowck/src/type_check/mod.rs
Auto-merging compiler/rustc_abi/src/lib.rs
Automatic merge failed; fix conflicts and then commit the result.

@pnkfelix
Copy link
Member

pnkfelix commented Mar 1, 2024

Oh, also: Just in case its potentially of interest to others, part of the reason this took me so long to get to is that I wanted a pair of pernos.co runs, one from before and one from after this PR, to make it easier for me to explore the control flow interactively (and to help me understand some of the differences in output from our internal compiler instrumentation like print-type-sizes)

Here's links to those in case anyone else wants to inspect them:

@dingxiangfei2009
Copy link
Contributor Author

Progress update: I am still on it to polish it up. I will try to get the rust-timer numbers this week.

@rust-log-analyzer

This comment has been minimized.

@bors
Copy link
Contributor

bors commented Apr 10, 2024

☔ The latest upstream changes (presumably #123708) made this pull request unmergeable. Please resolve the merge conflicts.

@dingxiangfei2009
Copy link
Contributor Author

@bors try @rust-timer queue

@bors
Copy link
Contributor

bors commented Apr 10, 2024

@dingxiangfei2009: 🔑 Insufficient privileges: not in try users

@rust-timer

This comment has been minimized.

@dingxiangfei2009
Copy link
Contributor Author

I see that @pnkfelix attempted to run a timer job earlier. Would you like to trigger the job again, given that there hasn't been any merge conflict now?

@tmiasko
Copy link
Contributor

tmiasko commented Apr 10, 2024

@bors try @rust-timer queue

@rust-timer

This comment has been minimized.

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Apr 10, 2024
@bors
Copy link
Contributor

bors commented Apr 10, 2024

⌛ Trying commit a68481f with merge 6b8fde6...

bors added a commit to rust-lang-ci/rust that referenced this pull request Apr 10, 2024
…<try>

Relocate coroutine upvars into Unresumed state

Related to rust-lang#62958

This PR is an attempt to address the async/coroutine size issue by allowing independent def-use/liveness analysis on individual upvars in coroutines. It has appeared to address partially the size doubling issue introduced by the use of upvars.

However, there are caveats detailed in the following list that I would like to address before turning this draft in.
- The treatment here towards the `ty::Coroutine` in MIR passes is unfortunately "messier" than my liking, which is something I definitely want to change. I propose to promote upvars into `Body<'tcx>` along with `local_decls`, so that we can safely handle them safely. I would happily open a new separate PR to improve the upvar management.
- It is not a generic solution, yet. For instance, we are still doubling the size in the example of rust-lang#62958. If we insert a pass before MIR type analysis to remove unnecessary drops, which we can, that particular size doubling will be solved. However, if a `Future` upvar is alive across more than one yield points, that upvar is still ineligible. It makes sense because we would like to minimize moving of variant fields. How to handle these upvars is not the focus of this PR for now.

Out of expectation of possible change in the high level plan, I am keeping this as a draft in hope of invoking conversations. 🙇

cc `@pnkfelix` for the context.
@bors
Copy link
Contributor

bors commented Apr 10, 2024

☀️ Try build successful - checks-actions
Build commit: 6b8fde6 (6b8fde695cc11a7d81125d3222430c419d510027)

@rust-timer

This comment has been minimized.

@rust-timer
Copy link
Collaborator

Finished benchmarking commit (6b8fde6): comparison URL.

Overall result: ❌ regressions - no action needed

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

@bors rollup=never
@rustbot label: -S-waiting-on-perf -perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
0.6% [0.2%, 0.8%] 3
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) - - 0

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
-2.9% [-2.9%, -2.9%] 1
Improvements ✅
(secondary)
-2.8% [-2.8%, -2.8%] 1
All ❌✅ (primary) -2.9% [-2.9%, -2.9%] 1

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
1.2% [1.2%, 1.2%] 1
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) - - 0

Binary size

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
1.8% [1.8%, 1.8%] 3
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
-0.3% [-0.3%, -0.3%] 2
All ❌✅ (primary) - - 0

Bootstrap: 674.485s -> 672.751s (-0.26%)
Artifact size: 318.39 MiB -> 318.44 MiB (0.02%)

@rustbot rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Apr 10, 2024
@@ -288,7 +288,7 @@ where
}

None if dump_enabled(tcx, A::NAME, def_id) => {
create_dump_file(tcx, ".dot", false, A::NAME, &pass_name.unwrap_or("-----"), body)?
create_dump_file(tcx, ".dot", true, A::NAME, &pass_name.unwrap_or("-----"), body)?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't object to this change, but it isn't strictly speaking necessary for this PR, is it?

pub struct TransferFunction<'a, T>(pub &'a mut T);
pub struct TransferFunction<'a, T> {
pub trans: &'a mut T,
pub upvars: Option<(Local, usize)>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't remember if earlier iterations on this PR had these changes to the various dataflow impls; its possible they did and I just overlooked them.

But seeing them now with fresh eyes, it just seems unfortunate that so many dataflow implementations seem like they are expected to duplicate this upvars: Option<(Local, usize)> state pattern. I would be very concerned about whether it is a footgun to forget to do it, and also whether it makes all of the dataflow systems harder to maintain.

Having said that, I also don't have a better suggestion about what to do for the short term. It probably is something to bring up with the broader team, in terms of explaining why the pattern is necessary and seeing if anyone has ideas for another way to accomplish the objective.

@pnkfelix
Copy link
Member

pnkfelix commented Apr 12, 2024

I'm going to nominate this PR for discussion at next week's T-compiler meeting.

The essential question I want to pose is: Are we okay with the upvars: Option<(Local, usize)> this change is adding to various dataflow analyses, at least in the short term so that we can make forward progress on the problem of generator size explosion ?

Or should we wait to land this PR until after we have a discussion between @dingxiangfei2009, @pnkfelix, and whichever team members are interested in any of {dataflow, MIR representation, generator representation} to try to tease how what the "right answer" is here.

@rustbot label: +I-compiler-nominated

@rustbot rustbot added the I-compiler-nominated Nominated for discussion during a compiler team meeting. label Apr 12, 2024
@pnkfelix
Copy link
Member

Okay, the team discussed briefly at our team meeting.

@dingxiangfei2009 has opened up a dedicated zulip topic to discuss the matter, at: https://rust-lang.zulipchat.com/#narrow/stream/131828-t-compiler/topic/Seeking.20opinions.20about.20lifting.20upvars.20in.20coroutine.20and.20fu.2E.2E.2E

In the above zulip topic, @dingxiangfei2009 wrote:

I am still investigating if we can split the BitSet domain into Locals and Upvars. Despite the extensive exposure to assumptions that only true Locals exists, I believe that it is still doable, so that existing and new analyses can be sound again, whether they need to handle upvars or not.

In the team discussion, we generally agreed that we'd prefer to see @dingxiangfei2009 explore that split-bitset alternative design, because the approach in this PR seems more brittle than we would like, assuming there are better alternatives.

@rustbot author

@rustbot rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Apr 18, 2024
@pnkfelix
Copy link
Member

@rustbot label: -I-compiler-nominated

@rustbot rustbot removed the I-compiler-nominated Nominated for discussion during a compiler team meeting. label Apr 18, 2024
@oskgo
Copy link
Contributor

oskgo commented Jul 26, 2024

Triage: Closing since this is replaced by #127522.

Note: if you are going to continue please open the PR BEFORE you push to it, else you won't be able to reopen - this is a quirk of github.
Thanks for your contribution.

@rustbot label: +S-inactive

@oskgo oskgo closed this Jul 26, 2024
@rustbot rustbot added the S-inactive Status: Inactive and waiting on the author. This is often applied to closed PRs. label Jul 26, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-inactive Status: Inactive and waiting on the author. This is often applied to closed PRs. S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.