Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a VN-based dead store removal phase #77990

Merged
merged 11 commits into from
Nov 14, 2022

Conversation

SingleAccretion
Copy link
Contributor

@SingleAccretion SingleAccretion commented Nov 7, 2022

This new phase will iterate over the stores referenced through SSA descriptors, and delete those which do not change the local's value, determined via VN's selection mechanism.

TP: ~0.1%.
SPMI diffs (Win-x64): -5K for benchmarks.run, -24K for libraries.pmi.

Overall minor impact on both counts, but the amount of code is not large too.

Detailed TP regression breakdown:

Base: 52494649362, Diff: 52555823204, +0.1165%

Compiler::optVNBasedDeadStoreRemoval                             : 20273376 : NA          : 28.04% : +0.0386%
Compiler::fgValueNumber                                          : 11366343 : +13.93%     : 15.72% : +0.0217%
ValueNumStore::VNForFunc                                         : 8873142  : +1.52%      : 12.27% : +0.0169%
ValueNumStore::GetAllocChunk                                     : 3852024  : +1.90%      : 5.33%  : +0.0073%
JitHashTable<unsigned int,unsigned int,>::Set                    : 3062006  : +6.79%      : 4.23%  : +0.0058%
ValueNumStore::VNForMapSelectWork                                : 3023527  : +3.65%      : 4.18%  : +0.0058%
ValueNumStore::VnForConst<int,ValueNumStore::VNMap<int,>>        : 2481512  : +8.15%      : 3.43%  : +0.0047%
ArenaAllocator::allocateMemory                                   : 1951190  : +0.19%      : 2.70%  : +0.0037%
JitExpandArray<unsigned __int64 *>::EnsureCoversInd              : 1326343  : +0.83%      : 1.83%  : +0.0025%
ValueNumStore::GetVNFunc1Map                                     : 1275570  : +11.92%     : 1.76%  : +0.0024%
DoPhase                                                          : 1221592  : +0.88%      : 1.69%  : +0.0023%
LIR::Range::Delete                                               : 912609   : +164434.05% : 1.26%  : +0.0017%
JitHashTable<VNDefFuncApp<1>, unsigned int>::Grow                : 833936   : +7.69%      : 1.15%  : +0.0016%
JitExpandArray<ValueNumStore::VNDefFuncApp<2> >::EnsureCoversInd : 808398   : +10.77%     : 1.12%  : +0.0015%
LclVarDsc::HasGCPtr                                              : 777500   : +353.94%    : 1.08%  : +0.0015%
ValueNumStore::VNZeroForType                                     : -1151448 : -3.39%      : 1.59%  : -0.0022%

The impact of the zero-init fix is more than I'd like, but it is not clear if it can be ameliorated without compromising correctness; the logic is not trivial.

The impact of fetching ZeroObj VNs is also somewhat surprising; one more reason to rework the "init block" IR shape as it exists today to have something STRUCT-typed on the RHS.

All that said, the improvements from #77655 should cover for this, with a bit left still.

@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Nov 7, 2022
@ghost ghost added the community-contribution Indicates that the PR has been added by a community member label Nov 7, 2022
@ghost
Copy link

ghost commented Nov 7, 2022

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

Issue Details

This new phase will iterate over the stores references through SSA descriptors, and delete those which do not change the local's value, determined via VN's selection mechanism.

TP impact: ~0.05%.
SPMI diffs (Win-x64): -4K for benchmarks.run, -23K for libraries.pmi.

Author: SingleAccretion
Assignees: -
Labels:

area-CodeGen-coreclr

Milestone: -

Fewer regressions; a bit faster TP-wise.

Diffs (libraries.pmi) for the full implementation:
---------------------------------------------------------------------------------
 2,236 contexts with diffs (1,871 improvements, 111 regressions)
   -25,583/+1,467 bytes

Diffs (libraries.pmi) for partial definitions only:
---------------------------------------------------------------------------------
 1,687 contexts with diffs (1,683 improvements, 3 regressions)
   -23,661/+25 bytes

TP impact about the same, ~0.05% against ~0.045%.
@SingleAccretion
Copy link
Contributor Author

SingleAccretion commented Nov 7, 2022

As one would suspect, we're bumping here into bugs with how simplistic is VN's logic for whether something will be zero-initialized in the prolog is. Will take some iterations to sort this out.

Edit: done.

@runfoapp runfoapp bot mentioned this pull request Nov 7, 2022
(The "lvaIsOSRLocal" check in codegen was redundant)
@SingleAccretion
Copy link
Contributor Author

SingleAccretion commented Nov 8, 2022

@dotnet/jit-contrib

(Edit: naturally, we should stress and fuzz this)

@jakobbotsch
Copy link
Member

/azp run runtime-coreclr jitstress, runtime-coreclr libraries-jitstress, Fuzzlyn

@azure-pipelines
Copy link

Azure Pipelines successfully started running 3 pipeline(s).

@jakobbotsch
Copy link
Member

Not sure what happened in the win-arm64 Fuzzlyn run, the artifacts were apparently not published in the expected location. I checked the partitions manually and saw no errors.

@SingleAccretion
Copy link
Contributor Author

Looks like there are stress failures that will need to be investigated.

  1. System.Drawing on ARM64.
  2. baseservices.threading on macOS ARM64 (possibly unrelated).

@SingleAccretion
Copy link
Contributor Author

SingleAccretion commented Nov 9, 2022

The System.Drawing tests are a clear case of silent bad codegen. Unfortunately, without an ARM64 device, it is not trivial to understand where exactly the failure comes from. The pattern of tests failures suggests the problem is somewhere in the auto-generated interop code, and it is of a very particular nature (most test results are 2x of what they should be).

Will try PMIing the assembly manually next.

Edit: notably, the tests in question only fail with TC on...

Edit: the local debugging plan didn't pan out. Will have to use the CI to get more information.

Edit: more CI debugging did not reproduce the issue. Curious.

@SingleAccretion
Copy link
Contributor Author

SingleAccretion commented Nov 10, 2022

@jakobbotsch Let's run another libraries stress here. I've run the System.Drawing tests 3 times in the supposedly same configuration in #PR, and couldn't reproduce the failures. So either:

  1. The failure is intermittent (and infrequent); quite possible given TC=1.
  2. It was, somehow, a one-off.
  3. The setup in #PR is wrong.

So, if we see it fail again, it was most likely 3. If not... Well, I am not sure at this point.

I have also pushed a debug-only commit to capture the dump if we do see it fail.

@jakobbotsch
Copy link
Member

/azp run runtime-coreclr libraries-jitstress

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@jakobbotsch
Copy link
Member

I'll see if I can repro the failure on win-arm64 some time on the weekend or next week.

@SingleAccretion
Copy link
Contributor Author

No luck, the stress was clean. Will revert the debug commit.

@jakobbotsch
Copy link
Member

I was not able to repro the failures after running the exact bits that failed in CI on my win-arm64 machine in a loop for a while, so I think it can be ignored (maybe some kind of GDI bug?).

@SingleAccretion
Copy link
Contributor Author

Thank you for looking! In the meantime I found a reference to the second failure here: #72365 (comment).

@jakobbotsch
Copy link
Member

/azp run runtime-coreclr superpmi-diffs, runtime-coreclr superpmi-replay

@azure-pipelines
Copy link

Azure Pipelines successfully started running 2 pipeline(s).

@jakobbotsch jakobbotsch merged commit 73f2bd4 into dotnet:main Nov 14, 2022
@jakobbotsch
Copy link
Member

Awesome work, thank you!

@SingleAccretion SingleAccretion deleted the SSA-Dead-Code branch November 14, 2022 17:59
@EgorBo
Copy link
Member

EgorBo commented Nov 24, 2022

Improvements on win-x64: dotnet/perf-autofiling-issues#10064

@ghost ghost locked as resolved and limited conversation to collaborators Jan 2, 2023
@jeffhandley jeffhandley added the blog-candidate Completed PRs that are candidate topics for blog post coverage label Mar 14, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI blog-candidate Completed PRs that are candidate topics for blog post coverage community-contribution Indicates that the PR has been added by a community member
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants