-
Notifications
You must be signed in to change notification settings - Fork 4.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
JIT: Set naive likelihoods when initializing preds #98192
Conversation
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch Issue DetailsPart of #93020. Per discussion on #98054, set "naive" edge likelihoods (in other words, assume every successor edge is equally likely to be taken) in cc @dotnet/jit-contrib
|
src/coreclr/jit/fgflow.cpp
Outdated
|
||
const unsigned numSucc = blockPred->NumSucc(); | ||
assert(numSucc > 0); | ||
flow->setLikelihood(1.0 / numSucc); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@AndyAyersMS you mentioned the likelihood calculation should consider the edge's duplicate count. I ran into issues with BBJ_COND
blocks with true/false branches to the same target, as NumSucc
handles this case by returning 1 for such blocks. In such cases, if we calculate the likelihood as (1.0 / numSucc) / dupCount
, the edge likelihood overflows to (1.0 / 1) * 2 = 2.0
. So leaving the dup count out of the calculation seems to work for that case, but I think this will give us wonky likelihoods for BBJ_SWITCH
blocks with duplicate successors.
Is it possible to encounter BBJ_SWITCH
blocks with duplicate successors this early on in the JIT? If so, we can use BasicBlock::NumSucc(Compiler *)
to get the number of unique switch successors to get the edge likelihoods right, though it'll probably cost a bit more TP.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For switches having dups is pretty common.
Maybe you just need to special-case BBJ_COND
and check for the odd cases where both targets are the same block.
Diff results for #98192Throughput diffsThroughput diffs for windows/arm64 ran on linux/x64Overall (+0.03% to +0.05%)
MinOpts (+0.03% to +0.11%)
FullOpts (+0.03% to +0.05%)
Details here |
Diff results for #98192Assembly diffsAssembly diffs for linux/arm64 ran on windows/x64Diffs are based on 1,610,272 contexts (368,644 MinOpts, 1,241,628 FullOpts). MISSED contexts: 3,428 (0.21%) Overall (-16 bytes)
FullOpts (-16 bytes)
Assembly diffs for linux/x64 ran on windows/x64Diffs are based on 1,620,764 contexts (360,162 MinOpts, 1,260,602 FullOpts). MISSED contexts: 3,086 (0.19%) Overall (-15 bytes)
FullOpts (-15 bytes)
Assembly diffs for osx/arm64 ran on windows/x64Diffs are based on 1,733,061 contexts (561,303 MinOpts, 1,171,758 FullOpts). MISSED contexts: 3,460 (0.20%) Overall (-16 bytes)
FullOpts (-16 bytes)
Assembly diffs for windows/arm64 ran on windows/x64Diffs are based on 1,477,297 contexts (263,527 MinOpts, 1,213,770 FullOpts). MISSED contexts: 3,464 (0.23%) Overall (-16 bytes)
FullOpts (-16 bytes)
Assembly diffs for windows/x64 ran on windows/x64Diffs are based on 1,999,231 contexts (587,594 MinOpts, 1,411,637 FullOpts). MISSED contexts: 3,657 (0.18%) Overall (-2 bytes)
FullOpts (-2 bytes)
Details here Assembly diffs for linux/arm ran on windows/x86Diffs are based on 1,449,330 contexts (345,734 MinOpts, 1,103,596 FullOpts). MISSED contexts: 55,656 (3.70%) Overall (+0 bytes)
FullOpts (+0 bytes)
Assembly diffs for windows/x86 ran on windows/x86Diffs are based on 1,618,717 contexts (327,626 MinOpts, 1,291,091 FullOpts). MISSED contexts: base: 11,019 (0.68%), diff: 11,022 (0.68%) Overall (+38 bytes)
FullOpts (+38 bytes)
Details here Throughput diffsThroughput diffs for linux/arm64 ran on windows/x64Overall (+0.03% to +0.05%)
MinOpts (+0.03% to +0.11%)
FullOpts (+0.03% to +0.05%)
Throughput diffs for linux/x64 ran on windows/x64Overall (+0.03% to +0.05%)
MinOpts (+0.03% to +0.15%)
FullOpts (+0.03% to +0.06%)
Throughput diffs for osx/arm64 ran on windows/x64Overall (+0.03% to +0.05%)
MinOpts (+0.03% to +0.11%)
FullOpts (+0.03% to +0.05%)
Throughput diffs for windows/x64 ran on windows/x64Overall (+0.03% to +0.06%)
MinOpts (+0.03% to +0.15%)
FullOpts (+0.03% to +0.06%)
Details here Throughput diffs for linux/arm ran on windows/x86Overall (+0.03% to +0.04%)
MinOpts (+0.03% to +0.10%)
FullOpts (+0.03% to +0.04%)
Throughput diffs for windows/x86 ran on windows/x86Overall (+0.03% to +0.05%)
MinOpts (+0.02% to +0.16%)
FullOpts (+0.03% to +0.05%)
Details here Throughput diffs for linux/arm64 ran on linux/x64Overall (+0.03% to +0.05%)
MinOpts (+0.03% to +0.12%)
FullOpts (+0.03% to +0.06%)
Throughput diffs for linux/x64 ran on linux/x64Overall (+0.04% to +0.06%)
MinOpts (+0.03% to +0.16%)
FullOpts (+0.04% to +0.06%)
Details here |
Diff results for #98192Throughput diffsThroughput diffs for linux/arm64 ran on linux/x64Overall (+0.02% to +0.04%)
MinOpts (+0.02% to +0.09%)
FullOpts (+0.02% to +0.04%)
Throughput diffs for linux/x64 ran on linux/x64Overall (+0.03% to +0.04%)
MinOpts (+0.03% to +0.14%)
FullOpts (+0.03% to +0.04%)
Throughput diffs for osx/arm64 ran on linux/x64Overall (+0.02% to +0.04%)
MinOpts (+0.02% to +0.11%)
FullOpts (+0.02% to +0.04%)
Throughput diffs for windows/arm64 ran on linux/x64Overall (+0.02% to +0.04%)
MinOpts (+0.02% to +0.10%)
FullOpts (+0.02% to +0.04%)
Details here |
Diff results for #98192Assembly diffsAssembly diffs for linux/arm64 ran on windows/x64Diffs are based on 1,610,272 contexts (368,644 MinOpts, 1,241,628 FullOpts). MISSED contexts: 3,428 (0.21%) Overall (-16 bytes)
FullOpts (-16 bytes)
Assembly diffs for linux/x64 ran on windows/x64Diffs are based on 1,620,764 contexts (360,162 MinOpts, 1,260,602 FullOpts). MISSED contexts: 3,086 (0.19%) Overall (-15 bytes)
FullOpts (-15 bytes)
Assembly diffs for osx/arm64 ran on windows/x64Diffs are based on 1,733,061 contexts (561,303 MinOpts, 1,171,758 FullOpts). MISSED contexts: 3,460 (0.20%) Overall (-16 bytes)
FullOpts (-16 bytes)
Assembly diffs for windows/arm64 ran on windows/x64Diffs are based on 1,477,297 contexts (263,527 MinOpts, 1,213,770 FullOpts). MISSED contexts: 3,464 (0.23%) Overall (-16 bytes)
FullOpts (-16 bytes)
Assembly diffs for windows/x64 ran on windows/x64Diffs are based on 1,999,231 contexts (587,594 MinOpts, 1,411,637 FullOpts). MISSED contexts: 3,657 (0.18%) Overall (-2 bytes)
FullOpts (-2 bytes)
Details here Assembly diffs for linux/arm ran on windows/x86Diffs are based on 1,449,330 contexts (345,734 MinOpts, 1,103,596 FullOpts). MISSED contexts: 55,656 (3.70%) Overall (+0 bytes)
FullOpts (+0 bytes)
Assembly diffs for windows/x86 ran on windows/x86Diffs are based on 1,618,717 contexts (327,626 MinOpts, 1,291,091 FullOpts). MISSED contexts: base: 11,019 (0.68%), diff: 11,022 (0.68%) Overall (+38 bytes)
FullOpts (+38 bytes)
Details here Throughput diffsThroughput diffs for linux/arm64 ran on windows/x64Overall (+0.02% to +0.04%)
MinOpts (+0.02% to +0.10%)
FullOpts (+0.02% to +0.04%)
Throughput diffs for linux/x64 ran on windows/x64Overall (+0.03% to +0.04%)
MinOpts (+0.03% to +0.14%)
FullOpts (+0.03% to +0.04%)
Throughput diffs for osx/arm64 ran on windows/x64Overall (+0.02% to +0.04%)
MinOpts (+0.02% to +0.11%)
FullOpts (+0.02% to +0.04%)
Throughput diffs for windows/arm64 ran on windows/x64Overall (+0.02% to +0.04%)
MinOpts (+0.02% to +0.10%)
FullOpts (+0.02% to +0.04%)
Throughput diffs for windows/x64 ran on windows/x64Overall (+0.03% to +0.05%)
MinOpts (+0.03% to +0.13%)
FullOpts (+0.03% to +0.05%)
Details here Throughput diffs for linux/arm64 ran on linux/x64Overall (+0.02% to +0.04%)
MinOpts (+0.02% to +0.09%)
FullOpts (+0.02% to +0.04%)
Throughput diffs for linux/x64 ran on linux/x64Overall (+0.02% to +0.04%)
MinOpts (+0.02% to +0.12%)
FullOpts (+0.02% to +0.04%)
Details here Throughput diffs for linux/arm ran on windows/x86Overall (+0.02% to +0.03%)
MinOpts (+0.02% to +0.09%)
FullOpts (+0.02% to +0.03%)
Throughput diffs for windows/x86 ran on windows/x86Overall (+0.02% to +0.03%)
MinOpts (+0.01% to +0.14%)
FullOpts (+0.02% to +0.03%)
Details here |
@AndyAyersMS PTAL. TP is slightly more reasonable now. I took a look at |
Part of #93020. Per discussion on #98054, set "naive" edge likelihoods (in other words, assume every successor edge is equally likely to be taken) in
fgAddRefPred
when initializing preds.cc @dotnet/jit-contrib