Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid boxing in Enum.HasFlag in Tier0 #89348

Merged
merged 8 commits into from
Jul 27, 2023
Merged

Conversation

EgorBo
Copy link
Member

@EgorBo EgorBo commented Jul 22, 2023

We see a lot of cases when methods stay in Tier0 despite being called a lot in large apps (the "tier1 promotion queue" problem) so we'd better remove allocations in Tier0 too where possible, I gave up on this opt in #77357 but I still think it's useful:

bool Test(MyEnum e) => e.HasFlag(MyEnum.B);

Main (Tier0):

; Assembly listing for method Program:Test(int):bool (Tier0)	
       push     rbp	
       sub      rsp, 48	
       lea      rbp, [rsp+30H]	
       xor      eax, eax	
       mov      qword ptr [rbp-08H], rax	
       mov      qword ptr [rbp-10H], rax	
       mov      dword ptr [rbp+10H], ecx	
       mov      rcx, 0x7FFAE2C3B208	
       call     CORINFO_HELP_NEWSFAST  ;; <--- Allocation
       mov      gword ptr [rbp-08H], rax	
       mov      rcx, gword ptr [rbp-08H]	
       mov      eax, dword ptr [rbp+10H]	
       mov      dword ptr [rcx+08H], eax	
       mov      rcx, 0x7FFAE2C3B208
       call     CORINFO_HELP_NEWSFAST  ;; <--- Allocation
       mov      gword ptr [rbp-10H], rax	
       mov      rcx, gword ptr [rbp-10H]	
       mov      dword ptr [rcx+08H], 1	
       mov      rcx, gword ptr [rbp-08H]	
       mov      rdx, gword ptr [rbp-10H]	
       call     [System.Enum:HasFlag(System.Enum):bool:this]	
       nop	
       add      rsp, 48	
       pop      rbp	
       ret	
; Total bytes of code 103

PR (Tier0):

; Assembly listing for method Program:Test(int):bool (Tier0)
       push     rbp
       sub      rsp, 32
       lea      rbp, [rsp+20H]
       xor      eax, eax
       mov      qword ptr [rbp-08H], rax
       mov      qword ptr [rbp-10H], rax
       mov      dword ptr [rbp+10H], ecx
       mov      eax, dword ptr [rbp+10H]
       mov      dword ptr [rbp-14H], eax
       mov      eax, dword ptr [rbp-14H]
       and      eax, 1
       cmp      eax, 1
       sete     al
       movzx    rax, al
       add      rsp, 32
       pop      rbp
       ret
; Total bytes of code 50

@ghost ghost assigned EgorBo Jul 22, 2023
@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Jul 22, 2023
@ghost
Copy link

ghost commented Jul 22, 2023

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

Issue Details

We see a lot of cases when methods stay in Tier0 despite being called a lot in large apps (the "tier1 promotion queue" problem) so we'd better remove allocations in Tier0 too where possible, I gave up on this opt in #77357 but I still think it's useful:

bool Test(MyEnum e) => e.HasFlag(MyEnum.B);

Main:

; Assembly listing for method Program:Test(int):bool (Tier0)	
       push     rbp	
       sub      rsp, 48	
       lea      rbp, [rsp+30H]	
       xor      eax, eax	
       mov      qword ptr [rbp-08H], rax	
       mov      qword ptr [rbp-10H], rax	
       mov      dword ptr [rbp+10H], ecx	
       mov      rcx, 0x7FFAE2C3B208	
       call     CORINFO_HELP_NEWSFAST  ;; <--- Allocation
       mov      gword ptr [rbp-08H], rax	
       mov      rcx, gword ptr [rbp-08H]	
       mov      eax, dword ptr [rbp+10H]	
       mov      dword ptr [rcx+08H], eax	
       mov      rcx, 0x7FFAE2C3B208
       call     CORINFO_HELP_NEWSFAST  ;; <--- Allocation
       mov      gword ptr [rbp-10H], rax	
       mov      rcx, gword ptr [rbp-10H]	
       mov      dword ptr [rcx+08H], 1	
       mov      rcx, gword ptr [rbp-08H]	
       mov      rdx, gword ptr [rbp-10H]	
       call     [System.Enum:HasFlag(System.Enum):bool:this]	
       nop	
       add      rsp, 48	
       pop      rbp	
       ret	
; Total bytes of code 103

PR:

; Assembly listing for method Program:Test(int):bool (Tier0)
       push     rbp
       sub      rsp, 32
       lea      rbp, [rsp+20H]
       xor      eax, eax
       mov      qword ptr [rbp-08H], rax
       mov      qword ptr [rbp-10H], rax
       mov      dword ptr [rbp+10H], ecx
       mov      eax, dword ptr [rbp+10H]
       mov      dword ptr [rbp-14H], eax
       mov      eax, dword ptr [rbp-14H]
       and      eax, 1
       cmp      eax, 1
       sete     al
       movzx    rax, al
       add      rsp, 32
       pop      rbp
       ret
; Total bytes of code 50
Author: EgorBo
Assignees: EgorBo
Labels:

area-CodeGen-coreclr

Milestone: -

@EgorBo
Copy link
Member Author

EgorBo commented Jul 22, 2023

@MihuBot -tier0

@EgorBo
Copy link
Member Author

EgorBo commented Jul 22, 2023

@dotnet/jit-contrib PTAL, simple change - no SPMI diffs (new contexts), jit-diffs in tier0: Total bytes of delta: -2158 (-0.00 % of base)

EgorBo and others added 2 commits July 22, 2023 21:09
Co-authored-by: Michał Petryka <35800402+MichalPetryka@users.noreply.github.com>
@EgorBo
Copy link
Member Author

EgorBo commented Jul 25, 2023

@jakobbotsch @dotnet/jit-contrib PTAL, simple change

Should fix the concerns raised in AvaloniaUI/Avalonia#4873 (comment)

EgorBo and others added 2 commits July 26, 2023 16:32
Co-authored-by: Jakob Botsch Nielsen <Jakob.botsch.nielsen@gmail.com>
@EgorBo
Copy link
Member Author

EgorBo commented Jul 26, 2023

Acutally, I don't think we need to guard this under debug/minopts - it won't decrease quality of debug. Maybe we shouldn't share them for debug/minopts at all

@jakobbotsch
Copy link
Member

Acutally, I don't think we need to guard this under debug/minopts - it won't decrease quality of debug. Maybe we shouldn't share them for debug/minopts at all

Presumably there is a reason the code exists in the first place. It would be good to verify the original scenarios that caused the code to be added do not get regressed with these changes.

Also cc @AndyAyersMS

Copy link
Member

@jakobbotsch jakobbotsch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me with the above note -- it would be nice to double check whether there were some particular programs that caused this sharing to be added in the first place.

Would also like to hear @AndyAyersMS's opinion.

@dotnet dotnet deleted a comment from azure-pipelines bot Jul 27, 2023
@dotnet dotnet deleted a comment from azure-pipelines bot Jul 27, 2023
@jakobbotsch
Copy link
Member

/azp run runtime-coreclr superpmi-diffs

@dotnet dotnet deleted a comment from azure-pipelines bot Jul 27, 2023
@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@jakobbotsch
Copy link
Member

Seems like there are build failures.

@EgorBo
Copy link
Member Author

EgorBo commented Jul 27, 2023

Seems like there are build failures.

Oops, fixed.

it would be nice to double check whether there were some particular programs that caused this sharing to be added in the first place.

I wonder if sharing actually makes debug experience worse.

@AndyAyersMS
Copy link
Member

I don't believe there is any impact on debugging as the box temp is always a (evaluation) stack object.

I think the sharing bit is from way back; I always thought it was there to avoid some code size / frame bloat. If you disable sharing entirely presumably you might see some of this in SPMI.

@EgorBo EgorBo merged commit 84a65ca into dotnet:main Jul 27, 2023
125 checks passed
@EgorBo EgorBo deleted the EnumHasFlag-boxing branch July 27, 2023 20:33
@ghost ghost locked as resolved and limited conversation to collaborators Aug 27, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants