Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JIT: A small CQ improvement for Equals/StartsWith unrolling #76439

Merged
merged 4 commits into from
Oct 2, 2022

Conversation

EgorBo
Copy link
Member

@EgorBo EgorBo commented Sep 30, 2022

A quick fix for a CQ problem noticed by @stephentoub:

static bool IsHttps(string text) => text.StartsWith("https://", StringComparison.OrdinalIgnoreCase);
; Method Program:IsHttps(System.String):bool
 G_M40142_IG01:              
       C5F877               vzeroupper 
 G_M40142_IG02:              
       83790808             cmp      dword ptr [rcx+08H], 8
       7C25                 jl       SHORT G_M40142_IG04
 G_M40142_IG03:              
       C5F910410C           vmovupd  xmm0, xmmword ptr [rcx+0CH]
       C5F9EB052A000000     vpor     xmm0, xmm0, xmmword ptr [reloc @RWD00]
       C5F9EF0532000000     vpxor    xmm0, xmm0, xmmword ptr [reloc @RWD16]
       C4E27917C0           vptest   xmm0, xmm0
       0F94C0               sete     al
       0FB6C0               movzx    rax, al
       EB02                 jmp      SHORT G_M40142_IG05
 G_M40142_IG04:              
       33C0                 xor      eax, eax
 G_M40142_IG05:              
-      0FB6C0               movzx    rax, al
-G_M40142_IG06:              
       C3                   ret      
RWD00  	dq	0020002000200020h, 0000000000000020h
RWD16  	dq	0070007400740068h, 002F002F003A0073h
; Total bytes of code: 49

Some nice diffs.

@ghost ghost assigned EgorBo Sep 30, 2022
@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Sep 30, 2022
@ghost
Copy link

ghost commented Sep 30, 2022

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

Issue Details

A quick fix for a CQ problem noticed by @stephentoub:

; Method Program:IsHttps(System.String):bool
 G_M40142_IG01:              
       C5F877               vzeroupper 
 G_M40142_IG02:              
       83790808             cmp      dword ptr [rcx+08H], 8
       7C25                 jl       SHORT G_M40142_IG04
 G_M40142_IG03:              
       C5F910410C           vmovupd  xmm0, xmmword ptr [rcx+0CH]
       C5F9EB052A000000     vpor     xmm0, xmm0, xmmword ptr [reloc @RWD00]
       C5F9EF0532000000     vpxor    xmm0, xmm0, xmmword ptr [reloc @RWD16]
       C4E27917C0           vptest   xmm0, xmm0
       0F94C0               sete     al
       0FB6C0               movzx    rax, al
       EB02                 jmp      SHORT G_M40142_IG05
 G_M40142_IG04:              
       33C0                 xor      eax, eax
 G_M40142_IG05:              
-      0FB6C0               movzx    rax, al
-G_M40142_IG06:              
       C3                   ret      
RWD00  	dq	0020002000200020h, 0000000000000020h
RWD16  	dq	0070007400740068h, 002F002F003A0073h
; Total bytes of code: 49
Author: EgorBo
Assignees: EgorBo
Labels:

area-CodeGen-coreclr

Milestone: -

@EgorBo EgorBo marked this pull request as ready for review September 30, 2022 16:40
@EgorBo
Copy link
Member Author

EgorBo commented Sep 30, 2022

@dotnet/jit-contrib PTAL, single-line change with nice diffs diff example: https://www.diffchecker.com/uJZJ51cv

@@ -526,15 +526,15 @@ GenTree* Compiler::impExpandHalfConstEquals(GenTreeLclVar* data,
GenTreeColon* lenCheckColon = gtNewColonNode(TYP_INT, indirCmp, gtNewFalse());

// For StartsWith we use GT_GE, e.g.: `x.Length >= 10`
lenCheckNode = gtNewQmarkNode(TYP_INT, gtNewOperNode(cmpOp, TYP_INT, lengthFld, elementsCount), lenCheckColon);
lenCheckNode = gtNewQmarkNode(TYP_BOOL, gtNewOperNode(cmpOp, TYP_INT, lengthFld, elementsCount), lenCheckColon);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this a standard IR shape for qmarks? TYP_BOOL typed node is quite uncommon in the JIT IR.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What ends up happening to the qmark when it is bool typed? How does this change the IR post qmark rationalization?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume they're pretty legal till Lower, in this case we get rid of QMARK in morph always

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jakobbotsch DIFF for the snippet in the description: https://www.diffchecker.com/QeDP5n5k (Main vs PR)
as you can see it creates an assertion (value is 0..255) for LCL we spill QMARK to. Then that assertion helps to get rid of CAST node.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be interesting to see if there would be even better diffs by generalizing IntegralRange::ForNode slightly instead to handle qmarks (by union, I guess). Maybe commas too. From cursory glance the subrange assertion is coming from that.

This change is probably ok but it does look strange to have TYP_BOOL qmark with TYP_INT relop as 'then' branch, but I suppose we know that relops always leave the upper 31 bits cleared.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jakobbotsch hm, let's see what CI thinks about it

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed to be handled in IntegralRange::ForNode - no new diffs so far, so it's just a more verbose change now 🙂 maybe will help some future code (if we won't get rid of QMARK at that point yet)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wouldn't expect it for qmark, but maybe if you also handle commas 🙂

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wouldn't expect it for qmark, but maybe if you also handle commas 🙂

No new diffs (for libraries.pmi)

@AndyAyersMS
Copy link
Member

FWIW, I am increasingly thinking we should get rid of qmarks immediately after importation. So would rather not see us build in cleverness that would have to be compensated for elsewhere. Can't local assertion prop get this (if suitably enhanced?)

@jakobbotsch
Copy link
Member

Can't local assertion prop get this (if suitably enhanced?)

Seems like it would be difficult to get this with local assertion prop after qmark rationalization, since the assignments to the local and the cast being eliminated would be in different basic blocks. But maybe global assertion prop could be enhanced to remove some of these casts.

@AndyAyersMS
Copy link
Member

Can't local assertion prop get this (if suitably enhanced?)

Seems like it would be difficult to get this with local assertion prop after qmark rationalization, since the assignments to the local and the cast being eliminated would be in different basic blocks. But maybe global assertion prop could be enhanced to remove some of these casts.

I thought qmark expansion was done after morph? In which case local prop would not be hindered by control flow, but the qmark assertion reconciliation might need some enhancing.

@AndyAyersMS
Copy link
Member

FWIW, I am increasingly thinking we should get rid of qmarks immediately after importation. So would rather not see us build in cleverness that would have to be compensated for elsewhere. Can't local assertion prop get this (if suitably enhanced?)

And just to show I don't often follow my own advice -- I am experimenting with allowing forward sub to work on qmarks.... similar to the ideas you had for modifying importation of isinst if we know it is feeding a compare.

@jakobbotsch
Copy link
Member

I thought qmark expansion was done after morph? In which case local prop would not be hindered by control flow, but the qmark assertion reconciliation might need some enhancing.

What I mean is if we remove/early expand qmarks in the future then local assertion prop is going to stop being able to recognize this pattern in a quite fundamental manner. FWIW, the current change in IntegralRange::ForNode is essentially teaching local assertion prop to be smarter (by proxy, assertion prop is delegating the subrange assertion work to this function).

@AndyAyersMS
Copy link
Member

What I mean is if we remove/early expand qmarks in the future then local assertion prop is going to stop being able to recognize this pattern in a quite fundamental manner.

Right. But we really ought to teach morph to walk the blocks in reverse postorder and propagate facts forward cross-block when it can, it ought not to cost all that much... (?). I tried this for the simple case where the block falls through and has a non-join bbNext and got some nice diffs but also some annoying regressions.

@AndyAyersMS
Copy link
Member

FWIW, I am increasingly thinking we should get rid of qmarks immediately after importation. So would rather not see us build in cleverness that would have to be compensated for elsewhere. Can't local assertion prop get this (if suitably enhanced?)

And just to show I don't often follow my own advice -- I am experimenting with allowing forward sub to work on qmarks.... similar to the ideas you had for modifying importation of isinst if we know it is feeding a compare.

Amusingly enough my change this is hitting a very similar kind of issue, morph is putting a subrange cast over a qmark

Assertion prop in BB03:
Copy     Assertion: V08 == V00, index = #01
               [000023] -----------                         *  LCL_VAR   ref    V00 arg0         
GenTreeNode creates assertion:
               [000039] -A-X-+-----                         *  ASG       int   
In BB03 New Local Subrange Assertion: V03 in [0..255], index = #03

fgMorphTree BB03, STMT00006 (after)
               [000039] -A-X-+-----                         *  ASG       int   
               [000038] D----+-N---                         +--*  LCL_VAR   int    V03 loc1         
               [000212] ---X-+-----                         \--*  CAST      int <- bool <- int
               [000032] ---X-+-----                            \--*  QMARK     int   
               [000031] J--X-+-N---    if                         +--*  GT        int   
               [000020] ---X-+-----                               |  +--*  IND       int   

@EgorBo
Copy link
Member Author

EgorBo commented Oct 1, 2022

Does it look good now?

@AndyAyersMS
Copy link
Member

Interesting that this leads to diffs with minopts.

@AndyAyersMS
Copy link
Member

Diffs (just x64 asp.net) of your change merged onto mine looks exactly like your standalone asp.net diffs.

So, no overlap there at least.

Diffs are based on 69,704 contexts (36,907 MinOpts, 32,797 FullOpts).

MISSED contexts: base: 22, diff: 22

Overall (-3,079 bytes)
Collection Base size (bytes) Diff size (bytes)
aspnet.run.windows.x64.checked.mch 22,933,298 -3,079
MinOpts (-944 bytes)
Collection Base size (bytes) Diff size (bytes)
aspnet.run.windows.x64.checked.mch 10,592,883 -944
FullOpts (-2,135 bytes)
Collection Base size (bytes) Diff size (bytes)
aspnet.run.windows.x64.checked.mch 12,340,415 -2,135
Details
Collection Diffed contexts MinOpts FullOpts Contexts with diffs Missed, base Missed, diff
aspnet.run.windows.x64.checked.mch 69,704 36,907 32,797 620 22 22

jit-analyze output

aspnet.run.windows.x64.checked.mch

To reproduce these diffs on Windows x64:

superpmi.py asmdiffs -target_os windows -target_arch x64 -arch x64

Summary of Code Size diffs:
(Lower is better)

Total bytes of base: 22933298 (overridden on cmd)
Total bytes of diff: 22930219 (overridden on cmd)
Total bytes of delta: -3079 (-0.01 % of base)
    diff is an improvement.
    relative diff is an improvement.
Detail diffs


Top file improvements (bytes):
        -368 : 69683.dasm (-6.45% of base)
        -368 : 38224.dasm (-6.45% of base)
        -170 : 41419.dasm (-1.34% of base)
        -170 : 37887.dasm (-1.34% of base)
        -131 : 37997.dasm (-0.65% of base)
        -131 : 37901.dasm (-0.65% of base)
        -130 : 23377.dasm (-1.66% of base)
        -130 : 49997.dasm (-1.67% of base)
        -100 : 38157.dasm (-8.14% of base)
         -55 : 51531.dasm (-2.03% of base)
         -55 : 37987.dasm (-0.44% of base)
         -48 : 34025.dasm (-0.55% of base)
         -24 : 51881.dasm (-1.23% of base)
         -21 : 702.dasm (-0.67% of base)
         -21 : 8293.dasm (-0.83% of base)
         -19 : 25021.dasm (-2.75% of base)
         -19 : 66476.dasm (-2.76% of base)
         -19 : 25494.dasm (-1.23% of base)
         -16 : 60586.dasm (-1.63% of base)
         -16 : 34773.dasm (-1.63% of base)

47 total files with Code Size differences (47 improved, 0 regressed), 20 unchanged.

Top method improvements (bytes):
        -368 (-6.45% of base) : 69683.dasm - Npgsql.TypeMapping.BuiltInTypeHandlerResolver:ResolveByDataTypeName(System.String):Npgsql.Internal.TypeHandling.NpgsqlTypeHandler:this
        -368 (-6.45% of base) : 38224.dasm - Npgsql.TypeMapping.BuiltInTypeHandlerResolver:ResolveByDataTypeName(System.String):Npgsql.Internal.TypeHandling.NpgsqlTypeHandler:this
        -170 (-1.34% of base) : 41419.dasm - Npgsql.NpgsqlConnectionStringBuilder:ToCanonicalKeyword(System.String):System.String:this
        -170 (-1.34% of base) : 37887.dasm - Npgsql.NpgsqlConnectionStringBuilder:ToCanonicalKeyword(System.String):System.String:this
        -131 (-0.65% of base) : 37901.dasm - Npgsql.NpgsqlConnectionStringBuilder:GeneratedSetter(System.String,System.Object):int:this
        -131 (-0.65% of base) : 37997.dasm - Npgsql.NpgsqlConnectionStringBuilder:RemoveGenerated(System.String):bool:this
        -130 (-1.66% of base) : 23377.dasm - System.Text.RegularExpressions.RegexCompiler:EmitMatchCharacterClass(System.String):this
        -130 (-1.67% of base) : 49997.dasm - System.Text.RegularExpressions.RegexCompiler:EmitMatchCharacterClass(System.String):this
        -100 (-8.14% of base) : 38157.dasm - Npgsql.PostgresTypes.PostgresBaseType:TranslateInternalName(System.String):System.String
         -55 (-0.44% of base) : 37987.dasm - Npgsql.NpgsqlConnectionStringBuilder:TryGetValueGenerated(System.String,byref):bool:this
         -55 (-2.03% of base) : 51531.dasm - Npgsql.NpgsqlSchema:GetSchema(Npgsql.NpgsqlConnection,System.String,System.String[],bool,System.Threading.CancellationToken):System.Threading.Tasks.Task`1[System.Data.DataTable]
         -48 (-0.55% of base) : 34025.dasm - Npgsql.NpgsqlConnectionStringBuilder:ToCanonicalKeyword(System.String):System.String:this
         -24 (-1.23% of base) : 51881.dasm - Npgsql.EntityFrameworkCore.PostgreSQL.Storage.Internal.NpgsqlTypeMappingSource:FindBaseMapping(byref):Microsoft.EntityFrameworkCore.Storage.RelationalTypeMapping:this
         -21 (-0.67% of base) : 702.dasm - System.Text.Json.JsonWriterHelper:ToUtf8(System.ReadOnlySpan`1[ubyte],System.Span`1[ubyte],byref,byref):int
         -21 (-0.83% of base) : 8293.dasm - System.Text.Json.JsonWriterHelper:ToUtf8(System.ReadOnlySpan`1[ubyte],System.Span`1[ubyte],byref,byref):int
         -19 (-2.75% of base) : 25021.dasm - System.IO.Enumeration.FileSystemName:TranslateWin32Expression(System.String):System.String
         -19 (-2.76% of base) : 66476.dasm - System.IO.Enumeration.FileSystemName:TranslateWin32Expression(System.String):System.String
         -19 (-1.23% of base) : 25494.dasm - System.Xml.Linq.ElementWriter:WriteElement(System.Xml.Linq.XElement):this
         -16 (-1.63% of base) : 60586.dasm - System.Linq.Enumerable:TryGetSingle[System.__Canon](System.Collections.Generic.IEnumerable`1[System.__Canon],System.Func`2[System.__Canon,bool],byref):System.__Canon
         -16 (-1.63% of base) : 34773.dasm - System.Linq.Enumerable:TryGetSingle[System.__Canon](System.Collections.Generic.IEnumerable`1[System.__Canon],System.Func`2[System.__Canon,bool],byref):System.__Canon

Top method improvements (percentages):
        -100 (-8.14% of base) : 38157.dasm - Npgsql.PostgresTypes.PostgresBaseType:TranslateInternalName(System.String):System.String
          -8 (-8.00% of base) : 52835.dasm - <>c:<.ctor>b__3_0(System.Reflection.MethodInfo):bool:this
          -8 (-8.00% of base) : 69467.dasm - <>c:<.ctor>b__3_0(System.Reflection.MethodInfo):bool:this
          -8 (-7.69% of base) : 38417.dasm - Newtonsoft.Json.Converters.RegexConverter:CanConvert(System.Type):bool:this
        -368 (-6.45% of base) : 69683.dasm - Npgsql.TypeMapping.BuiltInTypeHandlerResolver:ResolveByDataTypeName(System.String):Npgsql.Internal.TypeHandling.NpgsqlTypeHandler:this
        -368 (-6.45% of base) : 38224.dasm - Npgsql.TypeMapping.BuiltInTypeHandlerResolver:ResolveByDataTypeName(System.String):Npgsql.Internal.TypeHandling.NpgsqlTypeHandler:this
          -3 (-4.92% of base) : 52697.dasm - <>c:<ProcessEntityTypeAnnotations>b__2_0(System.String):bool:this
          -3 (-4.55% of base) : 51946.dasm - <>c:<DiscoverKeyProperties>b__7_0(Microsoft.EntityFrameworkCore.Metadata.IConventionProperty):bool:this
          -8 (-4.02% of base) : 34500.dasm - <>c:<.ctor>b__3_0(System.Reflection.MethodInfo):bool:this
          -8 (-4.02% of base) : 60167.dasm - <>c:<.ctor>b__3_0(System.Reflection.MethodInfo):bool:this
          -1 (-3.57% of base) : 68737.dasm - Microsoft.Extensions.Internal.ValueStopwatch:get_IsActive():bool:this
          -1 (-3.57% of base) : 32570.dasm - Microsoft.Extensions.Internal.ValueStopwatch:get_IsActive():bool:this
          -1 (-3.57% of base) : 15861.dasm - System.Runtime.DependentHandle:get_IsAllocated():bool:this
          -1 (-3.57% of base) : 18577.dasm - System.Runtime.DependentHandle:get_IsAllocated():bool:this
          -1 (-3.57% of base) : 11869.dasm - System.Runtime.InteropServices.GCHandle:get_IsAllocated():bool:this
          -1 (-3.57% of base) : 4309.dasm - System.Runtime.InteropServices.GCHandle:get_IsAllocated():bool:this
          -1 (-3.45% of base) : 30027.dasm - Microsoft.Extensions.Caching.Memory.MemoryCacheOptions:get_HasSizeLimit():bool:this
          -1 (-3.45% of base) : 68235.dasm - Microsoft.Extensions.Caching.Memory.MemoryCacheOptions:get_HasSizeLimit():bool:this
         -19 (-2.76% of base) : 66476.dasm - System.IO.Enumeration.FileSystemName:TranslateWin32Expression(System.String):System.String
         -19 (-2.75% of base) : 25021.dasm - System.IO.Enumeration.FileSystemName:TranslateWin32Expression(System.String):System.String


@EgorBo
Copy link
Member Author

EgorBo commented Oct 1, 2022

Interesting that this leads to diffs with minopts.

Looks like Morph still performs transformations without OptimizationsEnabled checks, we might want to audit all transformations there

@EgorBo EgorBo merged commit 044caf6 into dotnet:main Oct 2, 2022
@ghost ghost locked as resolved and limited conversation to collaborators Nov 1, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants