Creating comma temps differently for SubMulDiv morph #69770

TIHan · 2022-05-25T02:16:49Z

Description

This changes how comma temps are created for SubMulDiv morph.

Example code:

(multiplier * seed2 + increment) % Modulus;

In morph, this would turn into:

(let tmp1 = multiplier * seed2 + increment in tmp1) - (tmp1 / Modulus) * Modulus

This PR morphs it into:

let tmp1 = multiplier * seed2 + increment in
tmp1 - (tmp1 / Modulus) * Modulus

Basically, we are moving the comma temps to the root of the expression instead of leaving them at the leaves.
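The rewrite relies on the identity `a % b == a - (a / b) * b` for truncating integer division. The sketch below models only the arithmetic, not the JIT's GenTree nodes; the function names are hypothetical and chosen for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of the example expression; this is not JIT code.
// Direct form: (multiplier * seed2 + increment) % modulus.
std::int64_t rem_direct(std::int64_t multiplier, std::int64_t seed2,
                        std::int64_t increment, std::int64_t modulus)
{
    assert(modulus != 0);
    return (multiplier * seed2 + increment) % modulus;
}

// Shape produced by this PR: the temp is bound once at the root,
// and the sub/mul/div tree below it only reads tmp1.
std::int64_t rem_sub_mul_div(std::int64_t multiplier, std::int64_t seed2,
                             std::int64_t increment, std::int64_t modulus)
{
    assert(modulus != 0);
    const std::int64_t tmp1 = multiplier * seed2 + increment; // let tmp1 = ... in
    return tmp1 - (tmp1 / modulus) * modulus;                 // tmp1 - (tmp1 / Modulus) * Modulus
}
```

Both functions return the same value for any non-zero modulus as long as the intermediate product does not overflow; the second form just makes the single evaluation of the operand explicit.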

Acceptance Criteria

CI passes
Investigate a possible generic way of lifting GT_COMMAs up to be root nodes instead of leaf nodes.
- This might be a bit harder to do in a generic way. For now, let's not worry about it.

Diffs

ghost · 2022-05-25T02:16:57Z

Tagging subscribers to this area: @JulieLeeMSFT
See info in area-owners.md if you want to be subscribed.

Issue Details

At the moment, this is an experiment.

Author:	TIHan
Assignees:	TIHan
Labels:	`area-CodeGen-coreclr`
Milestone:	-

TIHan · 2022-05-26T00:21:23Z

@dotnet/jit-contrib This is ready. The diffs will show some regressions, but they are not detrimental. The improvements on both x64 and ARM64 look good, especially ARM64.

TIHan · 2022-05-27T17:53:41Z

@kunalspathak @AndyAyersMS - I feel good about this change and it's ready. It should help with optimizing 'mod', since I'll be able to do the transformation before rationalization.

src/coreclr/jit/morph.cpp

AndyAyersMS

Looks good.

(but: you have formatting to fix).

JulieLeeMSFT · 2022-05-27T23:14:07Z

> The diffs will show some regressions, but they are not detrimental. The improvements on both x64 and ARM64 look good, especially ARM64.

@TIHan can you show the perf results in this PR?

TIHan · 2022-05-31T19:47:27Z

Here is the link to the diffs: Diffs

kunalspathak · 2022-06-01T13:41:52Z

Seems there is significant regression introduced on arm64:

        1380 (0.88 % of base) : 4937.dasm - u8rem:TestEntryPoint():int
        1340 (0.80 % of base) : 4917.dasm - i8rem:TestEntryPoint():int
         976 (0.59 % of base) : 264436.dasm - u4rem:TestEntryPoint():int
         128 (0.24 % of base) : 16098.dasm - lclfldrem:Main():int
          16 (4.12 % of base) : 173626.dasm - Test_10w5d.testout1:Func_0_1_1_6_2():double
          16 (4.12 % of base) : 176286.dasm - Test_10w5d.testout1:Func_0_1_1_6_2():double
          12 (0.81 % of base) : 58712.dasm - ILGEN_0x13230206:Method_0xce57b468(long,double,byte,long):int
          12 (1.58 % of base) : 76667.dasm - ILGEN_0x3aa9c940:main():int
          12 (0.36 % of base) : 45954.dasm - ILGEN_0x977f9ed2:Method_0xf6b7353b():float
           8 (4.35 % of base) : 78111.dasm - DevDiv_590771:ILGEN_METHOD(long,ushort,long,int,ushort,long):byte
           4 (1.39 % of base) : 77616.dasm - ILGEN_0x152f1077:Method_0x2763af56(long):int

kunalspathak · 2022-06-01T13:43:32Z

> Seems there is significant regression introduced on arm64:

Just noticed your message about the regression in #69770 (comment).

kunalspathak · 2022-06-01T13:48:32Z

Seems like we are not CSEing the % operation for this:

    public Packet256Tracer(int width, int height)
    {
        if ((width % VectorPacket256.Packet256Size) != 0)
        {
            width += VectorPacket256.Packet256Size - (width % VectorPacket256.Packet256Size);
        }
        Width = width;
        Height = height;
    }

kunalspathak · 2022-06-01T14:04:26Z

src/coreclr/jit/morph.cpp

@@ -13833,6 +13877,28 @@ GenTree* Compiler::fgMorphMultiOp(GenTreeMultiOp* multiOp)
 //    division will be used, in that case this transform allows CSE to
 //    eliminate the redundant div from code like "x = a / 3; y = a % 3;".
 //
+//    Before:
+//        *  RETURN    int


Something doesn't feel right in the Before: tree. Was it meant to be (V00 * V00) % V01? Same in After:.

(V00 * V00) % V01 is intentional.

In the after tree, you see that it takes (V00 * V00) and hoists it out:

// +--*  ASG       int
// |  +--*  LCL_VAR   int    V03 tmp1
// |  \--*  MUL       int
// |     +--*  LCL_VAR   int    V00 arg0
// |     \--*  LCL_VAR   int    V00 arg0

As we discussed offline, change MUL to BINOP and add a note that this applies to any BINOP that is safe to clone.
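The code comment in the hunk above mentions that this transform lets CSE eliminate the redundant division in code like `x = a / 3; y = a % 3;`. A minimal sketch of the resulting shape, assuming truncating integer division; `DivRem` and `div_rem_by_3` are hypothetical names, not anything in the JIT:

```cpp
#include <cassert>

// Hypothetical result type for illustration only.
struct DivRem
{
    int quotient;
    int remainder;
};

// Once a % 3 is expressed as a - (a / 3) * 3, the a / 3 inside it is the
// same subexpression as the one used for x, so a single division suffices.
DivRem div_rem_by_3(int a)
{
    const int q = a / 3;     // shared division (x = a / 3)
    const int r = a - q * 3; // y = a % 3, written as sub/mul over q
    assert(q * 3 + r == a);  // sanity check of the identity
    return DivRem{q, r};
}
```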

TIHan · 2022-06-01T17:51:01Z

> Seems there is significant regression introduced on arm64:

It does look significant, but when I looked at the top regressions, the actual code is really wild with a lot of casts and checked contexts - something you really would not see in user code.

TIHan · 2022-06-01T17:52:46Z

> Seems like we are not CSEing the % operation for this:

That is a good catch. I'm surprised by this one, and even more surprised that it was able to CSE it before, because one of the ops would have a GT_COMMA in it.

kunalspathak

LGTM

kunalspathak · 2022-06-01T18:16:19Z

> Seems like we are not CSEing

My before and after were flipped. I think it looks good.

TIHan · 2022-06-01T18:17:42Z

A note on @kunalspathak's comment regarding the % CSE in this image:
https://user-images.githubusercontent.com/12488060/171420114-f42a8c64-54ca-476f-a9c8-43b13c6775fe.png
It is actually reversed; the left side is the diff whereas the right side was the base. This means that the change actually makes CSE work in this example :)
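For reference, here is a hypothetical C++ rendering of the width adjustment from the `Packet256Tracer` constructor quoted earlier, first as written and then with the remainder computed once, which is the effect CSE aims for. The function names are made up for illustration, and this is a sketch of the source-level idea rather than the code the JIT emits.

```cpp
#include <cassert>

// As written in the source: the remainder appears twice.
int round_up_width_naive(int width, int packetSize)
{
    assert(packetSize > 0);
    if ((width % packetSize) != 0)
    {
        width += packetSize - (width % packetSize);
    }
    return width;
}

// With the common subexpression evaluated once and reused.
int round_up_width_cse(int width, int packetSize)
{
    assert(packetSize > 0);
    const int rem = width % packetSize; // single evaluation of width % packetSize
    if (rem != 0)
    {
        width += packetSize - rem;
    }
    return width;
}
```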

Creating comma temps differently for SubMulDiv morph

cf18c28

ghost assigned TIHan May 25, 2022

dotnet-issue-labeler bot added the area-CodeGen-coreclr label May 25, 2022

TIHan added 3 commits May 25, 2022 11:36

Fixing some formatting. Added comments. Renamed fgMustMakeTemp to fgIsCloneableInvariantOrLocal

c8f74e0

Renaming

1f3f601

Putting back in a check

c6f31d3

TIHan marked this pull request as ready for review May 25, 2022 21:44

Formatting

de80c00

TIHan mentioned this pull request May 26, 2022

ARM64 - Always morph GT_MOD #68885

Merged

2 tasks

AndyAyersMS reviewed May 27, 2022

Adding comments

0ef1ec7

AndyAyersMS approved these changes May 27, 2022

TIHan added 2 commits May 31, 2022 12:50

Formatting

40cc56c

Merge remote-tracking branch 'upstream/main' into submuldiv-change

5e32888

kunalspathak reviewed Jun 1, 2022

kunalspathak approved these changes Jun 1, 2022

TIHan merged commit 5772d54 into dotnet:main Jun 1, 2022

This was referenced Jun 4, 2022

Use gtEffectiveVal for GT_ADD op1 in optCreateAssertion #70228

Merged

JIT: Invalid results/assertion errors with modulo ops #70333

Closed

ghost locked as resolved and limited conversation to collaborators Jul 1, 2022
