JIT: fix case where RBO leads to an invalid CSE #88159

AndyAyersMS · 2023-06-28T20:21:37Z

If phi-based RBO bypasses a block with a memory PHI, it is possible for CSE to find invalid memory-based CSEs. An example of this is seen in the attached test case.

Ideally, CSE would perhaps kill the availability of these CSEs at the point where memory can change, but that seems difficult to arrange. Instead, we mark the bypassed block as one that will not propagate any incoming CSEs, as the failures we know of require CSEs to flow back through this block.
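The effect of marking a block this way can be illustrated with a toy availability analysis. This is only a sketch under stated assumptions, not the JIT's code: `Block`, `CseSet`, `ComputeAvailability`, `BuildLoopExample` and the `noCseIn` field are hypothetical names standing in for the real data structures and the flag discussed in this PR. The toy models a loop whose header (B1) is the bypassed block; memory effects in the loop body are deliberately not modeled, which is why the unmarked version keeps the candidate available around the back edge.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical miniature of a forward "available CSE" dataflow analysis.
constexpr std::size_t kMaxCse = 8;
using CseSet = std::bitset<kMaxCse>;

struct Block
{
    std::vector<std::size_t> preds; // indices of predecessor blocks
    CseSet gen;                     // CSE candidates computed in this block
    bool   noCseIn = false;         // assumed analogue of the "no CSE in" flag
    CseSet in;                      // candidates available on entry
    CseSet out;                     // candidates available on exit
};

// Iterate to a fixed point: in = intersection of predecessor outs, out = in | gen.
// A block with noCseIn set (or with no predecessors) starts from the empty set.
void ComputeAvailability(std::vector<Block>& blocks)
{
    CseSet all;
    all.set();
    for (Block& b : blocks)
    {
        b.in  = all; // optimistic initial state
        b.out = all;
    }

    bool changed = true;
    while (changed)
    {
        changed = false;
        for (Block& b : blocks)
        {
            CseSet in; // empty by default
            if (!b.noCseIn && !b.preds.empty())
            {
                in = all;
                for (std::size_t p : b.preds)
                {
                    assert(p < blocks.size());
                    in &= blocks[p].out;
                }
            }
            const CseSet out = in | b.gen;
            if (in != b.in || out != b.out)
            {
                b.in    = in;
                b.out   = out;
                changed = true;
            }
        }
    }
}

// B0 (entry, computes candidate #0) -> B1 (loop header) -> B2 (loop body) -> B1,
// with B1 -> B3 as the loop exit. markHeader sets the flag on B1.
std::vector<Block> BuildLoopExample(bool markHeader)
{
    std::vector<Block> blocks(4);
    blocks[0].gen.set(0);
    blocks[1].preds   = {0, 2};
    blocks[1].noCseIn = markHeader;
    blocks[2].preds   = {1};
    blocks[3].preds   = {1};
    ComputeAvailability(blocks);
    return blocks;
}
```

With `BuildLoopExample(false)` candidate #0 is available on entry to the loop body; with `BuildLoopExample(true)` nothing flows through the marked header, so the body sees no incoming candidates. That is coarser than killing only the affected CSEs, which matches the "interim fix" framing.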

Fixes #88091.

AndyAyersMS · 2023-06-28T20:25:06Z

@dotnet/jit-contrib PTAL

Consider this an interim fix; I would like to get something plausible in now and perhaps find a better fix down the road.

Not sure who best to tag for a review, so I am going to arbitrarily pick @jakobbotsch, but happy for anyone else to weigh in too.

Diffs should be fairly minimal: either code size increases from the lack of costly CSEs, or code size decreases from the lack of cheap CSEs, leading to less prolog/epilog code.

Diffs

Small size increase on arm64, smaller decrease on x64.

jakobbotsch

Seems like a reasonable surgical fix to me.

ghost · 2023-06-28T21:20:19Z

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

Issue Details

Author:	AndyAyersMS
Assignees:	-
Labels:	`area-CodeGen-coreclr`
Milestone:	-

markples · 2023-06-28T21:48:09Z

src/tests/JIT/Regression/JitBlue/Runtime_88091/Runtime_88091.cs

+    static int Main()
+    {
+        int result = 0;
+        try 
+        {
+            Problem(data);
+            result = 100;
+        }
+        catch (Exception e)
+        {
+            Console.WriteLine($"Failed: {e.Message}");
+            result = -1;
+        }
+        return result;
+    }


Rename this so it is not Main (the convention has been TestEntryPoint, but the name isn't semantically meaningful) and mark it with [Fact] (with using Xunit;).

Optionally (not recommended here), instead make Problem a [Theory] and figure out ClassData for the input.

Optionally (recommended), make the entire method body Problem(data). The infrastructure will handle exceptions, and the 100 isn't required at that point (the method can just return void).

Like so?

using Xunit;
class Runtime_88091
{
    [Fact]
    static int Test() => Problem(data);

Yes, but (I missed this the first time) also make it (the method -and- class) public.

To verify, type "88091" in the test tab of the azdo job for the PR and see if it finds the test. It won't if the test isn't public, doesn't have [Fact], etc. There are a few checks in msbuild and the analyzers, but they are incomplete, and I'm hoping to improve them so that you get direct feedback instead of the current situation.

The infrastructure will handle exceptions

As far as I've seen in #88006, it does mark the test as failed, but it doesn't print the exception in the log in that case.

@markples azdo still doesn't show the test has been run. Anything obviously wrong to you? If not I will probably merge to get the fix in and we can sort it out later.

I'll see if I can fix the exception reporting.

The test is actually running. There seems to be something strange with the azdo test filter where it doesn't show up for a while. I don't know if it is time-based or if it's based on earlier queries. (I also didn't see it but then after looking for "JIT", "Regression", etc., it started showing up. I also see it in the helix log.)

Yeah I see it now. Thanks!

markples · 2023-06-28T21:57:47Z

src/coreclr/jit/redundantbranchopts.cpp

@@ -1588,6 +1595,29 @@ bool Compiler::optJumpThreadCore(JumpThreadInfo& jti)
        }
    }

+    // If this is a phi-based threading, and the block we're bypassing has
+    // a memory phi, and the new successors do not, mark the block with BBF_ALTERED_MEMORY_PHI


I'm guessing BBF_ALTERED_MEMORY_PHI was an earlier name of BBF_NO_CSE_IN?

Which I suppose ties in to my other question: technically this only needs to kill memory CSEs, is that correct? Presumably this is just simpler for the interim fix.

Ah, thanks for spotting that.

Yes, it could just kill memory CSEs, but it is actually tricky to figure out if a VN is memory dependent. If we could do that then we could instead kill memory CSEs in blocks with "memory havoc" and fix the bug a bit more surgically.
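The difference between the two approaches can be sketched as follows. This is a hypothetical illustration only: it assumes a `memoryDependent` mask is already known for the candidates, which is exactly the part described above as tricky to compute, and `CandidateSet`, `FilterCoarse` and `FilterMemoryOnly` are made-up names, not JIT APIs.

```cpp
#include <bitset>
#include <cstddef>

// Hypothetical candidate set; bit i stands for CSE candidate #i.
constexpr std::size_t kCandidates = 8;
using CandidateSet = std::bitset<kCandidates>;

// Interim approach: a flagged block drops every incoming candidate.
inline CandidateSet FilterCoarse(const CandidateSet& incoming, bool noCseIn)
{
    return noCseIn ? CandidateSet() : incoming;
}

// Finer approach (assumes memory dependence is known per candidate):
// a block where memory may change drops only memory-dependent candidates.
inline CandidateSet FilterMemoryOnly(const CandidateSet& incoming,
                                     const CandidateSet& memoryDependent,
                                     bool                memoryHavoc)
{
    return memoryHavoc ? (incoming & ~memoryDependent) : incoming;
}
```

For example, with candidates #0 to #2 incoming and only #1 memory dependent, the coarse filter leaves nothing available, while the memory-only filter would keep #0 and #2.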

dotnet-issue-labeler bot added the `area-CodeGen-coreclr` label (CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI) Jun 28, 2023

BruceForstall requested a review from jakobbotsch June 28, 2023 21:08

jakobbotsch approved these changes Jun 28, 2023

ghost assigned AndyAyersMS Jun 28, 2023

markples reviewed Jun 28, 2023

AndyAyersMS added 2 commits June 28, 2023 16:10

review feedback

9fb95a3

make test method public too

b120674

build-analysis bot mentioned this pull request Jun 29, 2023

windows-x86 release MSBuild_CMake is failing #88172

Closed

AndyAyersMS merged commit bba7a9c into dotnet:main Jun 29, 2023
132 of 135 checks passed

EgorBo mentioned this pull request Jul 6, 2023

[Perf] Windows/arm64: 1 Regression on 6/29/2023 7:46:59 PM dotnet/perf-autofiling-issues#19572

Closed

AndyAyersMS added this to the 8.0.0 milestone Jul 18, 2023

ghost locked as resolved and limited conversation to collaborators Aug 18, 2023
