Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[JIT] Fix - Do not remove CAST nodes on store if the LCL_VAR is address exposed #85734

Merged
merged 8 commits into from
Jul 20, 2023
Merged
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Next Next commit
Do not remove CAST nodes on assignment if the LCL_VAR is a parameter.
  • Loading branch information
TIHan committed May 3, 2023
commit 004b61a0a5c095a88dbe674aff25b22045a69564
11 changes: 8 additions & 3 deletions src/coreclr/jit/morph.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -10189,9 +10189,14 @@ GenTree* Compiler::fgOptimizeCastOnAssignment(GenTreeOp* asg)
if (!effectiveOp1->OperIs(GT_IND, GT_LCL_VAR, GT_LCL_FLD))
return asg;

if (effectiveOp1->OperIs(GT_LCL_VAR) &&
!lvaGetDesc(effectiveOp1->AsLclVarCommon()->GetLclNum())->lvNormalizeOnLoad())
return asg;
if (effectiveOp1->OperIs(GT_LCL_VAR))
{
LclVarDsc* varDsc = lvaGetDesc(effectiveOp1->AsLclVarCommon()->GetLclNum());

// It is not safe to remove the cast for non-NormalizeOnLoad variables or parameters.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should mention the "why" - i. e. that this is a quirk/workaround for VN's handling of NOL and that we know this is in fact not a complete fix.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't that these issues are with VN's handling of NOLs; I could be wrong - it's very hard to determine.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this issue isn't with VN's handling, then what is it with? We should understand the problem before we work around it.

Copy link
Contributor

@SingleAccretion SingleAccretion May 3, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well - we don't know about other bugs related to this, but if there are, surely we would want to find out what they are about as well.

FWIW, example from #85382 looks very much like the VN issue.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should understand the problem before we work around it.

I tried to describe the issue in the description regarding the mov elision.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems likely that a peephole in the emitter would be able to get a large portion of the benefit of that subrange assertion check without having to complicate the NOL semantics with how stores are handled.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did experiment with adding a peephole to look at redundant movzx instructions that would help certain cases.

@jakobbotsch so that opt, I disabled it and #85382 did pass, but not #85081.

Here is what I am using to test:

DOTNET_TieredCompilation=0
DOTNET_JitStressRegs=8
using System.Runtime.CompilerServices;

public class Program
{
    public static long s_15;
    public static sbyte s_17;
    public static ushort s_21 = 36659;
    public static void Main()
    {
        s_15 = ~1;
        var x = (ushort)0;
        bool vr1 = M40(x);
    }

    public static bool M40(ushort arg0)
    {
        for (int var0 = 0; var0 < 2; var0++)
        {
            arg0 = 65535;
            arg0 &= (ushort)(s_15++ >> s_17);
            arg0 %= s_21;
        }

        System.Console.WriteLine(arg0);

        Test.MainTest();

        return false;
    }
}


public class Test
{
    public class Program
    {
        public static short s_2;

        [MethodImpl(MethodImplOptions.NoInlining)]
        public static void Consume(int x) { }

        [MethodImpl(MethodImplOptions.NoInlining)]
        public static int M8(byte arg0)
        {
            s_2 = 1;
            arg0 = (byte)(-s_2);
            var vr1 = arg0 & arg0;
            Consume(vr1);
            return vr1;
        }
    }

    [MethodImpl(MethodImplOptions.NoInlining)]
    public static int MainTest()
    {
        var x = (byte)1;
        var result = Test.Program.M8(x);
        if (result != 255)
        {
            return 0;
        }
        Console.WriteLine("Success");
        return 100;
    }
}

A successful run will print:

28876
Success

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I too took a look at the dump for #85382 and the problem (in my interpretation) is that remorphing invalidates previous assumption as made by and acted on in assertion prop (@jakobbotsch - did you mean this too?):

  1. We have NOL = CAST<short>(...).
  2. Local assertion prop takes a dependency on this, does not add cast for a downstream use.
  3. Remorph invalidates the assumption by removing the cast (or, rather making it into a CAST<int> one).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(@jakobbotsch - did you mean this too?):

Yes, except I think the optimization made by local assertion prop makes the model of NOL locals much more complex than it should be (i.e. it means your explanation of NOL locals above is really not the fully story).
Local assertion prop here does not only rely on what it proves by seeing the subrange assertion. It also relies on normalization done by genUnspillRegIfNeeded, GT_LCL_VAR codegen and args homing to make this work.
In other words: the assertion only meaningfully describes the upper bits if it just so happens that the local ended up being in a 4-byte home for the time between the assignment that created it and the use.

It's very subtle and hard to reason about this in the face of interacting optimizations like this one (another issue I dug up with an interacting optimization: #66624). So if we could get most of the benefits of this optimization through an emitter peephole, and simplify morph to always normalize, then I think that would be much preferable to the current state of things.

Copy link
Member

@jakobbotsch jakobbotsch May 4, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, restricting the fix to parameters is not enough, it also needs to pessimize struct fields. The following example fails in the same way:

public static bool M40()
{
    S s = default;
    for (int var0 = 0; var0 < 2; var0++)
    {
        s.U = 65535;
        s.U &= (ushort)(s_15++ >> s_17);
        s.U %= s_21;
    }

    System.Console.WriteLine(s.U);
    return false;
}

public struct S { public ushort U; }

if (!varDsc->lvNormalizeOnLoad() || varDsc->lvIsParam)
return asg;
}

if (op2->gtOverflow())
return asg;
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -28,7 +28,7 @@ public static int M8(byte arg0)
}
}

[ActiveIssue("https://github.com/dotnet/runtime/issues/85081")]
[Fact]
public static int TestEntryPoint() {
var result = Test.Program.M8(1);
if (result != 255)
Expand Down
51 changes: 51 additions & 0 deletions src/tests/JIT/Regression/JitBlue/Runtime_85382/Runtime_85382.cs
Original file line number Diff line number Diff line change
@@ -0,0 +1,51 @@
// Licensed to the .NET Foundation under one or more agreements.
// The .NET Foundation licenses this file to you under the MIT license.

using System;
using System.Collections;
using System.Collections.Generic;
using System.Runtime.CompilerServices;
using Xunit;

public class Test
{
// This is trying to verify that we properly zero-extend on all platforms.
public class Program
{
public static long s_15;
public static sbyte s_17;
public static ushort s_21 = 36659;
public static int Test()
{
s_15 = ~1;
return M40(0);
}

[MethodImpl(MethodImplOptions.NoInlining)]
public static void Consume(ushort x) { }

[MethodImpl(MethodImplOptions.NoInlining)]
public static int M40(ushort arg0)
{
for (int var0 = 0; var0 < 2; var0++)
{
arg0 = 65535;
arg0 &= (ushort)(s_15++ >> s_17);
arg0 %= s_21;
}

Consume(arg0);

if (arg0 != 28876)
{
return 0;
}
return 100;
}
}

[Fact]
public static int TestEntryPoint() {
return Test.Program.Test();
}
}
Original file line number Diff line number Diff line change
@@ -0,0 +1,8 @@
<Project Sdk="Microsoft.NET.Sdk">
<PropertyGroup>
<Optimize>True</Optimize>
</PropertyGroup>
<ItemGroup>
<Compile Include="$(MSBuildProjectName).cs" />
</ItemGroup>
</Project>