Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Morph Vector.Create(0) to Vector.Zero #63821

Merged
merged 68 commits into from
Feb 17, 2022

Conversation

TIHan
Copy link
Contributor

@TIHan TIHan commented Jan 14, 2022

Addresses #63490

Currently, this PR is dependent on #62933 due to using IsFloatPositiveZero.


Description

This PR will normalize Vector.Create(0) calls to simply become Vector.Zero calls, for example:

Vector64.Create(0);

will turn into

Vector64<int>.Zero;

This happens in the morph phase.

Acceptance Criteria


Some ARM64 diffs

 G_M17803_IG02:        ; gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
-            movi    v16.2s, #0x00
-            cmeq    v16.2s, v0.2s, v16.2s
+            cmeq    v16.2s, v0.2s, #0
             mov     v0.8b, v16.8b

 G_M54162_IG02:        ; gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
-            movi    v16.4s, #0x00
-            cmeq    v16.4s, v0.4s, v16.4s
+            cmeq    v16.4s, v0.4s, #0
             mov     v0.16b, v16.16b

Some x64 diffs
(The diff isn't really a regression, a move got shuffled around, but wanted to mention it)
diff

src/coreclr/jit/lowerxarch.cpp Outdated Show resolved Hide resolved
@TIHan
Copy link
Contributor Author

TIHan commented Jan 25, 2022

This is very close to being ready; only 1 test regressed in perf on x64:

-       vxorps   xmm0, xmm0, xmm0
+       vmovupd  xmm0, xmmword ptr [reloc @RWD96]
        vmovapd  xmmword ptr [rbp-60H], xmm0
-       vxorps   xmm0, xmm0, xmm0
+       vmovupd  xmm0, xmmword ptr [reloc @RWD96]

Will investigate.

Edit: This was fixed by reverting back the lowering change. Seems we need to have similar optimizations in morph and lowering.

@TIHan
Copy link
Contributor Author

TIHan commented Jan 25, 2022

@dotnet/jit-contrib this is ready again.

@tannergooding tannergooding self-requested a review February 8, 2022 05:52
Copy link
Member

@tannergooding tannergooding left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Just a nit and a question around whether a check is still good to keep.

@@ -7651,16 +7651,20 @@ inline bool GenTree::IsSIMDZero() const
//
inline bool GenTree::IsFloatPositiveZero() const
{
return IsCnsFltOrDbl() && !IsCnsNonZeroFltOrDbl();
if (IsCnsFltOrDbl())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is this change? Seems IsCnsNonZeroFltOrDbl already does this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does, but we decided that !IsCnsNonZeroFltOrDbl() is a bit confusing; an explicit implementation seemed to be clearer.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps add a comment explaining why you're not using !IsCns ...

@@ -14349,6 +14432,13 @@ GenTree* Compiler::fgMorphMultiOp(GenTreeMultiOp* multiOp)
}
#endif // defined(FEATURE_HW_INTRINSICS) && defined(TARGET_XARCH)

#ifdef FEATURE_HW_INTRINSICS
if (multiOp->OperIsHWIntrinsic() && !optValnumCSE_phase)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is !optValnumCSE_phase check needed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was added because I was deleting nodes; however, I'm not deleting the nodes anymore and instead using ResetHWIntrinsic. I do feel safer having that guard there in-case more optimizations get added in fgOptimizeHWIntrinsic.

Read more in the discussion: #63821 (comment)

@TIHan
Copy link
Contributor Author

TIHan commented Feb 15, 2022

@jakobbotsch , since you approved this PR: #65028 - and it depends on this one, I'm wondering if you approve of this one as well.

Copy link
Member

@AndyAyersMS AndyAyersMS left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Just a couple minor comments.

@@ -7651,16 +7651,20 @@ inline bool GenTree::IsSIMDZero() const
//
inline bool GenTree::IsFloatPositiveZero() const
{
return IsCnsFltOrDbl() && !IsCnsNonZeroFltOrDbl();
if (IsCnsFltOrDbl())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps add a comment explaining why you're not using !IsCns ...

src/coreclr/jit/gentree.h Outdated Show resolved Hide resolved
@jakobbotsch
Copy link
Member

@jakobbotsch , since you approved this PR: #65028 - and it depends on this one, I'm wondering if you approve of this one as well.

Sorry I didn't get to it yesterday @TIHan, but looks like Andy unblocked you.

Copy link
Member

@jakobbotsch jakobbotsch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM as well.

@TIHan
Copy link
Contributor Author

TIHan commented Feb 16, 2022

No worries, thank you all for the reviews!

@TIHan TIHan merged commit 8c406af into dotnet:main Feb 17, 2022
@TIHan TIHan deleted the vector-64-128-256-Create-to-get_Zero branch February 17, 2022 07:54
@ghost ghost locked as resolved and limited conversation to collaborators Mar 19, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Normalize Vector64/Vector128.Create(0)/Create(0,0, ..) to Vector64/Vector128.Zero
7 participants