Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ARM64-SVE: Fix hwintrinsics flags #107791

Merged
merged 1 commit into from
Sep 13, 2024

Conversation

a74nh
Copy link
Contributor

@a74nh a74nh commented Sep 13, 2024

ARM64-SVE: Fix hwintrinsics flags

Spotted during implementation of #107459

SQADD and SQSUB have two variants, predicated and unpredicated. Currently we only support SVE1, so must use the unpredicated version. Remove RMW and low mask flags, because these are for the predicated version.

ConvertMaskToVector does not have an explicit mask, but ConvertVectorToMask does. Fix these.

Add assert checks to IsLowMaskedOperation

@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Sep 13, 2024
@dotnet-policy-service dotnet-policy-service bot added the community-contribution Indicates that the PR has been added by a community member label Sep 13, 2024
@a74nh a74nh added arm-sve Work related to arm64 SVE/SVE2 support community-contribution Indicates that the PR has been added by a community member and removed community-contribution Indicates that the PR has been added by a community member labels Sep 13, 2024
@a74nh a74nh marked this pull request as ready for review September 13, 2024 12:42
@a74nh
Copy link
Contributor Author

a74nh commented Sep 13, 2024

This is ready.

@dotnet/arm64-contrib @kunalspathak

@a74nh
Copy link
Contributor Author

a74nh commented Sep 13, 2024

Difference between this PR and HEAD is that HEAD will mark some registers as delay free when it doesn't need to.

@a74nh
Copy link
Contributor Author

a74nh commented Sep 13, 2024

Confirmed stress tests of the hwintrinsics tests fully pass.

Copy link
Member

@kunalspathak kunalspathak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a question about HW_Flag_ExplicitMaskedOperation.

@@ -204,7 +204,8 @@ enum HWIntrinsicFlag : unsigned int
// The intrinsic uses a mask in arg1 to select elements present in the result
HW_Flag_ExplicitMaskedOperation = 0x20000,

// The intrinsic uses a mask in arg1 to select elements present in the result, and must use a low register.
// The intrinsic uses a mask in arg1 (either explicitly, embdedd or optionally embedded) to select elements present
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: something to fix in follow-up PR

Suggested change
// The intrinsic uses a mask in arg1 (either explicitly, embdedd or optionally embedded) to select elements present
// The intrinsic uses a mask in arg1 (either explicitly, embedded or optionally embedded) to select elements present

HARDWARE_INTRINSIC(Sve, ConvertMaskToVector, -1, 1, {INS_sve_mov, INS_sve_mov, INS_sve_mov, INS_sve_mov, INS_sve_mov, INS_sve_mov, INS_sve_mov, INS_sve_mov, INS_sve_mov, INS_sve_mov}, HW_Category_Helper, HW_Flag_Scalable|HW_Flag_ExplicitMaskedOperation)
HARDWARE_INTRINSIC(Sve, ConvertVectorToMask, -1, 2, {INS_sve_cmpne, INS_sve_cmpne, INS_sve_cmpne, INS_sve_cmpne, INS_sve_cmpne, INS_sve_cmpne, INS_sve_cmpne, INS_sve_cmpne, INS_sve_cmpne, INS_sve_cmpne}, HW_Category_Helper, HW_Flag_Scalable|HW_Flag_ReturnsPerElementMask|HW_Flag_LowMaskedOperation)
HARDWARE_INTRINSIC(Sve, ConvertMaskToVector, -1, 1, {INS_sve_mov, INS_sve_mov, INS_sve_mov, INS_sve_mov, INS_sve_mov, INS_sve_mov, INS_sve_mov, INS_sve_mov, INS_sve_mov, INS_sve_mov}, HW_Category_Helper, HW_Flag_Scalable)
HARDWARE_INTRINSIC(Sve, ConvertVectorToMask, -1, 2, {INS_sve_cmpne, INS_sve_cmpne, INS_sve_cmpne, INS_sve_cmpne, INS_sve_cmpne, INS_sve_cmpne, INS_sve_cmpne, INS_sve_cmpne, INS_sve_cmpne, INS_sve_cmpne}, HW_Category_Helper, HW_Flag_Scalable|HW_Flag_ExplicitMaskedOperation|HW_Flag_ReturnsPerElementMask|HW_Flag_LowMaskedOperation)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removing HW_Flag_ExplicitMaskedOperation from MaskToVector and putting it VectorToMask makes sense, but wondering why it worked so far? probably we do not care of this flag and can be eliminated?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

spoke offline. As expected, this was never used for these 2 intrinsics, but good to have for correctness.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'll never use it because the converts will never be embedded

@kunalspathak kunalspathak merged commit 924fc2a into dotnet:main Sep 13, 2024
111 checks passed
@kunalspathak
Copy link
Member

/backport to release/9.0

Copy link
Contributor

Started backporting to release/9.0: https://github.com/dotnet/runtime/actions/runs/10853493972

@a74nh a74nh deleted the lsra_addsaturate_github branch September 16, 2024 08:31
jtschuster pushed a commit to jtschuster/runtime that referenced this pull request Sep 17, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI arm-sve Work related to arm64 SVE/SVE2 support community-contribution Indicates that the PR has been added by a community member
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants