-
Notifications
You must be signed in to change notification settings - Fork 4.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Prep CompareInfo for spanification #32385
Prep CompareInfo for spanification #32385
Conversation
src/libraries/Common/src/Interop/Windows/Kernel32/Interop.Globalization.cs
Outdated
Show resolved
Hide resolved
src/libraries/Common/src/Interop/Windows/Kernel32/Interop.Globalization.cs
Show resolved
Hide resolved
src/libraries/System.Private.CoreLib/src/System/Globalization/CompareInfo.Windows.cs
Outdated
Show resolved
Hide resolved
Over the next few days I'll get the perf measurements for the change with SetLastError = true. Been heads-down with other work. I don't expect this to have significant impact on typical API calls to Would it be a strange behavioral difference to check last error only in debug builds? If we went this route I imagine we'd want to |
Yep. |
Aside from the |
src/libraries/Common/src/Interop/Windows/Kernel32/Interop.Globalization.cs
Outdated
Show resolved
Hide resolved
I added minor comments. LGTM otherwise. Thanks for your effort with it. |
I believe all the feedback is now addressed. Waiting for CI to go green. |
internal static extern int FindStringOrdinal( | ||
uint dwFindStringOrdinalFlags, | ||
char* lpStringSource, | ||
int cchSource, | ||
char* lpStringValue, | ||
int cchValue, | ||
int bIgnoreCase); | ||
bool bIgnoreCase); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will add marshaling frame. Does it have effect on performance for small strings?
You can use upper-case BOOL
to get bool that is both zero-overhead and self-describing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You mean apply [MarshalAs(BOOL)]
to avoid the marshaling frame?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, we have an actual internal BOOL
enum. Got it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I almost want to define BOOL.FromBoolean(bool value) => Unsafe.As<bool, BOOL>(ref value)
to avoid the branching at the call sites. But that's probably an issue for a different PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And it does not work. bool and BOOL have different sizes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, Unsafe.As<bool, byte>(ref value)
. The same thing we tried back in dotnet/coreclr#16138 but was killed.
@@ -211,11 +200,22 @@ private unsafe int CompareString(ReadOnlySpan<char> string1, string string2, Com | |||
|
|||
string? localeName = _sortHandle != IntPtr.Zero ? null : _sortName; | |||
|
|||
// CompareStringEx may try to dereference the first character of its input, even if an explicit | |||
// length of 0 is specified. To work around potential AVs we'll always ensure zero-length inputs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Really? What is the input that it happens for?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Create a new C++ application and drop in this code. It will AV the process.
auto result = CompareStringEx(
LOCALE_NAME_INVARIANT,
NORM_LINGUISTIC_CASING,
L"x",
1,
(LPCWCH)0xdeadbeef,
0,
nullptr,
nullptr,
0);
Edit: I reported it to the Windows team and it's being fixed in a future release. But we pessimistically have to work around it for now.
@@ -211,11 +200,22 @@ private unsafe int CompareString(ReadOnlySpan<char> string1, string string2, Com | |||
|
|||
string? localeName = _sortHandle != IntPtr.Zero ? null : _sortName; | |||
|
|||
// CompareStringEx may try to dereference the first character of its input, even if an explicit | |||
// length of 0 is specified. To work around potential AVs we'll always ensure zero-length inputs | |||
// are normalized to a null-terminated empty string. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we have test coverage for all these empty / null string corner cases?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. The unit tests that are part of #1514 (from which this PR was broken out as a more manageable chunk) cover this scenario. In fact, those unit tests are where the problem was originally discovered. :)
Edit: Looks like https://github.com/dotnet/runtime/blob/master/src/libraries/System.Globalization/tests/CompareInfo/CompareInfoTests.Compare.cs also would cover this scenario when the larger "don't early-exit inappropriately" PR comes online.
I'm a little confused as to why CI is failing here. There are some things that look suspicious. From the Helix logs: /private/tmp/helix/working/A96E0923/w/AB700940/e /private/tmp/helix/working/A96E0923/w/AB700940/e
Discovering: System.Memory.Tests (method display = ClassAndMethod, method display options = None)
Discovered: System.Memory.Tests (found 2523 of 2547 test cases)
Starting: System.Memory.Tests (parallel test collections = on, max threads = 4)
Process terminated. Assertion failed.
at System.Globalization.CompareInfo.IndexOfOrdinalCore(ReadOnlySpan`1 source, ReadOnlySpan`1 value, Boolean ignoreCase, Boolean fromBeginning) in /_/src/libraries/System.Private.CoreLib/src/System/Globalization/CompareInfo.Unix.cs:line 48
at System.Globalization.CompareInfo.IndexOfOrdinal(String source, String value, Int32 startIndex, Int32 count, Boolean ignoreCase) in /_/src/libraries/System.Private.CoreLib/src/System/Globalization/CompareInfo.cs:line 1115
at System.Tests.StringTests.IndexOf_AllSubstrings(String s, String value, Int32 startIndex, StringComparison comparison) in /_/src/libraries/Common/tests/Tests/System/StringTests.cs:line 2801 Since the failure is at CompareInfo.Unix.cs, line 48, the assertion that's failing is the According to the stack trace the caller is CompareInfo.cs, line 1115, but there's no relevant code at that line. It's just another There are only three places where we call @jkotas, @tarekgh, any thoughts as to why CI might report such an error? And why the error would occur only on OSX and not any other Unix platform? |
I do not see anything obvious. It can be a JIT bug or data corruption - lots of unsafe code in globalization. |
maybe try to add some messages to the asserts in the involved files which may help give some clue. |
The other idea could be trying repro it on OSX machine |
I'll investigate a little more with the assumption that the stack trace is lying. :) |
This removes many (not all, see #8890) of the string-based p/invoke wrappers in
CompareInfo
and replaces them with span-based wrappers.There's a little more error checking around the actual p/invokes themselves, but otherwise the intent of this PR is that there's no behavioral change to the existing surface area, and there's no new API being exposed.
This PR serves as the shared foundation for upcoming work such as #25428, #27912, and #27935.
This PR also reverts a small optimization we did to
SortKey
in #31779, since I was already touchingSortKey
-related code paths here.