Optimize HashCode.AddBytes for inputs larger than 16 bytes. #70095
I couldn't figure out the best area label to add to this PR. If you have write-permissions please help me learn by adding exactly one area label.
Tagging subscribers to this area: @dotnet/area-system-runtime
Issue Details
This PR avoids the queueing logic in HashCode.AddBytes, instead reading the input and directly updating the hash's state in batches of 16 bytes, if the input is large enough.
Benchmarks show significant improvements:
BenchmarkDotNet=v0.13.1, OS=Windows 10.0.19044.1706 (21H2)
Intel Core i7-7700HQ CPU 2.80GHz (Kaby Lake), 1 CPU, 8 logical and 4 physical cores
.NET SDK=7.0.100-preview.5.22267.11
[Host] : .NET 7.0.0 (7.0.22.26611), X64 RyuJIT
DefaultJob : .NET 7.0.0 (7.0.22.26611), X64 RyuJIT
@stephentoub or somebody else, can you please take a look?
Co-authored-by: Tanner Gooding <tagoo@outlook.com>
Thanks for the guidance on Discord @tannergooding, can you take another look?
I am investigating the test failure.
Test failures are unrelated.
Co-authored-by: Stephen Toub <stoub@microsoft.com>
Thanks.
It's slightly cheaper to call AddBytes before Add(int).
The HashCode.AddBytes method is optimized to read four bytes at a time and feed them to the private HashCode.Add(int) method, but that method is not well optimized for processing large amounts of input. Because the xxHash algorithm works in batches of 16 bytes, the implementation has to queue the integers in separate fields until four of them have accumulated and only then update the hash's state, and that logic involves a couple of branches.
This PR avoids this queueing logic, instead reading the input and directly updating the hash's state in batches of 16 bytes, if the input is large enough.
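To make the difference between the two paths concrete, here is a minimal sketch. It is not the code from this PR or from System.HashCode: the type LaneState, its members, the zero-initialized lanes, and the little-endian reads are all assumptions made for illustration, and tail bytes and finalization are left out. It only shows that feeding four queued integers and consuming one 16-byte block directly end up in the same accumulator state, while the direct path skips the per-integer branching.

```csharp
using System;
using System.Buffers.Binary;
using System.Numerics;

// Hypothetical illustration type; NOT the real System.HashCode.
public struct LaneState
{
    private const uint Prime1 = 2654435761U; // xxHash32 prime 1
    private const uint Prime2 = 2246822519U; // xxHash32 prime 2

    public uint V1, V2, V3, V4;  // four xxHash32-style accumulators (start at 0 in this sketch)
    private uint _q1, _q2, _q3;  // values queued until a full 16-byte block is available
    private int _queued;         // number of values currently queued (0..3)

    // One xxHash32-style round on a single accumulator.
    private static uint Round(uint acc, uint input) =>
        BitOperations.RotateLeft(acc + input * Prime2, 13) * Prime1;

    // Queued path: one 4-byte value per call, branching on the queue position.
    public void AddQueued(uint value)
    {
        switch (_queued)
        {
            case 0: _q1 = value; _queued = 1; break;
            case 1: _q2 = value; _queued = 2; break;
            case 2: _q3 = value; _queued = 3; break;
            default:
                // Fourth value: the block is complete, so update all four lanes.
                V1 = Round(V1, _q1);
                V2 = Round(V2, _q2);
                V3 = Round(V3, _q3);
                V4 = Round(V4, value);
                _queued = 0;
                break;
        }
    }

    // Feeds the input four bytes at a time through the queue (tail bytes ignored here).
    public void AddBytesQueued(ReadOnlySpan<byte> data)
    {
        while (data.Length >= 4)
        {
            AddQueued(BinaryPrimitives.ReadUInt32LittleEndian(data));
            data = data.Slice(4);
        }
    }

    // Batched path: when nothing is queued, consume whole 16-byte blocks directly.
    public void AddBytesBatched(ReadOnlySpan<byte> data)
    {
        if (_queued == 0)
        {
            while (data.Length >= 16)
            {
                V1 = Round(V1, BinaryPrimitives.ReadUInt32LittleEndian(data));
                V2 = Round(V2, BinaryPrimitives.ReadUInt32LittleEndian(data.Slice(4)));
                V3 = Round(V3, BinaryPrimitives.ReadUInt32LittleEndian(data.Slice(8)));
                V4 = Round(V4, BinaryPrimitives.ReadUInt32LittleEndian(data.Slice(12)));
                data = data.Slice(16);
            }
        }
        // Whatever remains (or everything, if values were already queued) takes the queued path.
        AddBytesQueued(data);
    }
}
```

In this sketch the batched loop only runs when the queue is empty; how the actual change handles a non-empty queue is not shown here.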
Benchmarks show significant improvements:
I ran them by copy-pasting the changed HashCode class to a file in my benchmark project and comparing it to .NET's HashCode. The Unaligned benchmarks first add an integer before calling AddBytes (to test the impact of the additional logic that empties the HashCode's cache).
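For readers unfamiliar with the two call patterns being compared, a rough sketch follows. This is not the benchmark code used for the numbers in this PR; the class HashCallPatterns and its method names are hypothetical, and only the public HashCode.Add, HashCode.AddBytes and HashCode.ToHashCode calls are real API.

```csharp
using System;

// Hypothetical helper; not the author's benchmark class.
public static class HashCallPatterns
{
    // AddBytes is the first call, so no value is pending when it starts.
    public static int Aligned(ReadOnlySpan<byte> data)
    {
        var hc = new HashCode();
        hc.AddBytes(data);
        return hc.ToHashCode();
    }

    // One int is added first, so AddBytes starts with a value already pending.
    public static int Unaligned(ReadOnlySpan<byte> data)
    {
        var hc = new HashCode();
        hc.Add(1);
        hc.AddBytes(data);
        return hc.ToHashCode();
    }
}
```

HashCode is seeded randomly per process, so the returned values are only stable within a single run.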