Refactor hashtable to use linear probing #15
Merged
… 14 entries with AVX512, AVX2, SSE4 and a simple linear search support
…vx512,avx2,sse4,loop} functions, not fully tested
…TH to match the naming convention
…used during the search phase to speed up the lookup
…instruction set variable to hint which instruction set should be selected
…e real world they don't get to perform exactly X searches on exactly the same data exactly sequentially, this approach regenerates the data each time and is more realistic
…ir own files, need to properly handle AVX2 compilation flags targeted per src file
…fy the define containing the main bench
…ke_config.*, rename version.cmake module in cmake_config.cmake, improve message logging, add a custom target to automatically update the cmake_config.c file on every build to correctly update the build date/time
… via benches or testing)
… variables to include the re-generated file at build time
…tialized, add a static variable to ensure the code is executed only once
…he compiler will always be able to emit avx2/avx instructions
… instead of function pointers to pick the best implementation option at runtime
…e cases that would cause slowdowns
…e convention for the methods and move the shared code to an external support c file
…he keys for the update bench
…collect hashtable stats and update state code
…y/values onto the bucket and add support for disabling the locks (switch to atomic operations)
This PR contains the refactoring required to switch to a model with 13 slots per bucket, where each slot is identified by one half of the hash. When AVX2/AVX is available, the search among the slots is accelerated and takes 2 SIMD instructions.
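
To illustrate the idea of searching the slots by half hash, here is a minimal sketch, not the code in this PR. It assumes 32-bit half hashes stored in an array padded to 16 lanes, so that two 8-lane AVX2 compares cover all 13 slots. The type `bucket_half_hashes_t` and the functions `match_mask`, `match_mask_avx2` and `match_mask_scalar` are hypothetical names introduced for this example; the scalar loop is used when the compiler is not targeting AVX2.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#if defined(__AVX2__)
#include <immintrin.h>
#endif

#define SLOTS_PER_BUCKET 13 /* from the PR description */
#define LANES_PADDED     16 /* assumption: padded to two 8-lane vectors */

/* Hypothetical layout for illustration only, not the PR's actual struct. */
typedef struct {
    uint32_t half_hashes[LANES_PADDED];
} bucket_half_hashes_t;

/* Portable fallback: returns a bitmask with bit i set if slot i matches. */
static inline uint32_t match_mask_scalar(const bucket_half_hashes_t *b,
                                         uint32_t needle) {
    uint32_t mask = 0;
    for (int i = 0; i < SLOTS_PER_BUCKET; i++) {
        if (b->half_hashes[i] == needle) {
            mask |= 1u << i;
        }
    }
    return mask;
}

#if defined(__AVX2__)
/* AVX2 path: two 8-lane compares cover the 16 padded lanes. */
static inline uint32_t match_mask_avx2(const bucket_half_hashes_t *b,
                                       uint32_t needle) {
    __m256i n  = _mm256_set1_epi32((int)needle);
    __m256i lo = _mm256_loadu_si256((const __m256i *)&b->half_hashes[0]);
    __m256i hi = _mm256_loadu_si256((const __m256i *)&b->half_hashes[8]);
    /* Each compare yields all-ones lanes on equality; movemask packs the
     * sign bit of every 32-bit lane into one bit of the result. */
    uint32_t mlo = (uint32_t)_mm256_movemask_ps(
        _mm256_castsi256_ps(_mm256_cmpeq_epi32(lo, n)));
    uint32_t mhi = (uint32_t)_mm256_movemask_ps(
        _mm256_castsi256_ps(_mm256_cmpeq_epi32(hi, n)));
    /* Drop the 3 padding lanes so only the 13 real slots can match. */
    return (mlo | (mhi << 8)) & ((1u << SLOTS_PER_BUCKET) - 1u);
}
#endif

/* Picks the implementation at compile time in this sketch. */
static inline uint32_t match_mask(const bucket_half_hashes_t *b,
                                  uint32_t needle) {
    assert(b != NULL);
#if defined(__AVX2__)
    return match_mask_avx2(b, needle);
#else
    return match_mask_scalar(b, needle);
#endif
}
```

A caller would walk the set bits of the returned mask and compare the full key for each candidate slot, since two different keys can share the same half hash.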
Each bucket is cache-line aligned, so the number of slots is limited. In the tests and benchmarks, however, even for a hashtable with 133821673 buckets, the number of slots used after inserting 120439505 keys (load factor 0.9) is 9, so the upper limit of 13 can be considered reasonably high.
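
The load factor quoted above is the number of inserted keys divided by the number of buckets. As a quick check of that arithmetic, a small sketch (the helper name `load_factor` is hypothetical, not part of this PR):

```c
#include <assert.h>
#include <stdint.h>

/* Load factor as used in the description above: keys / buckets. */
static inline double load_factor(uint64_t keys, uint64_t buckets) {
    assert(buckets > 0);
    return (double)keys / (double)buckets;
}
```

With the numbers from the description, `load_factor(120439505, 133821673)` comes out just under 0.9.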
This refactoring includes a couple of feature flags to:
This code is based on another branch containing the changes required to use chaining. After an initial implementation and benchmark, I decided to switch back to linear probing, with a slightly different implementation to improve overall performance.
From an initial benchmark on an AMD EPYC 7502P with 128 GB of memory (the additional stats need to be reviewed; they aren't calculated properly)