Refactor hashtable to use linear probing #15

Merged
merged 89 commits into from
Jun 8, 2020
Commits
bfc66a6
Initial implementation of the hash search algorithm in the array with…
danielealbano May 31, 2020
55d89e2
Initial drop of the cachelines-based hashtable implementation
danielealbano May 31, 2020
03e881f
Style fix
danielealbano May 31, 2020
20f7a9d
Initial benchmarking support for the hashtable_support_hash_search_{a…
danielealbano May 31, 2020
248be49
Prefix cmake custom variables with the projectname
danielealbano May 31, 2020
6722f40
Drop more code related to the cacheline-based hashtable
danielealbano May 31, 2020
a80afd5
Rename HASHTABLE_INLINE_KEY_MAX_SIZE to HASHTABLE_KEY_INLINE_MAX_LENG…
danielealbano May 31, 2020
d224862
Add support to store a prefix of the key into the KV structure to be …
danielealbano May 31, 2020
520370b
Import portable snippet cpu library ( https://github.com/nemequ/porta…
danielealbano May 31, 2020
3f4c92c
Drop gcc builtins to make the code more portable, drop the preferred …
danielealbano May 31, 2020
0726c6b
Drop the AVX512 and the SSE linear search algorithm implementation
danielealbano May 31, 2020
30e13c2
Replace avx2 search algorithm implementation with the branchless prov…
danielealbano May 31, 2020
b0a76cb
Drop AVX512 and SSE benches
danielealbano May 31, 2020
494e0f1
Drop unused header
danielealbano Jun 1, 2020
1449d42
Refactor the code to better test the two search implementations, in th…
danielealbano Jun 1, 2020
e300fcb
Split out the avx2/loop hash search algorithm implementation into the…
danielealbano Jun 1, 2020
33bb3e9
Add the ad-hoc compilation flags for the avx2 hash search algorithm im…
danielealbano Jun 1, 2020
08341c0
Improve benchmark structure
danielealbano Jun 1, 2020
b196a46
Rely on the build system to decide if the avx2 variant has to be…
danielealbano Jun 2, 2020
dfe2963
Implement the AVX hash search algorithm variant
danielealbano Jun 2, 2020
32103ac
Add the AVX search algorithm to the auto-selection function
danielealbano Jun 2, 2020
cd87975
Rework the build system
danielealbano Jun 2, 2020
86c8716
Add the AVX version of the search algorithm to the benches and simpli…
danielealbano Jun 2, 2020
22c33d7
Rename TARGET_ARCH to CACHEGRAND_ARCH_TARGET, rename version.* in cma…
danielealbano Jun 2, 2020
71a0928
Expose the CMAKE_BUILD_TYPE variable
danielealbano Jun 2, 2020
aed75ea
Reorganise the cmake_config.h file and fix cpp support (when included…
danielealbano Jun 2, 2020
bf482bf
Split the buildstep invoked as custom dependency target in its own cm…
danielealbano Jun 2, 2020
6343ad6
Update the src cmakefile script to use the new cmake_config_c exposed…
danielealbano Jun 2, 2020
21f4898
Refactor the hash search functions to be able to ignore some matches …
danielealbano Jun 6, 2020
fb5ad77
Select the right hash search implementation when the hashtable is ini…
danielealbano Jun 6, 2020
8a1d00f
Drop some more cachelines related code
danielealbano Jun 6, 2020
7ef46fb
Explicitly drop the volatile attribute
danielealbano Jun 6, 2020
4b3ecf3
Refactor the data structures to support chains of rings containing 14…
danielealbano Jun 6, 2020
0a6e118
Refactor hashtable data initialization code to support the new data s…
danielealbano Jun 6, 2020
472dcdb
Refactor the get operation and the support operations (slot hash / ke…
danielealbano Jun 6, 2020
c7da799
Refactor the hashtable get tests to check the new structures and impl…
danielealbano Jun 6, 2020
324f6c4
Drop more cachelines related code (cachelines_to_probe* / buckets_cou…
danielealbano Jun 6, 2020
699fc98
Update the get op benchmark of the hashtable to support the new struc…
danielealbano Jun 7, 2020
a09ae9a
Style fix
danielealbano Jun 7, 2020
ad2c718
Drop clang support (it's necessary to stick with gcc to emit cmpxchg1…
danielealbano Jun 7, 2020
efea617
Re-organise the bucket structure to have the write_lock using only 1 …
danielealbano Jun 7, 2020
4a3b82d
Add a prefix_key struct in the hashtable_bucket_key_value struct and u…
danielealbano Jun 7, 2020
f99cb38
Fix naming convention (atomic is appended after)
danielealbano Jun 7, 2020
ac770ff
Rename skip_indexes to skip_indexes_mask
danielealbano Jun 7, 2020
3438f46
When compiling code for debugging disable inlining and any kind of o…
danielealbano Jun 7, 2020
cafbfd7
Fix benches dependencies
danielealbano Jun 7, 2020
100a585
Implement bucket locking, unlocking and fetching with write lock (use…
danielealbano Jun 7, 2020
6ae7c03
Refactor the set implementation to rely on the new data structures an…
danielealbano Jun 7, 2020
05fa6ca
Fix atomic invocation
danielealbano Jun 7, 2020
b830c4b
Fix lock check
danielealbano Jun 7, 2020
0f2372e
Add thread yielding if bucket locked or unable to acquire bucket
danielealbano Jun 7, 2020
a93fad5
Expose if the bucket has been initialized upon the invocation of hash…
danielealbano Jun 7, 2020
a93b65b
Fix the mask implemented in the HASHTABLE_BUCKET_WRITE_LOCK_SET macro
danielealbano Jun 7, 2020
352efbf
Add hashtable_support_op testing
danielealbano Jun 7, 2020
eced0ee
Add testing for hashtable_support_hash_half
danielealbano Jun 7, 2020
6adf743
Initialize found to false and pass back to the caller the found bucket
danielealbano Jun 7, 2020
23049a8
Style fix
danielealbano Jun 7, 2020
1cf4cbc
Update the hashtable_op_set tests
danielealbano Jun 7, 2020
293ad18
Fix hashtable_op_set "set 2 slots" test
danielealbano Jun 7, 2020
b60bfc6
Update how the t1ha library is built, includes t1ha0, t1ha1 and t1ha…
danielealbano Jun 8, 2020
b59fab5
Fix t1ha linking
danielealbano Jun 8, 2020
4cc7695
Fix how the avx2/avx search algorithm gets built, if it's on x86_64 t…
danielealbano Jun 8, 2020
d2e10e1
Implement the search algorithms for the 8 slots version and use ifunc…
danielealbano Jun 8, 2020
5e6af66
Switch to use t1ha0, after a number of tests the load factor almost d…
danielealbano Jun 8, 2020
6276d49
Drop pre-check on the write lock
danielealbano Jun 8, 2020
2e4e6e8
Always lock before checking if the chain_first_ring is null to avoid edg…
danielealbano Jun 8, 2020
af18d87
Fix the search or create key (hashtable_support_op_search_key_or_crea…
danielealbano Jun 8, 2020
1623cd1
Improve testing
danielealbano Jun 8, 2020
78ee645
Fix hashtable structures
danielealbano Jun 8, 2020
2a2ab70
Update the hashtable support hash search benchmark to use the new nam…
danielealbano Jun 8, 2020
55bf660
Improve how the load factor is calculated
danielealbano Jun 8, 2020
11d9a70
Drop the code related to the cachelines / loadfactor calculation
danielealbano Jun 8, 2020
cd9e3fc
Use the set_thread_affinity function from bench-support.c
danielealbano Jun 8, 2020
66d3c98
Fix headers
danielealbano Jun 8, 2020
2c4372e
Drops keys pre-generation, better to generate random keys on every it…
danielealbano Jun 8, 2020
ab5104b
Add the ability to collect hashtable stats (load factor, buckets usag…
danielealbano Jun 8, 2020
c879f37
Check for set errors
danielealbano Jun 8, 2020
561154f
Fix index variable type and fix how the hashtable is prefilled with t…
danielealbano Jun 8, 2020
5151349
Run the benches at least 10 times
danielealbano Jun 8, 2020
59d0640
Allow a maximum of 4 threads per core on the benches
danielealbano Jun 8, 2020
3cb9501
Rename the support function in the hashtable set bench and share the …
danielealbano Jun 8, 2020
5248f3a
Enforce -O3 compilation flag when building for non-debug and fix style
danielealbano Jun 8, 2020
33c17c5
Refactor the hashtable to use linear probing, add support to embed ke…
danielealbano Jun 8, 2020
dd05e1e
Fix mmap result check
danielealbano Jun 8, 2020
1fdbf4f
Drop useless fixtures
danielealbano Jun 8, 2020
0b2b1e4
Update the tests to test the linear probing
danielealbano Jun 8, 2020
be352e0
Update the defaults (no locks, key/values embedded)
danielealbano Jun 8, 2020
0f1ef1a
Update the tests to support the new data structure
danielealbano Jun 8, 2020
a891287
Re-enable locks, with atomic ops 6 times slower
danielealbano Jun 8, 2020
Initial drop of the cachelines-based hashtable implementation
danielealbano committed May 31, 2020
commit 55d89e29c1f543af1d5a1a40e46705d1b3006e96
26 changes: 0 additions & 26 deletions src/hashtable/hashtable.h
@@ -23,8 +23,6 @@ extern "C" {
 #define HASHTABLE_INLINE_KEY_MAX_SIZE 23
 #define HASHTABLE_PRIMENUMBERS_COUNT 38
 #define HASHTABLE_PRIMENUMBERS_MAX 4294967291U
-#define HASHTABLE_CONFIG_CACHELINES_TO_PROBE_COUNT HASHTABLE_PRIMENUMBERS_COUNT
-#define HASHTABLE_CONFIG_CACHELINES_PRIMENUMBERS_MAP_SIZE 12
 
 #define HASHTABLE_PRIMENUMBERS_LIST \
     42U, /* not a prime number, but it's the answer! */ \
@@ -66,19 +64,6 @@ extern "C" {
     3429704039U, \
     4294967291U
 
-#define HASHTABLE_CONFIG_CACHELINES_PRIMENUMBERS_MAP \
-    { 42U, 2U }, \
-    { 3389U, 4U }, \
-    { 7639U, 6U }, \
-    { 17203U, 7U }, \
-    { 26813U, 8U }, \
-    { 40213U, 9U }, \
-    { 458377U, 10U }, \
-    { 2320651U, 12U }, \
-    { 17622551U, 16U }, \
-    { 89214403U, 17U }, \
-    { 133821599U, 18U }, \
-    { HASHTABLE_PRIMENUMBERS_MAX, 32U }
 
 typedef uint8_t hashtable_bucket_key_value_flags_t;
 typedef uint32_t hashtable_bucket_hash_t;
@@ -110,22 +95,12 @@ enum {
 #define HASHTABLE_BUCKET_KEY_VALUE_IS_EMPTY(flags) \
     (flags == 0)
 
-/**
- * Configuration of the map between the hashtable size and the cachelines to probe when searching / assigning hashes
- */
-struct hashtable_config_cachelines_to_probe {
-    hashtable_bucket_count_t hashtable_size;
-    uint16_t cachelines_to_probe;
-};
-typedef struct hashtable_config_cachelines_to_probe hashtable_config_cachelines_to_probe_t;
-
 /**
  * Configuration of the hashtable
  */
 struct hashtable_config {
     uint64_t initial_size;
     bool can_auto_resize;
-    hashtable_config_cachelines_to_probe_t cachelines_to_probe[HASHTABLE_CONFIG_CACHELINES_TO_PROBE_COUNT];
 };
 typedef struct hashtable_config hashtable_config_t;
 
@@ -158,7 +133,6 @@ typedef struct hashtable_bucket_key_value hashtable_bucket_key_value_t;
 struct hashtable_data {
     hashtable_bucket_count_t buckets_count;
     hashtable_bucket_count_t buckets_count_real;
-    uint16_t cachelines_to_probe;
     uint64_t t1ha2_seed;
     bool can_be_deleted;
     size_t hashes_size;
12 changes: 0 additions & 12 deletions src/hashtable/hashtable_data.c
@@ -13,18 +13,6 @@
 
 static const char* TAG = "hashtable/data";
 
-uint16_t hashtable_data_cachelines_to_probe_from_buckets_count(
-        hashtable_config_t* hashtable_config,
-        hashtable_bucket_count_t buckets_count) {
-    hashtable_config_cachelines_to_probe_t* list = hashtable_config->cachelines_to_probe;
-    for(uint64_t index = 0; index < HASHTABLE_CONFIG_CACHELINES_TO_PROBE_COUNT; index++) {
-        if (list[index].hashtable_size == buckets_count) {
-            return list[index].cachelines_to_probe;
-        }
-    }
-
-    return 0;
-}
 
 hashtable_data_t* hashtable_data_init(hashtable_bucket_count_t buckets_count, uint16_t cachelines_to_probe) {
     hashtable_bucket_count_t buckets_count_real = 0;
10 changes: 4 additions & 6 deletions src/hashtable/hashtable_data.h
@@ -5,16 +5,14 @@
 extern "C" {
 #endif
 
-uint16_t hashtable_data_cachelines_to_probe_from_buckets_count(
-        hashtable_config_t* hashtable_config,
+hashtable_data_t* hashtable_data_init(
         hashtable_bucket_count_t buckets_count);
 
-hashtable_data_t* hashtable_data_init(
-        hashtable_bucket_count_t buckets_count,
-        uint16_t cachelines_to_probe);
+void hashtable_data_free_buckets(
+        hashtable_data_t* hashtable_data);
 
 void hashtable_data_free(
-        volatile hashtable_data_t* hashtable_data);
+        hashtable_data_t* hashtable_data);
 
 #ifdef __cplusplus
 }
35 changes: 0 additions & 35 deletions src/hashtable/hashtable_support_index.c
@@ -7,43 +7,8 @@
 #include "hashtable_support_index.h"
 #include "hashtable_support_primenumbers.h"
 
-
-uint64_t hashtable_support_index_rounddown_to_cacheline(uint64_t number) {
-    return number -
-        (number % HASHTABLE_HASHES_PER_CACHELINE);
-}
-
-uint64_t hashtable_support_index_roundup_to_cacheline_to_probe(uint64_t number, uint16_t cachelines_to_probe) {
-    return
-        hashtable_support_index_rounddown_to_cacheline(number) +
-        (HASHTABLE_HASHES_PER_CACHELINE * cachelines_to_probe);
-}
-
 hashtable_bucket_index_t hashtable_support_index_from_hash(
         hashtable_bucket_count_t buckets_count,
         hashtable_bucket_hash_t hash) {
     return hashtable_support_primenumbers_mod(hash, buckets_count);
 }
-
-void hashtable_support_index_calculate_neighborhood_from_index(
-        hashtable_bucket_index_t index,
-        uint16_t cachelines_to_probe,
-        hashtable_bucket_index_t *index_neighborhood_begin,
-        hashtable_bucket_index_t *index_neighborhood_end) {
-    *index_neighborhood_begin = hashtable_support_index_rounddown_to_cacheline(index);
-    *index_neighborhood_end = hashtable_support_index_roundup_to_cacheline_to_probe(index, cachelines_to_probe) - 1;
-}
-
-void hashtable_support_index_calculate_neighborhood_from_hash(
-        hashtable_bucket_count_t buckets_count,
-        hashtable_bucket_hash_t hash,
-        uint16_t cachelines_to_probe,
-        hashtable_bucket_index_t *index_neighborhood_begin,
-        hashtable_bucket_index_t *index_neighborhood_end) {
-    uint64_t index = hashtable_support_index_from_hash(buckets_count, hash);
-    hashtable_support_index_calculate_neighborhood_from_index(
-        index,
-        cachelines_to_probe,
-        index_neighborhood_begin,
-        index_neighborhood_end);
-}
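
Reviewer note: for anyone reading this commit without the old tree at hand, the neighborhood arithmetic dropped from hashtable_support_index.c can be sketched standalone as below. This is only an illustration, not code from the repository. It assumes the 64-byte cache line and the 4-byte (uint32_t) hash declared in the removed header, and the names `neighborhood_t`, `rounddown_to_cacheline` and `neighborhood_from_index` are hypothetical stand-ins for the removed functions, which returned the bounds through out-pointers.

```c
#include <assert.h>
#include <stdint.h>

/* Values mirrored from the removed hashtable_support_index.h:
 * a 64-byte cache line filled with 4-byte (uint32_t) hashes. */
#define CACHELINE_LENGTH 64u
#define HASH_SIZE sizeof(uint32_t)
#define HASHES_PER_CACHELINE ((uint64_t)(CACHELINE_LENGTH / HASH_SIZE))

/* Hypothetical container for the inclusive [begin, end] bucket range. */
typedef struct {
    uint64_t begin;
    uint64_t end;
} neighborhood_t;

/* Align a bucket index down to the first hash of its cache line. */
static uint64_t rounddown_to_cacheline(uint64_t index) {
    return index - (index % HASHES_PER_CACHELINE);
}

/* Compute the range of buckets covered by `cachelines_to_probe`
 * cache lines, starting at the cache line that contains `index`. */
static neighborhood_t neighborhood_from_index(
        uint64_t index,
        uint16_t cachelines_to_probe) {
    neighborhood_t n;

    /* A zero-width neighborhood would underflow the end bound. */
    assert(cachelines_to_probe > 0);

    n.begin = rounddown_to_cacheline(index);
    n.end = n.begin + (HASHES_PER_CACHELINE * cachelines_to_probe) - 1;
    return n;
}
```

With these values a cache line holds 16 hashes, so index 37 with cachelines_to_probe = 2 maps to the inclusive range 32..63. Linear probing no longer needs this fixed per-size window, which is presumably why the helpers and the size-to-cachelines map are dropped together in this commit.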
23 changes: 0 additions & 23 deletions src/hashtable/hashtable_support_index.h
@@ -5,33 +5,10 @@
 extern "C" {
 #endif
 
-#define HASHTABLE_HASHES_PER_CACHELINE (int)(HASHTABLE_CACHELINE_LENGTH / HASHTABLE_HASH_SIZE)
-#define HASHTABLE_CACHELINE_LENGTH 64
-#define HASHTABLE_HASH_SIZE sizeof(hashtable_bucket_hash_t)
-
-uint64_t hashtable_support_index_rounddown_to_cacheline(uint64_t number);
-
-uint64_t hashtable_support_index_roundup_to_cacheline_to_probe(
-        uint64_t number,
-        uint16_t cachelines_to_probe);
-
 hashtable_bucket_index_t hashtable_support_index_from_hash(
         hashtable_bucket_count_t buckets_count,
         hashtable_bucket_hash_t hash);
 
-void hashtable_support_index_calculate_neighborhood_from_index(
-        hashtable_bucket_index_t index,
-        uint16_t cachelines_to_probe,
-        hashtable_bucket_index_t *index_neighborhood_begin,
-        hashtable_bucket_index_t *index_neighborhood_end);
-
-void hashtable_support_index_calculate_neighborhood_from_hash(
-        hashtable_bucket_count_t buckets_count,
-        hashtable_bucket_hash_t hash,
-        uint16_t cachelines_to_probe,
-        hashtable_bucket_index_t *index_neighborhood_begin,
-        hashtable_bucket_index_t *index_neighborhood_end);
-
 #ifdef __cplusplus
 }
 #endif