-
Notifications
You must be signed in to change notification settings - Fork 93
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
maint: Remove use of folly/hash/Hash.h #1506
Conversation
auto seed = hash(id_); | ||
boost::hash_combine(seed, version_id_); | ||
boost::hash_combine(seed, creation_ts_); | ||
boost::hash_combine(seed, content_hash_); | ||
boost::hash_combine(seed, key_type_); | ||
boost::hash_combine(seed, index_start_); | ||
boost::hash_combine(seed, index_end_); | ||
hash_ = seed; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the returned values of get_cached_hash
must not change for backward compatibility (e.g. to be able to retrieve segments which have been stored in databases in previous versions).
Is there a way we could test that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think that's the case, however I agree we should check. I believe we have version backwards and forwards compatibility checks, I'll find out where they are and point you at them
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
The backwards and forwards compatibility checks are performed by .github/workflows/persistent_storage.yml
.
c42f0dc
to
0484479
Compare
Signed-off-by: Ian Thomas <ianthomas23@gmail.com>
0484479
to
58f0a75
Compare
I'd like to see some numbers confirming that the performance of the boost implementation is at least comparable to folly's. Should be sufficient to generate a massive vector of |
I've benchmarked as suggested using this code void BM_hash_AtomKey(benchmark::State& state) {
int n = state.range(0);
std::vector<AtomKeyImpl> v;
for (int i = 0; i < n; i++) {
auto k = AtomKeyBuilder().version_id(i).start_index(i*10).end_index(i*100).build<KeyType::TABLE_INDEX>("");
v.push_back(k);
}
for (auto _ : state) {
for (auto k : v) {
benchmark::DoNotOptimize(hash(k));
}
}
}
BENCHMARK(BM_hash_AtomKey)->Args({100'000})->Args({1'000'000})->Args({10'000'000}); Results for release build of
and for this branch (commit
So this PR's use of |
Reference Issues/PRs
Fixes the 5th item of the folly replacement plan, issue #1412.
What does this implement or fix?
This removes the single include of
folly/hash/Hash.h
. The only explicit use of this isfolly::hash::hash_combine
which I have replaced with use ofboost::hash_combine
.It would alternatively be possible to vendor the 3 or so
folly
functions, but I have rejected this idea. One of the functions ishash_128_to_64
and is taken from cityhash which is essentially unmaintained for 11 years. I would rather use the actively maintainedboost
.Any other comments?
There are some tests of
folly::hash::commutative_hash_combine
intest_hash.cpp
, but this function is not used anywhere else in the ArcticDB code base so I have removed the test code as not being relevant.