Improve Query Performance #3
I pushed the benchmark code to kevina/faster-query-benchmark for lack of a better place.
@whyrusleeping let me know if you have any questions. I am not sure how much this will speed up AllKeysChan() in go-ipfs, as there is some additional stuff going on there, but this is a good start in my opinion.
I tested it with go-ipfs and
filepath.Walk() sorts the names and calls lstat() on each entry, both of which are unnecessary when all we need are the key names in random order. For large datastores, avoiding this can improve performance by about 50%.
Note: Readdirnames does not return entries for "." or "..", but it is a good idea to skip dirs/files starting with '.' anyway.
Ah, so this works because we are essentially 'hard coding' the sharding depth. Got it. My only concern is whether this works properly on Windows. Do you have a Windows computer you can test this on?
@whyrusleeping I'm afraid I don't have easy access to a Windows computer to test this; however, I don't see why it would not work.
Tested on a Windows cloud box, seems to work just fine there. LGTM
filepath.Walk() sorts the names and calls lstat() on each entry, both of which are unnecessary when all we need are the key names in random order.
Instead, I rely on the fact that the directory layout is fixed and read each directory directly with Readdirnames.
With a sample datastore with 100000 small keys the speedup is around 2.2:
When there are only 10000 keys the speedup is around 1.3: