Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve Query Perforamance #3

Merged
merged 2 commits into from
Nov 14, 2016
Merged

Improve Query Perforamance #3

merged 2 commits into from
Nov 14, 2016

Conversation

kevina
Copy link
Contributor

@kevina kevina commented Nov 10, 2016

filepath.Walk() sorts the names and calls lstat() on each entry, both of which are unnecessary when all we need are the key names in random order.

Instead I just use the fact that the directory layout is fixed and just use Readdirnames.

With a sample datastore with 100000 small keys the speedup is around 2.2:

100000 ORIG 3443.543417 (+/- 63.378649) ms
100000 OPT 1558.430242 (+/- 28.740522) ms

When there are only 10000 keys the speedup is around 1.3:

10000 ORIG 199.053565 (+/- 12.956540) ms
10000 OPT 151.882705 (+/- 8.869852) ms

@kevina kevina added the status/in-progress In progress label Nov 10, 2016
kevina added a commit that referenced this pull request Nov 10, 2016
@kevina
Copy link
Contributor Author

kevina commented Nov 10, 2016

I pushed the code for the benchmark on the kevina/faster-query-benchmark for lack of a better place.

@coveralls
Copy link

Coverage Status

Changes Unknown when pulling f61390433692be0b411e8f06696e8de04da547e5 on kevina/faster-query into * on master*.

1 similar comment
@coveralls
Copy link

Coverage Status

Changes Unknown when pulling f61390433692be0b411e8f06696e8de04da547e5 on kevina/faster-query into * on master*.

@kevina
Copy link
Contributor Author

kevina commented Nov 10, 2016

@whyrusleeping let me know if you have any questions

I am not sure how much will speedup AllKeysChan() in go-ipfs as there is some additional stuff going on but this is a good start in my opinion.

@Kubuxu Kubuxu self-assigned this Nov 10, 2016
@kevina kevina changed the title Improve Query Perforamcne Improve Query Perforamance Nov 10, 2016
@kevina
Copy link
Contributor Author

kevina commented Nov 10, 2016

I tested it with go-ipfs and TEST_NO_FUSE=1 make test_short passes.

filepath.Walk() sorts the names and calls lstat() on each entry, both of
which are unnecessary when all we need are the key names in random order.

For large datastores can improve the performance by about 50%.
@coveralls
Copy link

Coverage Status

Coverage increased (+0.7%) to 60.069% when pulling 2727908 on kevina/faster-query into f5bb609 on master.

1 similar comment
@coveralls
Copy link

Coverage Status

Coverage increased (+0.7%) to 60.069% when pulling 2727908 on kevina/faster-query into f5bb609 on master.

Note: Readdirnames does not return entries for "." or ".." but it a
good idea to skip dirs/files starting with '.' anyway.
@coveralls
Copy link

Coverage Status

Coverage increased (+1.0%) to 60.41% when pulling ce88f7a on kevina/faster-query into f5bb609 on master.

1 similar comment
@coveralls
Copy link

Coverage Status

Coverage increased (+1.0%) to 60.41% when pulling ce88f7a on kevina/faster-query into f5bb609 on master.

@whyrusleeping
Copy link
Member

Ah, so this works because we essentially 'hard coding' the sharding depth. Got it.

My only concern is whether this works properly on windows. Do you have a windows computer you can test this on?

@kevina
Copy link
Contributor Author

kevina commented Nov 14, 2016

@whyrusleeping I'm afraid I don't have easy access to a windows computer to test this; however, I don't see why it would not work.

@whyrusleeping
Copy link
Member

Tested on a windows cloud box, seems to work just fine there. LGTM

@whyrusleeping whyrusleeping merged commit be314e5 into master Nov 14, 2016
@whyrusleeping whyrusleeping deleted the kevina/faster-query branch November 14, 2016 22:10
@whyrusleeping whyrusleeping removed the status/in-progress In progress label Nov 14, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants