Improve Query Performance #3

kevina · 2016-11-10T05:03:57Z

filepath.Walk() sorts the names and calls lstat() on each entry, both of which are unnecessary when all we need are the key names in no particular order.

Instead, I rely on the fact that the directory layout is fixed and call Readdirnames directly.

With a sample datastore of 100000 small keys, the speedup is around 2.2x:

100000 ORIG 3443.543417 (+/- 63.378649) ms
100000 OPT 1558.430242 (+/- 28.740522) ms

With only 10000 keys, the speedup is around 1.3x:

10000 ORIG 199.053565 (+/- 12.956540) ms
10000 OPT 151.882705 (+/- 8.869852) ms
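For reference, the speedup figures are the ratio of the mean ORIG time to the mean OPT time. The short sketch below redoes that arithmetic from the numbers reported above; the function name `speedup` is made up for this example.

```go
package main

import "fmt"

// speedup returns how many times faster the optimized run is,
// given the mean times of the original and optimized runs.
func speedup(origMs, optMs float64) float64 {
	return origMs / optMs
}

func main() {
	// Mean timings in milliseconds, copied from the benchmark output above.
	fmt.Printf("100000 keys: %.1fx\n", speedup(3443.543417, 1558.430242)) // prints 2.2x
	fmt.Printf("10000 keys: %.1fx\n", speedup(199.053565, 151.882705))    // prints 1.3x
}
```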

kevina · 2016-11-10T05:06:05Z

I pushed the code for the benchmark to the kevina/faster-query-benchmark branch for lack of a better place.

coveralls · 2016-11-10T05:08:03Z

Changes Unknown when pulling f61390433692be0b411e8f06696e8de04da547e5 on kevina/faster-query into * on master*.

kevina · 2016-11-10T05:08:17Z

@whyrusleeping let me know if you have any questions

I am not sure how much this will speed up AllKeysChan() in go-ipfs, as there is some additional work going on there, but this is a good start in my opinion.

kevina · 2016-11-10T16:24:31Z

I tested it with go-ipfs and TEST_NO_FUSE=1 make test_short passes.

filepath.Walk() sorts the names and calls lstat() on each entry, both of which are unnecessary when all we need are the key names in no particular order. For large datastores, avoiding it can improve performance by about 50%.

coveralls · 2016-11-12T02:24:48Z

Coverage increased (+0.7%) to 60.069% when pulling 2727908 on kevina/faster-query into f5bb609 on master.

Note: Readdirnames does not return entries for "." or "..", but it is a good idea to skip dirs/files starting with '.' anyway.
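As an illustration of that note, a filter along the following lines could be applied to the names returned by Readdirnames. This is a sketch only; `visibleNames` is a hypothetical helper, not a function from this repository.

```go
package main

import (
	"fmt"
	"strings"
)

// visibleNames drops every entry whose name starts with '.',
// such as hidden directories or dot-prefixed temporary files.
func visibleNames(names []string) []string {
	out := make([]string, 0, len(names))
	for _, n := range names {
		if strings.HasPrefix(n, ".") {
			continue
		}
		out = append(out, n)
	}
	return out
}

func main() {
	fmt.Println(visibleNames([]string{".temp", "AB", ".git", "CD"})) // prints [AB CD]
}
```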

coveralls · 2016-11-12T21:18:50Z

Coverage increased (+1.0%) to 60.41% when pulling ce88f7a on kevina/faster-query into f5bb609 on master.

whyrusleeping · 2016-11-14T21:42:51Z

Ah, so this works because we are essentially 'hard coding' the sharding depth. Got it.

My only concern is whether this works properly on Windows. Do you have a Windows computer you can test this on?

kevina · 2016-11-14T21:48:25Z

@whyrusleeping I'm afraid I don't have easy access to a Windows computer to test this; however, I don't see why it would not work.

whyrusleeping · 2016-11-14T22:10:45Z

Tested on a Windows cloud box, seems to work just fine there. LGTM

kevina added the status/in-progress In progress label Nov 10, 2016

kevina added a commit that referenced this pull request Nov 10, 2016

Benchmark for pull request #3.

4de0065

Kubuxu self-assigned this Nov 10, 2016

kevina changed the title ~~Improve Query Perforamcne~~ Improve Query Perforamance Nov 10, 2016

Don't use filepath.Walk for Query method.

2727908

kevina force-pushed the kevina/faster-query branch from f613904 to 2727908 Compare November 12, 2016 02:21

Bug fixes and Tweaks.

ce88f7a

kevina mentioned this pull request Nov 13, 2016

Improve performance of Blockstore.AllKeysChan() ipfs/kubo#3376

Closed

whyrusleeping merged commit be314e5 into master Nov 14, 2016

whyrusleeping deleted the kevina/faster-query branch November 14, 2016 22:10

whyrusleeping removed the status/in-progress In progress label Nov 14, 2016
