Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cache RawData for Commit, Tag, & Tree, fixes #6 #28

Merged
merged 1 commit into from
Dec 30, 2018

Conversation

sameer
Copy link
Contributor

@sameer sameer commented Dec 18, 2018

A rawdata []byte member is added to each type. It is computed on the first call to RawData().
Also noticed a few odd indentations and fixed them.

Let me know if there are any changes that need to be made, feedback is welcome.

@sameer sameer mentioned this pull request Dec 18, 2018
commit.go Outdated Show resolved Hide resolved
tree.go Outdated Show resolved Hide resolved
Copy link
Member

@magik6k magik6k left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mind writing a quick benchmark to compare this vs master?

@sameer
Copy link
Contributor Author

sameer commented Dec 18, 2018

Sure, should that be done as a *testing.B in git_test.go, or should I just try it out against the linux kernel or something with a before & after?

@magik6k
Copy link
Member

magik6k commented Dec 18, 2018

Whatever works best for you and you think should show a difference is fine.

@sameer
Copy link
Contributor Author

sameer commented Dec 18, 2018

So I tried to run it against the go-ipfs repo and interestingly it crashes the ipfs daemon with 17:18:41.589 ERROR flatfs: too many open files, retrying in 100ms flatfs.go:378.

Instead I ran it against just ipfs/ipfs and got the following:
After:

real    0m3.028s
user    0m0.610s
sys     0m0.304s

Before (rm -rf'ed .git/ipld):

real    0m3.961s
user    0m0.522s
sys     0m0.281s

I can run it a few more times just to make sure.

@sameer
Copy link
Contributor Author

sameer commented Dec 18, 2018

Ok, I had a chance to do more comprehensive testing. All the files in the repo are already added to my ipfs daemon, so they should return fast. The bench script just does rm -rf .git/ipld/ && git push ipld:: master, and I ran it with perf stat -r 100 ./bench.sh.
Before:

 Performance counter stats for './bench.sh' (100 runs):

            554.14 msec task-clock:u              #    0.623 CPUs utilized            ( +-  0.85% )
                 0      context-switches:u        #    0.000 K/sec
                 0      cpu-migrations:u          #    0.000 K/sec
             2,680      page-faults:u             # 4840.519 M/sec                    ( +-  1.63% )
     1,121,182,658      cycles:u                  # 2025221.108 GHz                   ( +-  0.29% )
     1,280,351,427      instructions:u            #    1.14  insn per cycle           ( +-  0.13% )
       227,040,471      branches:u                # 410109049.439 M/sec               ( +-  0.14% )
         4,581,354      branch-misses:u           #    2.02% of all branches          ( +-  0.32% )

            0.8896 +- 0.0205 seconds time elapsed  ( +-  2.31% )

After:

 Performance counter stats for './bench-fast.sh' (100 runs):

            567.96 msec task-clock:u              #    0.646 CPUs utilized            ( +-  0.86% )
                 0      context-switches:u        #    0.000 K/sec
                 0      cpu-migrations:u          #    0.000 K/sec
             2,678      page-faults:u             # 4719.887 M/sec                    ( +-  1.80% )
     1,131,191,140      cycles:u                  # 1993463.988 GHz                   ( +-  0.33% )
     1,281,825,576      instructions:u            #    1.13  insn per cycle           ( +-  0.14% )
       227,317,086      branches:u                # 400594036.690 M/sec               ( +-  0.16% )
         4,627,311      branch-misses:u           #    2.04% of all branches          ( +-  0.31% )

            0.8797 +- 0.0203 seconds time elapsed  ( +-  2.31% )

There doesn't really seem to be any difference when I did this test. Is there some better way to do one?

@magik6k
Copy link
Member

magik6k commented Dec 19, 2018

Most of this will be I/O. I'd use testing.B, load test data like in https://github.com/ipfs/go-ipld-git/blob/master/git_test.go#L22, and then call RawData in a loop b.N times

@sameer
Copy link
Contributor Author

sameer commented Dec 19, 2018

Ok, I've written a BenchmarkRawData and ran it against master and the PR.
After:

[purisame@arbeitslosigkeit go-ipld-git]$ go test -bench BenchmarkRawData -count 10
goos: linuxs/80/2992c4220de19a90767f3000a79a31b98d0df7fa
goarch: amd64
BenchmarkRawData-4           500           2797081 ns/op
BenchmarkRawData-4           500           2822694 ns/op
BenchmarkRawData-4           500           2896903 ns/op
BenchmarkRawData-4           500           2824651 ns/op
BenchmarkRawData-4           500           3189323 ns/op
BenchmarkRawData-4           500           2895799 ns/op
BenchmarkRawData-4           500           2917160 ns/op
BenchmarkRawData-4           500           2975555 ns/op
BenchmarkRawData-4           500           2938396 ns/op
BenchmarkRawData-4           500           3065296 ns/op
PASS
ok      _/home/purisame/go-ipld-git     17.654s

Before:

[purisame@arbeitslosigkeit go-ipld-git]$ go test -bench BenchmarkRawData -count 10
goos: linuxs/80/2992c4220de19a90767f3000a79a31b98d0df7fa
goarch: amd64
BenchmarkRawData-4           500           3112162 ns/op
BenchmarkRawData-4           500           3089852 ns/op
BenchmarkRawData-4           500           3273750 ns/op
BenchmarkRawData-4           500           3093242 ns/op
BenchmarkRawData-4           500           3080540 ns/op
BenchmarkRawData-4           500           3055796 ns/op
BenchmarkRawData-4           500           3101081 ns/op
BenchmarkRawData-4           500           2995909 ns/op
BenchmarkRawData-4           500           3049960 ns/op
BenchmarkRawData-4           500           3011387 ns/op
PASS
ok      _/home/purisame/go-ipld-git     18.589s

@sameer
Copy link
Contributor Author

sameer commented Dec 26, 2018

Are there any other changes I should make for this PR?

@magik6k magik6k self-requested a review December 26, 2018 12:32
Copy link
Member

@magik6k magik6k left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, thanks!

@magik6k magik6k merged commit 705872d into ipfs:master Dec 30, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants