-
Notifications
You must be signed in to change notification settings - Fork 18
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add benchmarks to compare protocompile perf and memory usage against other compilers #64
Conversation
1a9524d
to
0344eb4
Compare
if !t.e.c.RetainASTs { | ||
file.RemoveAST() | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is one significant improvement for memory usage in protocompile
. It allows the AST to be dropped early in the process, so a large compilation doesn't accumulate a large amount of ASTs pinned on the heap.
@@ -0,0 +1,21 @@ | |||
module github.com/jhump/protocompile/internal/benchmarks |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I made this a separate module so that go test ./...
in the repo root would exclude these.
Also, I didn't want to pull in github.com/jhump/protoreflect
as a dependency in the root go.mod
.
@@ -373,7 +378,7 @@ type srcLocs struct { | |||
protoreflect.SourceLocations | |||
file *result | |||
locs []protoreflect.SourceLocation | |||
index map[interface{}]protoreflect.SourceLocation | |||
index map[interface{}]int |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was another big win for reducing memory usage in protocompile
.
@@ -43,7 +43,12 @@ type packageSymbols struct { | |||
children map[protoreflect.FullName]*packageSymbols | |||
files map[protoreflect.FileDescriptor]struct{} | |||
symbols map[protoreflect.FullName]symbolEntry | |||
exts map[protoreflect.FullName]map[protoreflect.FieldNumber]ast.SourcePos | |||
exts map[extNumber]ast.SourcePos |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looked like low-hanging fruit -- nested maps are often less dense/more wasteful than a single map. But files typically don't have enough extensions for this to be a material source of memory usage. So this wasn't really a noticeable improvement. I left the change in anyway since it made the insertion logic cleaner and the tests a little less verbose.
0344eb4
to
9edddae
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very nice!
use ( | ||
. | ||
./internal/benchmarks | ||
./internal/tools |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice!
…gs); run benchmarks in CI, too
This includes some tweaks to reduce memory usage in
protocompile
, too.Perf Summary:
protocompile
is about 20% faster thanprotoparse
.protocompile
allocates about 14% less memory thanprotoparse
(* caveats below).protocompile
's advantage in speed and memory is more pronounced: 50% faster and about 35% less allocation.protoc
(protocompile
natively; andprotoparse
using "chunks" that are run concurrently like Buf CLI currently does) . Thoughprotoc
is the clear winner in single-threaded mode (not unexpected since it's optimized C++).protoc
benchmark. Since it just shells out to an external executable, no memory is really used inside the Go process.)Caveats:
protocompile
is about 26-28% larger thanprotoparse
.protoparse
does linking in batch, with large tables;protocompile
does it file-at-a-time with tables per file).protocompile
implements;protoparse
does not). But even when not generating source code info, the size difference persists.protocompile
, on the other hand, is less than half as big asprotoparse
.protocompile
can release the AST as it goes (as soon as it's no longer needed, after a file is complete), so the total memory pressure for a large compilation is just the total for descriptor sizes. But inprotoparse
, the total memory pressure for a large compilation is the descriptor sizes plus the AST sizes. With that in mind, the total memory pressure forprotocompile
is a little less than half that ofprotoparse
.This includes my own implementation of code to measure the memory usage of an object. It tries to avoid double-counting memory by using a range tree to track regions of counted memory. I couldn't find anything that does this with some Google-searching... but maybe I missed it? Is there something that exists already like this? The tests also run
runtime.GC()
and print the total heap usage, as a sanity check that the measured size is reasonable (since it is close to the reported total heap usage).I'm using
googleapis
since I thought a large batch of files was a more interesting performance test than trying to do something like a micro-benchmark (which wouldn't be able to exercise the same variety and mix of call paths). This downloads it (much like similar tests in the buf repo).Here's the output of running tests and benchmarks in the new package:
Fixes TCN-545