add benchmarks to compare protocompile perf and memory usage against other compilers #64

Merged
jhump merged 5 commits into main from jh/add-benchmark on Oct 12, 2022

Conversation

jhump (Member) commented on Oct 11, 2022:

This includes some tweaks to reduce memory usage in protocompile, too.

Perf Summary:

  • protocompile is about 20% faster than protoparse.
  • protocompile allocates about 14% less memory than protoparse (* caveats below).
  • When skipping source code info, protocompile's advantage in speed and memory is more pronounced: 50% faster and about 35% less allocation.
  • Due to parallelism, both are faster than protoc (protocompile natively, and protoparse using "chunks" that are run concurrently, as the Buf CLI currently does). Protoc is the clear winner in single-threaded mode, though (not unexpected, since it's optimized C++).
  • (Ignore the memory stats for the protoc benchmark. Since it just shells out to an external executable, no memory is really used inside the Go process.)

Caveats:

  • The actual descriptor result for protocompile is about 26-28% larger than protoparse.
    • This could be the use of many smaller maps instead of fewer larger ones (protoparse does linking in batch, with large tables; protocompile does it file-at-a-time with tables per file).
    • At first I suspected it's the way source code info is modeled in the v2 API (which protocompile implements; protoparse does not). But even when not generating source code info, the size difference persists.
  • The AST representation in protocompile, on the other hand, is less than half as big as protoparse.
  • Also, protocompile can release the AST as it goes (as soon as a file is complete and the AST is no longer needed), so the total memory pressure for a large compilation is just the sum of the descriptor sizes. In protoparse, by contrast, the total memory pressure for a large compilation is the descriptor sizes plus the AST sizes. With that in mind, the total memory pressure for protocompile is a little less than half that of protoparse (see the worked numbers after this list).
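
Rough arithmetic from the memory tests below: protocompile retains about 330 MB of descriptors (with source info), while protoparse retains about 258 MB of descriptors plus about 428 MB of AST, roughly 687 MB in total; 330 / 687 ≈ 0.48, i.e. a little less than half.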

This includes my own implementation of code to measure the memory usage of an object. It tries to avoid double-counting memory by using a range tree to track the regions of memory that have already been counted. I couldn't find an existing library that does this with some Google-searching... but maybe I missed it? Is there something like this that already exists? The tests also run runtime.GC() and print the total heap usage, as a sanity check that the measured size is reasonable (it is close to the reported total heap usage).
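
Roughly the idea, as a minimal sketch (not the actual code in internal/benchmarks, which also handles maps, strings, interfaces, and more):

```go
package main

import (
	"fmt"
	"reflect"
	"sort"
)

// region is a half-open address range [start, end) of memory already counted.
type region struct{ start, end uintptr }

// tape accumulates the size of an object graph while tracking which address
// ranges have already been counted, so shared memory is only counted once.
type tape struct {
	regions []region // sorted by start, non-overlapping
	total   uintptr
}

// insert records [start, end) and reports whether it was not already counted.
// Partial overlaps are treated as already counted, which keeps the sketch simple.
func (t *tape) insert(start, end uintptr) bool {
	i := sort.Search(len(t.regions), func(i int) bool { return t.regions[i].end > start })
	if i < len(t.regions) && t.regions[i].start < end {
		return false
	}
	t.regions = append(t.regions, region{})
	copy(t.regions[i+1:], t.regions[i:])
	t.regions[i] = region{start, end}
	t.total += end - start
	return true
}

// measure walks v with reflection, adding the size of each newly seen block.
// Maps, interfaces, and string backing arrays are omitted to keep this short.
func (t *tape) measure(v reflect.Value) {
	switch v.Kind() {
	case reflect.Ptr:
		if v.IsNil() {
			return
		}
		elem := v.Elem()
		start := v.Pointer()
		if t.insert(start, start+elem.Type().Size()) {
			t.measure(elem)
		}
	case reflect.Struct:
		for i := 0; i < v.NumField(); i++ {
			t.measure(v.Field(i))
		}
	case reflect.Slice:
		if v.Cap() > 0 {
			start := v.Pointer()
			if t.insert(start, start+uintptr(v.Cap())*v.Type().Elem().Size()) {
				for i := 0; i < v.Len(); i++ {
					t.measure(v.Index(i))
				}
			}
		}
	}
}

func main() {
	type node struct {
		next    *node
		payload []byte
	}
	shared := &node{payload: make([]byte, 1024)}
	roots := []*node{{next: shared}, {next: shared}} // shared must be counted once
	var t tape
	t.measure(reflect.ValueOf(roots))
	fmt.Println("approx bytes:", t.total)
}
```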

I'm using googleapis since I thought a large batch of files would be a more interesting performance test than a micro-benchmark (which wouldn't exercise the same variety and mix of call paths). The test setup downloads the googleapis sources at a pinned commit (much like similar tests in the buf repo).
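
The download/expand step works roughly like the sketch below (illustrative only; downloadAndExpand and the destination path are placeholders, not the benchmark package's actual helpers):

```go
package main

import (
	"archive/tar"
	"compress/gzip"
	"io"
	"log"
	"net/http"
	"os"
	"path/filepath"
	"strings"
)

const archiveURL = "https://github.com/googleapis/googleapis/archive/cb6fbe8784479b22af38c09a5039d8983e894566.tar.gz"

// downloadAndExpand fetches the pinned googleapis archive and writes only the
// .proto sources (the inputs the compilers under test need) to destDir.
func downloadAndExpand(destDir string) error {
	resp, err := http.Get(archiveURL)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	gz, err := gzip.NewReader(resp.Body)
	if err != nil {
		return err
	}
	tr := tar.NewReader(gz)
	for {
		hdr, err := tr.Next()
		if err == io.EOF {
			return nil // archive fully processed
		}
		if err != nil {
			return err
		}
		if hdr.Typeflag != tar.TypeReg || !strings.HasSuffix(hdr.Name, ".proto") {
			continue
		}
		path := filepath.Join(destDir, hdr.Name)
		if err := os.MkdirAll(filepath.Dir(path), 0o755); err != nil {
			return err
		}
		data, err := io.ReadAll(tr)
		if err != nil {
			return err
		}
		if err := os.WriteFile(path, data, 0o644); err != nil {
			return err
		}
	}
}

func main() {
	if err := downloadAndExpand("testdata/googleapis"); err != nil {
		log.Fatal(err)
	}
}
```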


Here's the output of running tests and benchmarks in the new package:

> go test . -bench . -benchmem -v
Downloaded https://github.com/googleapis/googleapis/archive/cb6fbe8784479b22af38c09a5039d8983e894566.tar.gz; 6409073 bytes.
Expanded archive into 5362 files.
3944 total source files found in googleapis (29762935 bytes).
=== RUN   TestGoogleapis_Protocompile_Memory
    benchmark_test.go:515: (heap used: 377788896 bytes)
    benchmark_test.go:520: memory used: 330460518 bytes
--- PASS: TestGoogleapis_Protocompile_Memory (3.28s)
=== RUN   TestGoogleapis_Protocompile_Memory_NoSourceInfo
    benchmark_test.go:515: (heap used: 124096232 bytes)
    benchmark_test.go:520: memory used: 99613298 bytes
--- PASS: TestGoogleapis_Protocompile_Memory_NoSourceInfo (2.00s)
=== RUN   TestGoogleapis_Protocompile_ASTMemory
    benchmark_test.go:515: (heap used: 220590128 bytes)
    benchmark_test.go:520: memory used: 210632961 bytes
--- PASS: TestGoogleapis_Protocompile_ASTMemory (3.05s)
=== RUN   TestGoogleapis_Protoparse_Memory
    benchmark_test.go:515: (heap used: 299199496 bytes)
    benchmark_test.go:520: memory used: 258438616 bytes
--- PASS: TestGoogleapis_Protoparse_Memory (5.88s)
=== RUN   TestGoogleapis_Protoparse_Memory_NoSourceInfo
    benchmark_test.go:515: (heap used: 95685520 bytes)
    benchmark_test.go:520: memory used: 81485001 bytes
--- PASS: TestGoogleapis_Protoparse_Memory_NoSourceInfo (4.14s)
=== RUN   TestGoogleapis_Protoparse_ASTMemory
    benchmark_test.go:515: (heap used: 481703504 bytes)
    benchmark_test.go:520: memory used: 428383786 bytes
--- PASS: TestGoogleapis_Protoparse_ASTMemory (5.44s)
=== RUN   TestMeasuringTapeInsert
--- PASS: TestMeasuringTapeInsert (0.00s)
=== RUN   TestMeasuringTapeMeasure
--- PASS: TestMeasuringTapeMeasure (0.02s)
=== RUN   TestNumBuckets
--- PASS: TestNumBuckets (0.00s)
goos: darwin
goarch: arm64
pkg: github.com/jhump/protocompile/internal/benchmarks
BenchmarkGoogleapis_Protocompile
BenchmarkGoogleapis_Protocompile-10                   	       2	 935789396 ns/op	1936122840 B/op	29299976 allocs/op
BenchmarkGoogleapis_Protocompile_Canonical
BenchmarkGoogleapis_Protocompile_Canonical-10         	       2	 941203666 ns/op	2027678888 B/op	29467291 allocs/op
BenchmarkGoogleapis_Protocompile_NoSourceInfo
BenchmarkGoogleapis_Protocompile_NoSourceInfo-10      	       2	 669197125 ns/op	1232668852 B/op	18323988 allocs/op
BenchmarkGoogleapis_Protoparse
BenchmarkGoogleapis_Protoparse-10                     	       1	1153496875 ns/op	2298790208 B/op	40921638 allocs/op
BenchmarkGoogleapis_Protoparse_NoSourceInfo
BenchmarkGoogleapis_Protoparse_NoSourceInfo-10        	       2	1021327542 ns/op	1893843760 B/op	35178044 allocs/op
BenchmarkGoogleapis_Protoc
BenchmarkGoogleapis_Protoc-10                         	       1	1622085958 ns/op	  661088 B/op	      74 allocs/op
BenchmarkGoogleapis_Protoc_NoSourceInfo
BenchmarkGoogleapis_Protoc_NoSourceInfo-10            	       1	1316672500 ns/op	  661072 B/op	      73 allocs/op
BenchmarkGoogleapis_Protocompile_SingleThreaded
BenchmarkGoogleapis_Protocompile_SingleThreaded-10    	       1	3618983167 ns/op	1935788240 B/op	29296333 allocs/op
BenchmarkGoogleapis_Protoparse_SingleThreaded
BenchmarkGoogleapis_Protoparse_SingleThreaded-10      	       1	3141516333 ns/op	2038783304 B/op	36412126 allocs/op
PASS
ok  	github.com/jhump/protocompile/internal/benchmarks	219.629s

Fixes TCN-545

Comment on lines +529 to +531
if !t.e.c.RetainASTs {
file.RemoveAST()
}
jhump (Member, Author):
This is one significant improvement for memory usage in protocompile. It allows the AST to be dropped early in the process, so a large compilation doesn't accumulate a large number of ASTs pinned on the heap.
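
A hypothetical illustration of the idea (not the actual protocompile internals): once a file is fully linked, its AST is no longer needed, so dropping the reference lets the GC reclaim it while the rest of the compilation proceeds.

```go
package example

import (
	"github.com/jhump/protocompile/ast"
	"google.golang.org/protobuf/types/descriptorpb"
)

type compiledFile struct {
	descriptor *descriptorpb.FileDescriptorProto // what callers ultimately need
	fileAST    *ast.FileNode                     // large; only needed until linking completes
}

// removeAST drops the AST so it can be garbage-collected early, unless the
// caller asked to retain ASTs (mirroring the RetainASTs option in the diff above).
func (f *compiledFile) removeAST(retainASTs bool) {
	if !retainASTs {
		f.fileAST = nil
	}
}
```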

@@ -0,0 +1,21 @@
module github.com/jhump/protocompile/internal/benchmarks
jhump (Member, Author):

I made this a separate module so that go test ./... in the repo root would exclude these benchmarks.

Also, I didn't want to pull in github.com/jhump/protoreflect as a dependency in the root go.mod.

@@ -373,7 +378,7 @@ type srcLocs struct {
  protoreflect.SourceLocations
  file *result
  locs []protoreflect.SourceLocation
- index map[interface{}]protoreflect.SourceLocation
+ index map[interface{}]int
jhump (Member, Author):
This was another big win for reducing memory usage in protocompile: the map values are now just indexes into the locs slice, so each entry stores a single int instead of duplicating a full SourceLocation value.
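
A sketch of what lookups then do (byKey is a hypothetical helper name, not the actual method):

```go
// The index map stores positions into the locs slice; the SourceLocation
// structs themselves are stored exactly once, in locs.
func (s *srcLocs) byKey(key interface{}) (protoreflect.SourceLocation, bool) {
	idx, ok := s.index[key]
	if !ok {
		return protoreflect.SourceLocation{}, false
	}
	return s.locs[idx], true
}
```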

@@ -43,7 +43,12 @@ type packageSymbols struct {
  children map[protoreflect.FullName]*packageSymbols
  files map[protoreflect.FileDescriptor]struct{}
  symbols map[protoreflect.FullName]symbolEntry
- exts map[protoreflect.FullName]map[protoreflect.FieldNumber]ast.SourcePos
+ exts map[extNumber]ast.SourcePos
jhump (Member, Author):
This looked like low-hanging fruit -- nested maps are often less dense/more wasteful than a single map. But files typically don't have enough extensions for this to be a material source of memory usage. So this wasn't really a noticeable improvement. I left the change in anyway since it made the insertion logic cleaner and the tests a little less verbose.
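
A plausible shape for the flattened key (the field names here are assumptions, not the actual definition):

```go
// Combining the extendee name and field number into one comparable struct
// replaces the nested map-of-maps with a single map, so recording an
// extension is one write: s.exts[extNumber{extendee, number}] = pos
type extNumber struct {
	extendee protoreflect.FullName
	number   protoreflect.FieldNumber
}
```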

pkwarren (Member) left a comment:

Very nice!

use (
.
./internal/benchmarks
./internal/tools
pkwarren (Member):
Nice!

(Five additional review comments on internal/benchmarks/benchmark_test.go were marked outdated and resolved.)
jhump enabled auto-merge (squash) on October 12, 2022 at 20:40
jhump merged commit 36825b1 into main on Oct 12, 2022
jhump deleted the jh/add-benchmark branch on October 12, 2022 at 20:44