Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Retain custom options as known fields, even with custom descriptor.proto #109

Merged

Conversation

jhump
Copy link
Member

@jhump jhump commented Mar 6, 2023

An issue was introduced in #97 -- if a custom descriptor.proto is used, then custom options may come across in the final compilation result as unrecognized fields. Here's a test failure in buf: link. The first commit in this PR is a repro case.

The issue was two-fold: the options in question come from public imports, and this only happens when an override set of options are used. This happens because we have to use a dynamic message to represent the override definition of the options. But the result is ultimately a descriptorpb.FileDescriptorProto, which needs a reference to the generated type (not a dynamic message). So in this case, we serialize the interpreted options message to bytes and then de-serialize that into the generated type.

This one's a little complicated. But the root issue is that we used linker.ResolverFromFile as if it correctly implemented visibility rules, but it didn't. So when de-serializing the bytes in the above conversion, it failed to recognize the custom option, that was visible to the file via a public import.

So now, it does implement visibility rules exactly: so it can resolve any symbol that is visible to the given file. That means any symbol defined in that file or in an import OR in a transitive public import.

We already had logic in here for the visibility rules, which were used for other parts of linking and type lookup. So I've now updated it so that everything runs on the same rails -- e.g. the resolver implementation in linker.ResolverFromFile calls into the same code that already correctly implemented the rules.

While consolidating, there were suddenly some unused/unnecessary methods. For example, the handful of Resolve* methods on the linker.Result were now redundant. So I've removed them. There was also an unexported method on the linker.Files interface, which was now unused so I removed it, too. I know these aren't backwards-compatible changes, but this is still pre-v1.0 and these are interfaces which users are highly unlikely to actually implement.

While going through this, I also took an opportunity to do some cleanup after #97. Instead of treating an override descriptor.proto as if it were an implicit dependency used by the linker, it's now an option for the options interpreter (which is the only place that uses it, so seemed more natural).

@jhump
Copy link
Member Author

jhump commented Mar 6, 2023

I also re-ran the benchmarks, to make sure the new code movement isn't reducing performance.

It turns out, it is not slower -- in fact it's a little faster (see original benchmarks here). However, all of the benchmarks are faster, and protocompile's has improved the least :/

Looking at the details of the protoparse executions, I have to guess that these are due to improvements in the Go compiler/runtime (these benchmarks were done with Go 1.20; the original's used Go 1.18). Even protoc has improved (in fact, it's improved by the largest margin). For that, I assume it's because I've updated to v22.0 (#100), and I know they've been making steady, small performance improvements in the parser/compile but also the C++ runtime, which indirectly speeds up the compiler, too (like porting code to use abseil instead of std, for better performance in hash maps and string manipulation).

Downloaded https://github.com/googleapis/googleapis/archive/cb6fbe8784479b22af38c09a5039d8983e894566.tar.gz; 6409073 bytes (3.482882625s).
Expanded archive into 5362 files.
3944 total source files found in googleapis (29762935 bytes).
goos: darwin
goarch: arm64
pkg: github.com/jhump/protocompile/internal/benchmarks
BenchmarkGoogleapisProtocompile
BenchmarkGoogleapisProtocompile-10                  	       2	 922729500 ns/op	1935503024 B/op	29267110 allocs/op
BenchmarkGoogleapisProtocompileCanonical
BenchmarkGoogleapisProtocompileCanonical-10         	       2	 931095688 ns/op	2027661604 B/op	29433792 allocs/op
BenchmarkGoogleapisProtocompileNoSourceInfo
BenchmarkGoogleapisProtocompileNoSourceInfo-10      	       2	 649715230 ns/op	1232107460 B/op	18292827 allocs/op
BenchmarkGoogleapisProtoparse
BenchmarkGoogleapisProtoparse-10                    	       1	1020831041 ns/op	2272219336 B/op	39747730 allocs/op
BenchmarkGoogleapisProtoparseNoSourceInfo
BenchmarkGoogleapisProtoparseNoSourceInfo-10        	       2	 886360104 ns/op	1867318468 B/op	34004294 allocs/op
BenchmarkGoogleapisProtoc
BenchmarkGoogleapisProtoc-10                        	       1	1357743917 ns/op	  662512 B/op	      80 allocs/op
BenchmarkGoogleapisProtocNoSourceInfo
BenchmarkGoogleapisProtocNoSourceInfo-10            	       1	1150266792 ns/op	  662448 B/op	      75 allocs/op
BenchmarkGoogleapisProtocompileSingleThreaded
BenchmarkGoogleapisProtocompileSingleThreaded-10    	       1	3402589584 ns/op	1935459440 B/op	29267355 allocs/op
BenchmarkGoogleapisProtoparseSingleThreaded
BenchmarkGoogleapisProtoparseSingleThreaded-10      	       1	2741090916 ns/op	2013953192 B/op	35342027 allocs/op
PASS
ok  	github.com/jhump/protocompile/internal/benchmarks	43.140s

@jhump jhump force-pushed the jh/retain-custom-options-even-with-custom-descriptor.proto branch from befc07f to 7f2f67d Compare March 6, 2023 22:21
@jhump jhump changed the title Retain custom options as known fields, even with custom descriptor.proto Retain custom options as known fields, even with custom descriptor.proto Mar 6, 2023
this allows us to remove several now-redundant methods on linker.Result
this also cleans up the way override descriptor.proto files were handled:
  - Instead of it being modeled as an implicit dependency and given to
    linker (which could then use it to lookup options type), it is now
    handled (more appopriately IMO) as an option for the interpreter.

this features numerous interface changes so is not technically backwards
compatible; but all interfaces changed were items that were not intended
to be implemented outside of the repo (and this is still pre-v1.0)
@jhump jhump force-pushed the jh/retain-custom-options-even-with-custom-descriptor.proto branch from 7f2f67d to 47cbde1 Compare March 6, 2023 22:28
@jhump jhump requested a review from pkwarren March 6, 2023 22:45
linker/resolve.go Outdated Show resolved Hide resolved
options/options.go Outdated Show resolved Hide resolved
@jhump jhump merged commit 49c8099 into main Mar 7, 2023
@jhump jhump deleted the jh/retain-custom-options-even-with-custom-descriptor.proto branch March 7, 2023 17:35
jhump added a commit that referenced this pull request Mar 7, 2023
…115)

In #109, trying to be clever, I moved the place where we block for an
override descriptor.proto to "just in time", before options are
interpreted. That seemed like a potential latency improvement to allow
compilation of the override descriptor.proto to overlap with the linking
of the file in question, if the file in question didn't actually
directly depend on descriptor.proto.

However, it blocks for the other result while still holding the semaphore!
So this opens up the possibility of deadlock 🤦. A possible remedy is to
release the semaphore before blocking, and re-acquiring when done.
But that seems too complicated/brittle: we already have code that must
release/re-acquire the semaphore, so it seems best to just keep that in
one place. Which means reverting the "clever" change and just waiting
for an override descriptor.proto while waiting for all other imports.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants