Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Differentiate Action Cache objects by instance name #148

Closed
wants to merge 1 commit into from

Conversation

gjasny
Copy link
Contributor

@gjasny gjasny commented Dec 28, 2019

Hello,

Every now and then a Clang update in our toolchain breaks the remote caching (usually fiffreneces in system headers like clang version numbers jumping from 7.0.0 to 7.1.0). A suggested work-around is to separate the caches.

For http this is possible with --remote_cache=http://localhost:8080/foo and for grpc by setting --remote_instance_name=bar.

For now the instance name only affects the action cache because in my use case the CAS content cannot differ.

I'd appreciate some feedback. (I'm also a golang beginner - please bear with me)

Thanks,
Gregor

Fixes: #15


return result, nil
}

func (s *grpcServer) maybeInline(inline bool, slice *[]byte, digest **pb.Digest, inlinedSoFar *int64) error {
func (s *grpcServer) maybeInline(instanceName string, inline bool, slice *[]byte, digest **pb.Digest, inlinedSoFar *int64) error {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess the instanceName arg is useless, since these items are stored in the CAS and this patch doesn't separate CAS entries by instance.

Comment on lines +117 to +118
if !s.cache.Contains(cache.CAS, instanceName, (*digest).Hash) {
err := s.cache.Put(cache.CAS, instanceName, (*digest).Hash, (*digest).SizeBytes,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May as well pass an empty string for the instance, I guess.

@@ -90,8 +90,8 @@ func (s *grpcServer) BatchUpdateBlobs(ctx context.Context,
// Return the data for a blob, or an error. If the blob was not
// found, the returned error is errBlobNotFound. Only use this
// function when it's OK to buffer the entire blob in memory.
func (s *grpcServer) getBlobData(hash string, size int64) ([]byte, error) {
rdr, sizeBytes, err := s.cache.Get(cache.CAS, hash)
func (s *grpcServer) getBlobData(instanceName string, hash string, size int64) ([]byte, error) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The instanceName arg seems to be useless here too, maybe just skip it and use the empty string?

@@ -115,26 +115,26 @@ func (s *grpcServer) getBlobData(hash string, size int64) ([]byte, error) {
return data, rdr.Close()
}

func (s *grpcServer) getBlobResponse(digest *pb.Digest) *pb.BatchReadBlobsResponse_Response {
func (s *grpcServer) getBlobResponse(instanceName string, digest *pb.Digest) *pb.BatchReadBlobsResponse_Response {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto.

return nil
}

// Attempt to populate `resp`. Return errors for invalid requests, but
// otherwise attempt to return as many blobs as possible.
func (s *grpcServer) fillDirectories(resp *pb.GetTreeResponse, dir *pb.Directory, errorPrefix string) error {
func (s *grpcServer) fillDirectories(instanceName string, resp *pb.GetTreeResponse, dir *pb.Directory, errorPrefix string) error {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto.

@@ -67,35 +68,36 @@ func NewHTTPCache(cache cache.Cache, accessLogger cache.Logger, errorLogger cach
}

// Parse cache artifact information from the request URL
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we'd be returning two different strings, I think we should document what they are here now.

@mostynb
Copy link
Collaborator

mostynb commented Dec 29, 2019

Thanks for the PR! I will try to look through it properly in the next few days.

Every now and then a Clang update in our toolchain breaks the remote caching (usually fiffreneces in system headers like clang version numbers jumping from 7.0.0 to 7.1.0). A suggested work-around is to separate the caches.

Can you clarify what you mean by a toolchain update that "breaks the remote caching", and describe the suggested workaround a bit?

If you're referring to the fact that you get a bunch of cache misses after changing the toolchain, that is normal and expected.

@kragniz
Copy link
Contributor

kragniz commented Aug 19, 2020

@gjasny @mostynb is anyone currently planning on rebasing and picking this up again? It'd be great to be able to separate caches with this

@mostynb
Copy link
Collaborator

mostynb commented Aug 19, 2020

I don't have plans to work on this myself because I don't understand the use case well enough, and it could potentially be a breaking change for some users.

If a toolchain is updated, shouldn't that change the corresponding Action messages, and therefore use different ActionCache lookup hashes?

@kragniz
Copy link
Contributor

kragniz commented Aug 20, 2020

@mostynb the use case is pretty much just to work around bazelbuild/bazel#4558 without having to deploy multiple instances of bazel-remote. I imagine the behaviour change would be opt-in via a flag.

@mostynb
Copy link
Collaborator

mostynb commented Aug 20, 2020

So to summarise:

  • In some bazel client setups, actions can use unspecified input files, which means two different actions that produce different results can use the same ActionCache hash lookup key.
  • As a workaround, the client configuration can be made to use a cache instance name that is generated from the unspecified inputs (eg by hashing relevant toolchain names/versions), if the cache server can be configured to give different action cache hits based on the instance name.

Is that right? If so, I think I can come up with a smaller opt-in feature to allow this.

@kragniz
Copy link
Contributor

kragniz commented Aug 21, 2020

Exactly right, yes.

@mostynb
Copy link
Collaborator

mostynb commented Aug 25, 2020

@kragniz: please take a look at #339

@mostynb mostynb closed this Aug 28, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Allow per project cache
3 participants