Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

REv3 idea: CAS.ExpandTree()? #140

Open
EdSchouten opened this issue Jun 1, 2020 · 16 comments
Open

REv3 idea: CAS.ExpandTree()? #140

EdSchouten opened this issue Jun 1, 2020 · 16 comments

Comments

@EdSchouten
Copy link
Collaborator

Build systems like Bazel currently treat directory outputs as being opaque. There is no support for addressing individual files contained within.

If Bazel wants to pass on a directory output from one action to another, it currently needs to download a Tree object from the CAS and expand it into individual Directory objects. These then need to be reuploaded into the CAS.

This roundtrip could be avoided by adding a dedicated RPC of the shape:

rpc ExpandTree(Digest) returns (Digest);

In this case, the argument corresponds to the digest of the Tree object. The return value corresponds to the digest of the root directory contained within the Tree object. The call should also bump the TTL on all resulting CAS objects.

Do we consider this to be enough of a problem that it makes sense to add such an RPC?

@ulfjack
Copy link
Collaborator

ulfjack commented Jun 1, 2020

That seems to be predicated on the assumption that Bazel is not going to allow addressing individual files in the future. That assumption does not seem correct to me. I'm not sure if there are features in Bazel around this right now, but it seems like a perfectly safe bet that there will be, even if Bazel team isn't currently planning to add them.

The reason for this is simple: adding unnecessary input files to an action is bad for build performance (reduces caching & increases action costs). This is particularly true the larger the output directory tree is, i.e., the more you're trying to save network bandwidth with this feature, the worse performance gets. Therefore, I would bet on Bazel adding features to allow addressing individual files or filtering files or something else along these lines that requires Bazel to have a local list of files.

@EdSchouten
Copy link
Collaborator Author

That seems to be predicated on the assumption that Bazel is not going to allow addressing individual files in the future.

No, that's not correct. It is predicated around the fact that in many cases directories declared using declare_directory() are simply passed on in literal form. If Bazel actually needs a subset of a Tree (which would likely only be used in certain cases), it is still capable of downloading the Tree object and carving it up as needed.

Even in the case where a Tree needs to be carved up, it might still be preferable to call ExpandTree() to at least store all subtrees that end up being used in literal form.

@ulfjack
Copy link
Collaborator

ulfjack commented Jun 1, 2020

My position is that rules will (should?) ~always use a subset of larger trees in order to avoid the performance costs. Maybe that's overly optimistic, but even so, if that's the recommended best practice, why spend significant effort on optimizing the other case?

If the tree needs to be carved up, why would you have ExpandTree as a separate API compared to requiring action execution to generate / store all the subtrees?

@EdSchouten
Copy link
Collaborator Author

So you're proposing that GetActionResult() is made slightly stronger, in that it only returns results if all directories stored in a Tree referenced by the ActionResult are also stored in the CAS as separate Directory objects? That sounds reasonable.

@peterebden
Copy link
Contributor

I've thought about suggesting something along these lines for a v3 proposal before, or at least sketching out the general problem of the asymmetry between the messages describing outputs & inputs and the work required to convert one to the other when another action wants to consume them.

So generally in favour, but would also question whether there is something we could do to make the two more similar (although that might be too big a change even for v3?).

@mostynb
Copy link
Contributor

mostynb commented Jun 1, 2020

So you're proposing that GetActionResult() is made slightly stronger, in that it only returns results if all directories stored in a Tree referenced by the ActionResult are also stored in the CAS as separate Directory objects? That sounds reasonable.

I assumed that this was already required, but looking at the GetActionResult() comment again now it is slightly ambiguous.

@EricBurnett
Copy link
Collaborator

EricBurnett commented Jun 1, 2020 via email

@EdSchouten
Copy link
Collaborator Author

To that end, we might instead consider updating OutputDirectory (Edit: mistakenly said DirectoryNode) to have both the digest of the tree (for access to the flattened form) and the digest of the directory node itself (for reuse as an input)? Then it'd be clear that if the server gives you a directory root digest back, all nodes must be in the CAS too by definition, and you can reuse it as-is. This could be optional in V2; upgraded to Required in V3?

👍 That sounds like a good idea.

(I'd also love to get rid of GetTree entirely; can't remember offhand why it was added back though and if it still has a purpose).

👍 to that as well. I can imagine that something like GetTree() makes sense to have on a worker where you might want to gather a full input root. Still, there's no need to let that be part of the build client -> cluster protocol.

@juergbi
Copy link
Contributor

juergbi commented Nov 12, 2020

Could it make sense to unify Directory and Tree by extending the DirectoryNode proto to support an embedded Directory message as alternative to the digest? This would allow clients and servers/workers to flatten directory trees where it makes sense without forcing a completely flat directory for large trees (as is currently the case for OutputDirectory). It would still allow reuse of complete Directory messages via digest.

This would provide consistency across all operations and we should no longer need GetTree() or ExpandTree(), i.e., the CAS API wouldn't need any knowledge of the Directory proto. I think this would be a net benefit with regards to REAPI complexity overall.

A possible disadvantage is that the flexibility would no longer provide a single canonical representation of a directory hierarchy. However, I'm not sure whether this would be a big issue in practice as long as individual clients use deterministic rules to decide embedding of subdirectories.

This would also address #165 and #170.

Any thoughts? Are there significant disadvantages I'm overlooking?

@illicitonion
Copy link
Contributor

I like it - the canonical digest of a Directory could still be as it is today, with implementations which inline subdirectories for transmission just needing to pay the extra cost of serialising them differently for digesting than for transmission.

@EdSchouten
Copy link
Collaborator Author

EdSchouten commented Nov 12, 2020

Are there significant disadvantages I'm overlooking?

The downside of such an approach is that there will no longer be a unique representation of a directory hierarchy. You can construct multiple identical directory layouts, but encode them differently:

  • Either pack all Directory objects together,
  • Store them all separately,
  • Have some directories packed, while others are stored separately.

It sacrifices the idea that a node in the Merkle tree of given contents has a unique digest. Couldn't this lead to lower AC hit rates if not all of the tools are in full agreement when to (un)pack directories in input roots?

Another issue with this approach is that the decision when to (un)pack is made unilaterally. The worker decides whether everything needs to be stored separately or together, whereas the client may have a different opinion on what would have been smart, looking at the build overall. For example, when Builds without the Bytes is turned off, it may be more preferable to receive a Tree-like object. When Builds without the Bytes is turned on, a plain Directory would have been sufficed (and be preferable).

This would provide consistency across all operations and we should no longer need GetTree() or ExpandTree(), i.e., the CAS API wouldn't need any knowledge of the Directory proto.

I would like us to go in a different direction here. If we ever want to support encryption (#133), it already makes sense to decompose CAS objects into two parts:

  • One encrypted part (e.g., file contents, file names in directory listings),
  • One decrypted list of references to child objects (e.g., for action -> command, command -> directory, directory -> directory, directory -> file references).

In such a model it would make a lot of sense to always store objects separate; never use something like Tree. The GetTree() call could be replaced by a generic GetTransitiveClosure() call to expand a full tree of CAS objects.

@illicitonion
Copy link
Contributor

Are there significant disadvantages I'm overlooking?

The downside of such an approach is that there will no longer be a unique representation of a directory hierarchy. You can construct multiple identical directory layouts, but encode them differently:

  • Either pack all Directory objects together,
  • Store them all separately,
  • Have some directories packed, while others are stored separately.

Couldn't this lead to lower AC hit rates if not all of the tools are in full agreement when to (un)pack directories in input roots?

I think this is fine as long as the digest algorithm is well specified. If there's a canonical form for encoding them for digest, and effectively we're just hiding hint information in a separate un-digested field, these worries go away. (Of course, some new worries around correctness of implementation now appear!)

@EdSchouten
Copy link
Collaborator Author

EdSchouten commented Nov 12, 2020

I think this is fine as long as the digest algorithm is well specified. If there's a canonical form for encoding them for digest, and effectively we're just hiding hint information in a separate un-digested field, these worries go away. (Of course, some new worries around correctness of implementation now appear!)

Let's assume there is a canonical form for digest. Now let's assume a worker gives us an output directory of which we don't know whether it is in canonical form. Wouldn't that require the client to download the full output directory hierarchy, so that it may be canonicalised by the client?

The entire goal of this PR was to introduce a mechanism where clients do not need to download full directory output hierarchies.

@EricBurnett
Copy link
Collaborator

The downside of such an approach is that there will no longer be a unique representation of a directory hierarchy. You can construct multiple identical directory layouts, but encode them differently:

I'd like to break this anyways, personally - see e.g. #141 (comment) . Fundamentally, what I've realized we actually need is approximately "For clients producing overlapping directory trees, they can easily get high cache hit rates" . But that's more about clients being internally-deterministic and well behaved - I don't think we actually see ~any cross-tool caching, and I don't think it actually matters. So we should optimize for e.g. allowing Bazel to produce trees in a way efficient for it to compose, and reasonably compact for expanders. And we should allow BuildBarn to do the same. But we don't really benefit from those trees being the same as far as I can tell, and I don't think most consumers of trees care if they're in canonical form or not? At most, they might care that the same tree from the same source is deterministic, to e.g. easily say "this output tree has not changed". But that's a lot more limited than what we enforce now, and requiring a fully-expanded non-overlapping tree seems to be holding us back in a lot of contexts.

To Daniel's point, there is still a canonical form that can optionally be used where needed. But I expect that's actually the exception and not the norm.

@ulfjack
Copy link
Collaborator

ulfjack commented Nov 12, 2020

I'm generally in favor of a more lenient approach and also in favor of unifying the two tree representations.

I wonder how difficult it would be for a RE system to detect cases where a client sends the same tree in different representations. Such a mechanism could be used to detect misbehaving clients.

@EricBurnett
Copy link
Collaborator

I wonder how difficult it would be for a RE system to detect cases where a client sends the same tree in different representations. Such a mechanism could be used to detect misbehaving clients.

Servers could canonicalize a sampling of trees, and report on rates of collisions? (E.g. X client produced tree Y which canonicalized to Z; different from previously seen tree Y' from client X that also canonicalized to Z). Might be a bit hard to expose in an actionable way, but shouldn't be too bad I don't think as the server could reasonably track the invocation ID / action ID from this sampling and be able to point directly back to examples where it was concretely observed. But one thing we'd have to learn observationally is what a "reasonable" rate of collisions is.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

7 participants