Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimise repository rule generation with re-entrancy #16162

Closed
Silic0nS0ldier opened this issue Aug 25, 2022 · 5 comments
Closed

Optimise repository rule generation with re-entrancy #16162

Silic0nS0ldier opened this issue Aug 25, 2022 · 5 comments
Labels
P4 This is either out of scope or we don't have bandwidth to review a PR. (No assignee) team-ExternalDeps External dependency handling, remote repositiories, WORKSPACE file. type: feature request

Comments

@Silic0nS0ldier
Copy link
Contributor

Description of the feature request:

Instead of restarting repository rule fetching when a new dependency is discovered, pause execution and resume once it is available.

Subject to how this is achieved, it should provide;

  • A major performance boost to many repository rule implementations
  • More predictable performance
  • More options for optimisation (using dependency references early is now only necessary to better parallelise work, meaning there are more options for efficiently dealing with optional dependencies)

What underlying problem are you trying to solve with this feature?

Repository rules taking a long time to complete due to late dependency references triggering restarts. Starlark implementations can be optimised by touching dependencies early on, but in practice this is easily overlooked as;

  • Ruleset tests typically operate on a much smaller scale than the projects they are used for. As a result releases are often not optimised (because ruleset CI checks seem fine) and regressions are easy to introduce (because their impact cannot be captured by CI checks).
  • It isn't clear from the Starlark source if a given API call will trigger a restart, profiling and debugging is often necessary to identify the bottleneck (along with being aware of the restart behaviour, so the appropriate pattern is sought out).

Further, refactoring to address the performance penalty of restarts is itself a major time sink.

  • The refactoring is often non-trivial. Especially if abstractions are used.
  • Sometimes referencing dependencies early is a trade-off. Implementations which deal with external elements like OS integrations may have dependencies that are conditionally required, and can only be determined at a late stage of execution. The only way to avoid an expensive restart here is to eagerly fetch optional dependencies, which could hurt performance more overall (odds are this would be discovered after refactoring, and consider that different projects may present different performance characteristics).

Which operating system are you running Bazel on?

macOS 12.5.1

What is the output of bazel info release?

release 5.1.1- (@non-git)

If bazel info release returns development version or (@non-git), tell us how you built Bazel.

With Nix, via https://github.com/NixOS/nixpkgs/archive/13e0d337037b3f59eccbbdf3bc1fe7b1e55c93fd.tar.gz

What's the output of git remote get-url origin; git rev-parse master; git rev-parse HEAD ?

(private repo)

Have you found anything relevant by searching the web?

Any other information, logs, or outputs that you want to share?

No response

@sgowroji sgowroji added type: feature request team-ExternalDeps External dependency handling, remote repositiories, WORKSPACE file. untriaged labels Aug 25, 2022
@fmeum
Copy link
Collaborator

fmeum commented Aug 25, 2022

I briefly started working on this once using the relatively new env.getState.

It essentially comes down to adding some form of continuation support to the Starlark interpreter. Unfortunately, the "Starlark stack" is a mix of custom data structures and the actual Java stack - since Java doesn't have continuations, one would have to maintain additional data structures that record the state of the Java stack for all operations that aren't safe to redo (e.g. that modify local variables).

Since the Starlark interpreter is so important for Bazel's performance, I think the main challenge would be implementing all of this while not regressing performance for the case of non-reentrant evaluation (see #15594 for a much smaller PR that already raised valid performance concerns). I fear that this might not be possible, which would then make a reentrant fork of the Starlark interpreter necessary that would only be used to evaluate repository rules.

That said, if anybody wants to work on this, let me know. But it's possible that an actual compiled representation for Starlark would make this effort a lot simpler.

@Wyverald
Copy link
Member

As @fmeum noted, this is actually not very easy to achieve at all, not without changes to the Starlark interpreter that allows us to store (and restart from) a call stack and a program counter. Either that, or something like yield in Java would work, but that's not likely to be available until 2023.

@Wyverald Wyverald added P4 This is either out of scope or we don't have bandwidth to review a PR. (No assignee) and removed untriaged labels Aug 25, 2022
@fmeum
Copy link
Collaborator

fmeum commented Sep 5, 2023

This is a dupe of #10515, which is actively being worked on.

@Silic0nS0ldier
Copy link
Contributor Author

Clever approach taken in 1590dbc. I'll keep this open for the moment since its been noted that the thread pool cap of 100 (plus OS threading limits, which virtual will probably address once implemented) can be a problem vs. the existing restart-prone implementation.

@Wyverald
Copy link
Member

Closing in favor of #10515.

@Wyverald Wyverald closed this as not planned Won't fix, can't repro, duplicate, stale Jan 17, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
P4 This is either out of scope or we don't have bandwidth to review a PR. (No assignee) team-ExternalDeps External dependency handling, remote repositiories, WORKSPACE file. type: feature request
Projects
None yet
Development

No branches or pull requests

4 participants