Bazel 7 Skymeld Regression #22233
Comments
Thanks for filing the bug. Could you please try out Bazel at HEAD to see if this is still an issue? We recently made some changes to this part of the code.
Hmm, part of skymeld's action conflict check is sequential for correctness reasons, but this should not affect action execution. What's your The P/S: How large is your local action cache size?
I built bazel from source on this sha
When executing on RBE (which this does) we set it to
Not sure I understand your question, so correct me if I'm wrong, but I think you are talking about the disk cache? I haven't mentioned it, but I tried quite a few things to narrow it down to skymeld, one of which was disabling the local disk cache with
Could you please provide additional info on your machine? (OS, CPU type, memory, number of cores, ...), as well as the bazel JSON trace profile that you showed above?
The terminology is a bit conflated, but I meant the action cache i.e. Another issue possibly related to the chain of "acquiring semaphore" is #20478. Can you please try your build again with
CPU OS
I want to be careful about that because it contains some private information like project names. Let me see if I can redact it in some way.
It still reproduces with
Same here.
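On sharing timing data without exposing project names: a minimal sketch, assuming the JSON trace profile follows the Chrome trace event layout (a `traceEvents` list whose complete events have `ph` set to `"X"`, a `cat` category, and a `dur` in microseconds). It sums time per category and never prints event names. The function names are hypothetical, and whether the category strings themselves are safe to share is an assumption to check before posting.

```python
import gzip
import json
from collections import defaultdict


def load_trace_events(path):
    """Load events from a Chrome-trace-style JSON profile (optionally gzipped)."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt", encoding="utf-8") as f:
        data = json.load(f)
    # The trace format allows either {"traceEvents": [...]} or a bare list.
    return data["traceEvents"] if isinstance(data, dict) else data


def total_ms_by_category(events):
    """Sum the duration of complete ("X") events per category, in milliseconds."""
    totals = defaultdict(float)
    for event in events:
        if event.get("ph") != "X":
            continue  # skip metadata, counters, and instant events
        # "dur" is assumed to be in microseconds, as in the Chrome trace format.
        totals[event.get("cat", "unknown")] += event.get("dur", 0) / 1000.0
    return dict(totals)


def print_summary(path):
    """Print categories sorted by total time, largest first (no event names)."""
    totals = total_ms_by_category(load_trace_events(path))
    for cat, ms in sorted(totals.items(), key=lambda kv: -kv[1]):
        print(f"{ms:12.1f} ms  {cat}")
```

Running `print_summary` on the profile from the 6.5.0 build and again on the 7.0.2 build would give two short tables that can be compared side by side.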
Could you also provide a diff of the time spent on
Another question: did you see any action conflict being reported in your build? You can either look in the terminal output, or do a run with
Description of the bug:
Executing
bazel test //... --keep_going
on our repository when upgrading from Bazel 6.5.0 to 7.0.2 (or 7.1.1) causes extreme slowdown.
Appending
--experimental_merged_skyframe_analysis_execution=false
seems to fix the issue, which suggests it's a problem with Skymeld. Note that we had Skymeld enabled on Bazel 6.5.0 without problems.
The issue manifests as very slow, one-at-a-time execution with
checking cached actions
showing up on the CLI.
Bazel profile is full of these:
and traces like:
Looking at the JFR it seems like the time is spent here with a lot of map/string operations:
The problem seems to be stemming from this semaphore here.
We have a relatively large repo:
Analyzing: 175831 targets (74369 packages loaded, 764940 targets configured)
and I had to fix a bunch of things to move us to Bazel 7, so the bisect is not working well because the fixes are not backward compatible. I'm hoping this is enough for someone to know what's going on, but I'm happy to devote more time to identifying the root cause.
Which category does this issue belong to?
Performance
What's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.
I do not have a reproduction right now. Our repository is large and private, I'm happy to try and create something but would like to get some feedback on the issue first.
Which operating system are you running Bazel on?
Linux
What is the output of bazel info release?
release 7.0.2
If bazel info release returns "development version" or "(@non-git)", tell us how you built Bazel.
No response
What's the output of git remote get-url origin; git rev-parse HEAD?
Is this a regression? If yes, please try to identify the Bazel commit where the bug was introduced.
Yes, 6.5.0 seems to work correctly whereas 7.0.2 fails, but I was unable to bisect cleanly.
Have you found anything relevant by searching the web?
There seem to be a bunch of reports around "checking cached actions", but they seem unrelated (like #21712).
I tried to search for Skymeld-specific issues but only found the coverage issue here, which would also be a blocker for us upgrading but is unrelated.
Any other information, logs, or outputs that you want to share?
No response