Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ci] Add OS version to action key in CI #14697

Merged
merged 1 commit into from
Sep 1, 2022

Conversation

milesdai
Copy link
Contributor

@milesdai milesdai commented Sep 1, 2022

This commit injects the OS version into a paramter that is used as part
of the action key computation. This should prevent artifacts from
different OS's from colliding in the remote cache.

Signed-off-by: Miles Dai milesdai@google.com

Fixes #14695

@milesdai milesdai marked this pull request as draft September 1, 2022 02:10
@milesdai milesdai requested review from a-will and removed request for rswarbrick September 1, 2022 02:10
@milesdai milesdai added Component:CI Continuous Integration (Azure Pipelines & Co.) SW:Build System labels Sep 1, 2022
@milesdai
Copy link
Contributor Author

milesdai commented Sep 1, 2022

Based on the discussion here, I decided to use the --remote_default_platform_properties for this.

@a-will
Copy link
Contributor

a-will commented Sep 1, 2022

Based on the discussion here, I decided to use the --remote_default_platform_properties for this.

Apparently, it's...

DEPRECATED. Please use exec_properties attribute instead.

Oh wait, sorry, this is a different thing. Now I better read again. 😂

But ostensibly, it looks like --remote_default_exec_properties would touch the desired key, whereas this one does the deprecated remote_execution_properties.

@alphan
Copy link
Contributor

alphan commented Sep 1, 2022

DEPRECATED. Please use exec_properties attribute instead.

The message of bazelbuild/bazel@6ec75ab (the commit that adds the new option and the deprecation message) might help.

@milesdai milesdai force-pushed the cache-sharding branch 2 times, most recently from fd34e57 to 0873aa0 Compare September 1, 2022 14:05
@milesdai
Copy link
Contributor Author

milesdai commented Sep 1, 2022

Changed to --remote_default_exec_properties - thanks all for the help! @alphan I'm adding you as a reviewer as well just to get some more eyes on the bash.

@milesdai milesdai requested a review from alphan September 1, 2022 14:07
@milesdai
Copy link
Contributor Author

milesdai commented Sep 1, 2022

The previous run did in fact work -- it's just that the FPGA build took almost 40 minutes. So on the plus side, it turns out we are indeed making good use of the caching. We'll just need this build to go through to re-seed the cache since we have effectively just invalidated everything in there now.

@a-will
Copy link
Contributor

a-will commented Sep 1, 2022

The previous run did in fact work -- it's just that the FPGA build took almost 40 minutes. So on the plus side, it turns out we are indeed making good use of the caching. We'll just need this build to go through to re-seed the cache since we have effectively just invalidated everything in there now.

Do we also need to make sure the opentitan platform has exec_properties defined (i.e. not picking up defaults), so the device-side stuff can still be cached and shared?

(It looks like that would be done in the crt repo, though.)

Copy link
Contributor

@alphan alphan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @milesdai, just minor nits.

ci/bazelisk.sh Outdated Show resolved Hide resolved
ci/bazelisk.sh Show resolved Hide resolved
This commit injects the OS version into a paramter that is used as part
of the action key computation. This should prevent artifacts from
different OS's from colliding in the remote cache.

Signed-off-by: Miles Dai <milesdai@google.com>
@milesdai
Copy link
Contributor Author

milesdai commented Sep 1, 2022

Do we also need to make sure the opentitan platform has exec_properties defined (i.e. not picking up defaults), so the device-side stuff can still be cached and shared?

(It looks like that would be done in the crt repo, though.)

I haven't familiarized myself with Bazel's platform features yet. If this doesn't affect correctness, should we get this merged first, and look into this later?

@milesdai milesdai marked this pull request as ready for review September 1, 2022 15:15
@a-will
Copy link
Contributor

a-will commented Sep 1, 2022

Do we also need to make sure the opentitan platform has exec_properties defined (i.e. not picking up defaults), so the device-side stuff can still be cached and shared?
(It looks like that would be done in the crt repo, though.)

I haven't familiarized myself with Bazel's platform features yet. If this doesn't affect correctness, should we get this merged first, and look into this later?

Aye, the comment was mostly a discussion about to-do items after this PR. 🙂

@martin-lueker
Copy link
Contributor

Thanks @a-will for pointing me to this. @milesdai: Is this ready to go in?

@milesdai
Copy link
Contributor Author

milesdai commented Sep 1, 2022

Yes, I think this is good to go

@alphan alphan merged commit a41964f into lowRISC:master Sep 1, 2022
Copy link
Contributor

@drewmacrae drewmacrae left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The description calls the use of an incompatible artifact a collision, when really it's just incompatible because of an undocumented dependency on the system, right? Collisions make me think of a crowded cache with too small a key, whereas here the key just doesn't include the information that's sufficient to find an appropriate artifact. (The description can be edited and corrected after the fact, the commit message shouldn't be corrected)

# Inject the OS version into a parameter used in the action key computation to
# avoid collisions between different operating systems in the caches.
# See #14695 for more information.
echo "build --remote_default_exec_properties=OSVersion=\"$(lsb_release -ds)\"" >> "${GCP_BAZELRC}"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a plan to make the cache available to non-ci consumers like it was before by putting this into .bazelrc somewhere? If not could we document that the chache doesn't support non-ci users? I'd been using it experimentally until July, without enough instrumentation to see performance gains, but if I'm not mistaken, this is going to prevent cache hits.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes this will prevent cache hits from anyone not running Bazel with ci/bazelisk.sh. On our website, we don't actually specify that the remote cache should be used. Rather, a disk cache is recommended.

@milesdai milesdai deleted the cache-sharding branch September 14, 2022 20:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Component:CI Continuous Integration (Azure Pipelines & Co.) SW:Build System
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[ci] Ubuntu version mismatch between machines causes false cache hits
5 participants