Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

more fixes for image consolidation / bazel caching #6713

Merged
merged 3 commits into from
Feb 9, 2018

Conversation

BenTheElder
Copy link
Member

@BenTheElder BenTheElder commented Feb 8, 2018

/assign @ixdy @krzyzacy

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Feb 8, 2018
@BenTheElder
Copy link
Member Author

I'm still committed to refactoring how we make the bazel cache RCs once (if?) we move nursery out of experimental, but for now yes: more bash 🙃

@BenTheElder
Copy link
Member Author

/area images
/area bazel

@BenTheElder BenTheElder force-pushed the consolidation branch 2 times, most recently from 262f704 to b4a4707 Compare February 8, 2018 08:45
GCC_VERSION=$(package_version gcc)
echo "build --action_env=__CACHE_GCC_VERSION__=${GCC_VERSION}" >> $1
PYTHON_VERSION=$(package_version python)
echo "build --action_env=__CACHE_PYTHON_VERSION__=${PYTHON_VERSION}" >> $1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI if you want to stop writing echo ... >> $1 all the time, you can probably do something like

cat >$1 <<EOF
# this is the default for recent releases but we set it explicitly
# since this is the only hash our cache supports
startup --host_jvm_args=-Dbazel.DigestFunction=sha256
# use remote caching for all the things
build --experimental_remote_spawn_cache
# point it at our http cache ...
build --remote_http_cache=${CACHE_HOST}
...
EOF

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm the downside to this is that I have to deindent the block being written and the whole block highlights as a string 🤷‍♂️

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

other alternatives:

  1. wrap everything in a subshell ( ), e.g.
(
  echo "blah"
) > $1
  1. Rather than having make_bazel_rc take an argument, just pipe it to a file below, e.g.
make_bazel_rc > "${HOME}/.bazelrc"
make_bazel_rc > /etc/bazel.bazelrc

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(don't consider these comments blocking, though)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I'll go with one of these. Blocking this myself on verifying that I can disrupt the cache with GCC version locally and then fix it with these args. Will do one of these first.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@ixdy
Copy link
Member

ixdy commented Feb 8, 2018

what do you mean by "clean test-infra logs"?

@BenTheElder
Copy link
Member Author

what do you mean by "clean test-infra logs"?

We get a lot of spam in the new builds: https://k8s-gubernator.appspot.com/build/kubernetes-jenkins/pr-logs/pull/test-infra/6713/pull-test-infra-bazel/12990/?log#log which I suspect is due to missing tooling, not much else has changed so my best bet right now is outdated python tools.

@k8s-ci-robot k8s-ci-robot added lgtm "Looks good to me", indicates that a PR is ready to be merged. do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. labels Feb 8, 2018
Copy link
Member

@krzyzacy krzyzacy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm
/hold

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: BenTheElder, krzyzacy

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these OWNERS Files:
  • OWNERS [BenTheElder,krzyzacy]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@@ -19,6 +19,10 @@
# TODO(bentheelder): verify that this works and move it into the images
CACHE_HOST="http://bazel-cache.default:8080"

package_version () {
dpkg-query --showformat='${Version}' --show $1
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is correct for getting a package version but not sufficient for tracking what $CC actually points to if we install other versions etc. updating in a bit with something smarter based on intentionally breaking the cache with planter and a local instance of the remote cache.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe $CC --version? contains spaces and strings, so needs quoting.

I'm also curious how stdlib includes are accounted for in caching.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also curious: does the caching take into account the --crosstool and --cpu flags?

GCC_VERSION=$(package_version gcc)
echo "build --action_env=__CACHE_GCC_VERSION__=${GCC_VERSION}" >> $1
PYTHON_VERSION=$(package_version python)
echo "build --action_env=__CACHE_PYTHON_VERSION__=${PYTHON_VERSION}" >> $1
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you also need build --experimental_strict_action_env for these to take effect.. https://docs.bazel.build/versions/master/command-line-reference.html

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nope..

@BenTheElder
Copy link
Member Author

BenTheElder commented Feb 9, 2018

Today I proved locally that if you switch GCC versions system-wide inside planter and try to reuse the remote cache you get poisoned results because bazel doesn't have any way to track the host binaries that aren't actually registered with bazel currently.

The proposed work around of --action_env=SOME_BOGUS_ENV=<tool versioning> almost works but some actions don't seem to pass any env (tracked with bazel build //... -s to output the calls) so test-infra will incorrectly cache and choke on missing headers while compiling protobuf cc files.

My next thought is to use the tooling tracking to create a meta "cache version" key we insert into bazelrc like build --remote_http_cache=http://${CACHE_DOMAIN}:${CACHE_PORT}/${CACHE_VERSION}. The docs for the protocol suggest that prefixing the cache HTTP path is normal / supported.

https://docs.bazel.build/versions/master/remote-caching.html#http-caching-protocol

@k8s-ci-robot k8s-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Feb 9, 2018
@BenTheElder
Copy link
Member Author

/hold cancel
current changes should go in anyhow

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Feb 9, 2018
@k8s-ci-robot k8s-ci-robot merged commit 303e45f into kubernetes:master Feb 9, 2018
@BenTheElder BenTheElder deleted the consolidation branch February 9, 2018 07:00
@BenTheElder
Copy link
Member Author

Following up with @buchgr on this idea, will experiment locally a bit.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/bazel area/images cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants