Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support subresource integrity format for download() checksums. #7208

Conversation

jmillikin
Copy link
Contributor

@jmillikin jmillikin commented Jan 22, 2019

Closes #4881

Notes for the reviewer:

  • I will add tests (in the existing bash repository rule tests) after preliminary approval of the implementation.
  • Supported digests are the minimal recommended set from https://www.w3.org/TR/SRI/: SHA-256, SHA-384, and SHA-512. I did not enable support for SHA-1 checksums in this PR.
  • Do we want to make the struct returned by download() and download_and_extract() less strict about having a sha256 field? If we allowed the returned field to depend on the integrity list, then users of SHA-384 and SHA-512 checksums wouldn't need to re-checksum the downloaded file.

@jmillikin
Copy link
Contributor Author

def _repo_test(ctx):
    downloaded = ctx.download(
            url = ["https://github.com/bazelbuild/bazel/releases/download/0.21.0/bazel-0.21.0-darwin-x86_64.sig"],
            output = "test-download.bin",   
            integrity = [
                "sha256-YTmQnwlYCwr3Er6ellav7BH4VTgHi9hbe5vvOfA0P00=",
                "sha384-uIEYsgRb01KvTxnqnmbAlEwedx4VBiZlGlAgjL6Qb+K2h4CzDZbo6FRjA5xKqdDr",
                "sha512-/+EGy2d+zm/rB6H5FUEzisWW6dZ+N64/8Vrl/W/h0u8oI8nrHO/wvCguUgsp3H9ZKSS13cjk1JhhU0Cfn0a6DA==",
            ],
    )
    print("downloaded: ", downloaded)

    ctx.file("WORKSPACE", "workspace(name = {name})\n".format(name = repr(ctx.name)))
    ctx.file("BUILD.bazel", "")

repo_test = repository_rule(_repo_test)
$ ~/src/bazel/bazel-bin/src/bazel query @repo_test//:*
INFO: Invocation ID: 1e019ff7-7544-44e4-bf81-62382c261a54
DEBUG: /home/john/src/testing-bazel-integrity/repo.bzl:9:5: downloaded:  struct(sha256 = "6139909f09580b0af712be9e9656afec11f85538078bd85b7b9bef39f0343f4d", sha512 = "ffe106cb677ece6feb07a1f91541338ac596e9d67e37ae3ff15ae5fd6fe1d2ef2823c9eb1ceff0bc282e520b29dc7f592924b5ddc8e4d4986153409f9f46ba0c")
@repo_test//:BUILD.bazel
Loading: 1 packages loaded

@irengrig irengrig added the team-ExternalDeps External dependency handling, remote repositiories, WORKSPACE file. label Jan 22, 2019
@irengrig
Copy link
Contributor

/cc @aehlig

@jin
Copy link
Member

jin commented Feb 19, 2019

Ping @dslomov, PTAL

@dslomov dslomov requested a review from aehlig February 25, 2019 15:41
@jmillikin-stripe
Copy link
Contributor

@dslomov Have you had time to review this yet?

@dslomov
Copy link
Contributor

dslomov commented Mar 8, 2019

ping @aehlig

@dslomov
Copy link
Contributor

dslomov commented Mar 11, 2019

General idea and the API shape sounds good to me.
The return struct should probably have an integrity field, as well as sha256, sha512 fields (as appropriate)

@jmillikin
Copy link
Contributor Author

I ran into a bunch of Git conflicts when trying to rebase this to current master. The changes to plumb checksum types through the HTTP layer have been extracted to #8714; this PR will become only the changes to repository_ctx.download().

@jmillikin jmillikin force-pushed the jmillikin_download-subresource-integrity branch 7 times, most recently from 6907711 to 2e28f5e Compare June 28, 2019 12:35
@jmillikin
Copy link
Contributor Author

🎉 Rebase complete, tests added, and all checks green.

I had to back off on one of the new features, multiple checksums, because the semantics were too hard to reason about. In browsers the download is allowed if any checksum matches, and I'm not 100% sure that's the right behavior for Bazel.

bazel-io pushed a commit that referenced this pull request Jul 2, 2019
This is extracted from #7208 ("Support subresource integrity format for `download()` checksums.")

I'm trying to revive that PR and rebase to master, but the replacement of `String sha256` through the HTTP downloader code is causing a lot of Git conflicts. I expect that portion of the change to require less discussion than the proposed content integrity changes, so I'm moving it to a separate PR to simplify review.

@dslomov or @aehlig, could one of you take a look? Once this is merged, I will rebase PR 7208 onto it.

Closes #8714.

PiperOrigin-RevId: 256188026
@jmillikin jmillikin force-pushed the jmillikin_download-subresource-integrity branch from 2e28f5e to 4e5aba1 Compare July 2, 2019 21:24
@jmillikin
Copy link
Contributor Author

PR #8714 has merged, and rebase onto it was successful.

siberex pushed a commit to siberex/bazel that referenced this pull request Jul 4, 2019
This is extracted from bazelbuild#7208 ("Support subresource integrity format for `download()` checksums.")

I'm trying to revive that PR and rebase to master, but the replacement of `String sha256` through the HTTP downloader code is causing a lot of Git conflicts. I expect that portion of the change to require less discussion than the proposed content integrity changes, so I'm moving it to a separate PR to simplify review.

@dslomov or @aehlig, could one of you take a look? Once this is merged, I will rebase PR 7208 onto it.

Closes bazelbuild#8714.

PiperOrigin-RevId: 256188026
@jmillikin-stripe
Copy link
Contributor

Gentle ping

irengrig pushed a commit to irengrig/bazel that referenced this pull request Jul 15, 2019
This is extracted from bazelbuild#7208 ("Support subresource integrity format for `download()` checksums.")

I'm trying to revive that PR and rebase to master, but the replacement of `String sha256` through the HTTP downloader code is causing a lot of Git conflicts. I expect that portion of the change to require less discussion than the proposed content integrity changes, so I'm moving it to a separate PR to simplify review.

@dslomov or @aehlig, could one of you take a look? Once this is merged, I will rebase PR 7208 onto it.

Closes bazelbuild#8714.

PiperOrigin-RevId: 256188026
@jmillikin
Copy link
Contributor Author

@aehlig @dslomov Have you had time to review this yet?

Copy link
Contributor

@aehlig aehlig left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks. Will import.

@bazel-io bazel-io closed this in 7108c77 Jul 17, 2019
@jmillikin jmillikin deleted the jmillikin_download-subresource-integrity branch July 19, 2019 05:59
@philwo philwo added the team-OSS Issues for the Bazel OSS team: installation, release processBazel packaging, website label Jun 15, 2020
@alexeagle
Copy link
Contributor

packages uploaded to the npm registry using version 4 or earlier of the npm CLI are published with a SHA-1 hash. Is there a particular reason you didn't include that @jmillikin ? I'll send a PR to add it, so that Bazel can download everything that appears in an npm package-lock.json

@jmillikin-stripe
Copy link
Contributor

Bazel doesn't currently support SHA1 as a download checksum, and I didn't want to change that as part of my PR. I think it would be fine to send a separate PR to enable SHA1 if that would help you.

You might also be able to use Yarn, which supports recalculating checksums when writing its lockfile. We use that functionality in our builds so that every package has at least a SHA256 checksum.

@alexeagle
Copy link
Contributor

Thanks for your reply. I opened #12777 to add SHA-1 in this spot.

The feature we are working on for rules_nodejs would have bazel download the tarballs rather than npm or yarn, so we can't rewrite the lockfile (it wouldn't have the bytes on disk yet to calculate a new hash anyway).

@philwo philwo removed the team-OSS Issues for the Bazel OSS team: installation, release processBazel packaging, website label Nov 29, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla: yes team-ExternalDeps External dependency handling, remote repositiories, WORKSPACE file.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

FR: more flexible checksum algorithms for repository download rules
9 participants