Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fetchurl: accept SRI in hex and base32 #199660

Draft
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

vcunat
Copy link
Member

@vcunat vcunat commented Nov 5, 2022

Proof of concept stemming from NixOS/rfcs#131

Problem statement:

  • single attribute name for fetch* hashes would be nice; the currently most common are sha256 and hash
  • copy&paste of hash from upstream seems attractive (e.g. https-secured release announcements)
  • SRI hashes (hash attribute on nixpkgs level) only support base64 encoding
  • upstreams don't use base64 encoding commonly
  • implementing on nixpkgs level allows immediate usage; with nix we'd have to wait at least a year, I think

- only implemented for sha256 and sha512
- applies also to fetchFrom* as those wrap fetchurl
Just to allow testing the PoC, at least the base16 case,
as that's commonly used by upstreams.
@nbraud
Copy link
Contributor

nbraud commented Sep 16, 2023

SRI is an actual standard, with a precise definition, and silently diverging from it would be problematic IMO for anyone:

  • writing tooling around nix (like maintainer scripts, autoupdaters etc.) ;
  • working on supply-chain integrity, for instance wishing to check hashes in nixpkgs correspond to what is expected, or ;
  • otherwise trying to contribute to nixpkgs, who'd need to learn about yet-another custom (and undocumented) encoding and which fetchers support it or not.

That's already a significant problem with the legacy “base32” encoding; I legitimately do not understand the motivation for (re)creating more of those issues, that will stay in nixpkgs for at least another decade.
Sure, it would allow derivations to use hash without doing an hex-to-base64 conversion, but in doing so we lose most of the advantages of hash using a well-defined standard.

@vcunat
Copy link
Member Author

vcunat commented Sep 16, 2023

base16 hashes seem way more widespread in practice than base64/SRI, so I personally view SRI as a "less standard" approach.

But I'm glad I got at least some feedback on this PR.

@nbraud
Copy link
Contributor

nbraud commented Sep 16, 2023

base16 hashes seem way more widespread in practice than base64/SRI, so I personally view SRI as a "less standard" approach.

I'm afraid I do not follow:

  1. base 16 being a documented (and widely-used) standard, does not make the proposal in this PR a standard
  2. said proposal is to replace SRI as the format hash accepts, with a bespoke superset of that (i.e. the hash's name followed by any of a base16, “nix32,” or base64 encoding)

Moreover, that doesn't really address any of the issues I raised? Anyone needing to deal with hash now has to know it's not actually SRI on some fetchers, has to write (a lot more) custom code if they expect to parse those strings, etc.

If you are trying to argue hash should only use "${hashName}-${hexEncoding}, that's not what this PR implements, and would break backwards-compatibility.

But I'm glad I got at least some feedback on this PR.

You are welcome.

@nbraud
Copy link
Contributor

nbraud commented Sep 16, 2023

PS: In case that wasn't clear, I don't care all that much what encoding hash uses as long as it is:

  • precisely-defined,
  • adopted outside of Nix,
  • not tied to specific hash functions.

Using SRI fulfills all those (so it doesn't have the issues I raised above) and is what hash already uses.

@vcunat
Copy link
Member Author

vcunat commented Sep 16, 2023

The tooling argument: as the original value is base16 typically, you either need to complicate the tools that generate nix expressions (conversion to base64; including humans copy&pasting so far) or those that consume the nix expressions (accept more than base64). I don't see why the first would be clearly superior.

I thought it's better to have an option to keep the same format as what is published elsewhere (typically). That option has always been here through attributes like sha256; this PR allows to exercise it additionally with the hash attribute by trivial textual transformation, as I understand that uniformity in attribute names can be useful (we had issues with overriding).

Of course it's about opinions, to some extent 🤷🏽

@nbraud
Copy link
Contributor

nbraud commented Sep 16, 2023

The tooling argument: as the original value is base16 typically, you either need to complicate the tools that generate nix expressions (conversion to base64; including humans copy&pasting so far) or those that consume the nix expressions (accept more than base64). I don't see why the first would be clearly superior.

In all cases, tools that consume some upstream manifest (be it checksum files, indexes in language-specific package managers' repositories, etc.) must extract the hash, but they know (by definition) what formats they are dealing with.

If hash remains SRI, tools that consume nix expressions also only have a single format to care about. With this proposal, such tools have to know about 3 different formats, one of which is undocumented.

That's a lot more complexity; to put a concrete quantity on it, reliably recognising and decoding all 3 formats took me 100+ lines of Python (and a couple migraines...), and that's not a particularly verbose language. On the other hand, the conversion from a standard format (hex or base64) is pretty trivial; even in a shell script, that's just a call to nix-hash.

I thought it's better to have an option to keep the same format as what is published elsewhere (typically).

I honestly don't know that this matters so much:

  • when updating a derivation manually, people are already either:
    • putting in lib.fakeHash (or equivalent) and getting the hash on first (pre)fetch; then, it's easiest if we do have one uniform format;
    • selecting the right hash(es) by hand and pasting it in: in that case, pasting it into a nix-hash call, or having a quick editor macro to convert, doesn't seem a lot extra work;
  • if the process is (semi-)automated, the conversion becomes essentially a non-issue.

However, having a uniform format makes having automation at all, a lot easier (speaking as someone who just burnt a lot of her remaining sanity to successfully parse hashes from nixpkgs) and I believe that's what we should aim for, in terms of minimizing developer toil.

The same argument holds, I think, for people wanting to verify the correct hashes are in nixpkgs, but even moreso :
that kind of supply-chain integrity work only yields actual security benefits at scale (as in, one needs to check most, if not all, of the derivations they pull in) so it must be automated anyhow.

That option has always been here through attributes like sha256; this PR allows to exercise it additionally with the hash attribute by trivial textual transformation, as I understand that uniformity in attribute names can be useful (we had issues with overriding).

Oh, OK. That wasn't evident to me at all that this was what the proposal addresses.

I agree it would make sense to (eventually) deprecate sha256 and similar attributes, but (as explained above) I don't think making more-complex formats (that all tooling needs to support in perpetuity) is a good way to do that.

@nbraud
Copy link
Contributor

nbraud commented Sep 16, 2023

PS: One possibly-unobvious goal, in making a maintainer script and PRs to progressively move drvs to hash, is discover the shortcomings in current tooling.

For instance, one of the firefox packagers pointed out their automated updater doesn't emit SRI-format hashes, so I split out the change into #255518 and submitted a patch that makes it do so.

@vcunat
Copy link
Member Author

vcunat commented Sep 17, 2023

nix-prefetch-url and others emit base32 by default. An the nix command is officially still experimental. IMO it's good to start with such places, as base32 is certainly the worst part here. (Ideally making the nix command officially stable, at least some parts of it.)

@nbraud
Copy link
Contributor

nbraud commented Sep 19, 2023

nix-prefetch-url and others emit base32 by default. [...] IMO it's good to start with such places, as base32 is certainly the worst part here.

Agreed! I've started a list of which tools (still) use the nix32 encoding as their default (or only) option, and I plan to submit PRs eventually.

Probably not in the very-near future though: I'm not home for some days, and may be quite busy when I come back.

An the nix command is officially still experimental. [...] (Ideally making the nix command officially stable, at least some parts of it.)

nix hash is indeed not stabilized (see NixOS/nix#8876) but nix-hash has been for a while, as far as I can tell?

@vcunat
Copy link
Member Author

vcunat commented Sep 19, 2023

Yes, but I suspect that it's "stabilized" to the point that changing the default output format won't be accepted. Cross-ref: NixOS/nix#7690

@nbraud
Copy link
Contributor

nbraud commented Sep 19, 2023

Yes, but I suspect that it's "stabilized" to the point that changing the default output format won't be accepted. Cross-ref: NixOS/nix#7690

Yes, that PR did implement support for SRI hashes in nix-hash, which is why I mentioned it as a quick way to do conversions to/from SRI.

I wasn't necessarily referring to it specifically, in the context of updating tools' defaults, though I guess it could make sense, esp. if it's called by nix-prefetch-url etc.
Worst case, I submit a PR, discussion happen, and it's rejected.

@wegank wegank added 2.status: stale https://github.com/NixOS/nixpkgs/blob/master/.github/STALE-BOT.md 2.status: merge conflict labels Mar 19, 2024
@stale stale bot removed the 2.status: stale https://github.com/NixOS/nixpkgs/blob/master/.github/STALE-BOT.md label Mar 20, 2024
@wegank wegank added the 2.status: stale https://github.com/NixOS/nixpkgs/blob/master/.github/STALE-BOT.md label Jul 4, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants