Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

nixos: use structured attrs for assertions/warnings #342372

Open
wants to merge 12 commits into
base: master
Choose a base branch
from

Conversation

h7x4
Copy link
Member

@h7x4 h7x4 commented Sep 16, 2024

Description of changes

This PR reworks the assertions module to take a freeform tree-like structure for assertions and warnings. This has one major advantage over the existing list structure, which is the ability for a downstream user to forcefully disable an assertion or warning they believe to be erroneous without needing to resort to any tricky hacks. It might also open up for ideas where we could add more metadata or properties to the assertions/warnings, like different types of warnings or whether the message should be lazily evaluated. It will also let you extract the tree structure of non-firing warnings/assertions for whatever kinds of research purposes you might need it for. I don't suspect that an entire manual appendix of warnings/assertions is particularly useful, but maybe the data is useful for some?

Example

Here is an example of the old and new way to write assertions and warnings:

# Old
{
  warnings = lib.mkIf config.services.foo.bar [
    "Some example warning"
  ];
  
  assertions = [{
    assertion = config.services.foo.baz;
    message = ''
      You forgot to enable baz.
    '';
  }];
}

# New
{
  warnings.services.foo.beCarefulWithBar = {
    # ⬐ New option for storing the boolean that triggers the warning
    condition = config.services.foo.bar;
    message = "Some example warning";
  ];
  
  assertions.services.foo.makeSureToEnableBaz = {
    assertion = config.services.foo.baz;
    message = ''
      You forgot to enable baz.
    '';
  };
}

If the user believes either of these to be an error in nixpkgs, they should of course make an issue first, but in the meanwhile they can do this:

{
  warnings.services.foo.beCarefulWithBar.enable = false;
  assertions.services.foo.makeSureToEnableBaz.enable = false;
}

Backwards compatibility

The module takes any existing assertions/warnings that are declared as lists, and coerces them to the new format under {assertions,warnings}.legacy.<sha256-sum of message string>. While this makes them a bit harder to disable, it ensures a smooth transition. While assertions and warnings are marked with internal = true, I believe a lot of people are using them for out-of-tree modules. I have explicitly opted out of adding a note to the release docs, but I have changed the nixos manual.

Other things worth mentioning, and maybe discussing

  • This PR depends on lib: add lib.attrsets.collect' #342330

  • I've modified some of the core modules and library functions to use the new scheme, in addition to a few modules with a large set of assertions/warnings as a proof of concept. Most assertions have just been placed in the tree, but some needed slight modifications. Some assertions and warnings that earlier were aggregated have been split up to be separate assertions.

  • There is obviously no established scheme for the naming or namespacing of assertions and warnings. I've tried naming them roughly to reflect what they are warning about and what they are trying to assert, but I'm very open for changes and/or standardisation of some kind of scheme.

  • Should we consider making these options non-internal, considering widespread use outside nixpkgs?

  • I'm not sure about the evaluation speed impact of all the coercion stuff I've added. It doesn't seem much slower on my PC, but I haven't made any measurements.

  • Due to typechecking of the assertion/warning submodule, the message strings will now become non-lazy. In one way, that's a good thing because we get to typecheck the messages. On another hand, there are some message strings that depend on their boolean error condition to make any sense (e.g if the assertion asserts that a setting should not be set, the message might refer to the content of that setting without any regard for what happens if the setting is not there). The warning mkIfs handle this properly, but the assertions might not. I'm not sure whether we should try to hunt these down, or make typechecking opt-in, or how to go about this? Building all the nixos tests is a good start I suppose. This is not the case anymore. Lazyness is on by default for legacy assertions, and opt-out for new ones.

  • There might be a case where identical assertions that were earlier part of the assertion list can have differing boolean values while still getting the same hash. I believe this could trigger an eval error where the same option is set to differing values? I'm not sure if we should handle those cases as they come and move towards the new format, or try to handle it with something like lib.types.boolByOr This is not relevant as long as assertions is using lib.imap0 for type coercion.

Testing

I've only done basic testing, opening a nix repl and evaluated attributes as follows:

# Using traceValSeqN for some reason has the sideeffect of pretty printing,
# in contrast to some of the other trace functions 
nix-repl> lib.traceValSeqN 40 legacyPackages.x86_64-linux.nixosTests._3proxy.config.nodes.peer0.assertions

... # Too much here, it's the entire expanded tree of all assertions

{
  boot = { ... };
  console = { ... };
  documentation = { ... };
  environment = { ... };
  fileSystems = { ... };
  fonts = { ... };
  hardware = { ... };
  i18n = { ... };
  krb5 = { ... };
  legacy = { ... };
  nesting = { ... };
  networking = { ... };
  nix = { ... };
  programs = { ... };
  security = { ... };
  services = { ... };
  sound = { ... };
  stubby = { ... };
  systemd = { ... };
  users = { ... };
  virtualisation = { ... };
  xdg = { ... };
  zramSwap = { ... };
}

nix-repl> lib.traceValSeqN 40 legacyPackages.x86_64-linux.nixosTests._3proxy.config.nodes.peer0.warning

... # Expanded tree here as well

{
  boot = { ... };
  docker-containers = { ... };
  documentation = { ... };
  environment = { ... };
  fonts = { ... };
  hardware = { ... };
  i18n = { ... };
  jobs = { ... };
  legacy = { ... };
  networking = { ... };
  nix = { ... };
  programs = { ... };
  qt5 = { ... };
  security = { ... };
  services = { ... };
  snapraid = { ... };
  system = { ... };
  systemd = { ... };
  unifi-poller = { ... };
  users = { ... };
  virtualisation = { ... };
}

Things done

  • Built on platform(s)
    • x86_64-linux
    • aarch64-linux
    • x86_64-darwin
    • aarch64-darwin
  • For non-Linux: Is sandboxing enabled in nix.conf? (See Nix manual)
    • sandbox = relaxed
    • sandbox = true
  • Tested, as applicable:
  • Tested compilation of all packages that depend on this change using nix-shell -p nixpkgs-review --run "nixpkgs-review rev HEAD". Note: all changes have to be committed, also see nixpkgs-review usage
  • Tested basic functionality of all binary files (usually in ./result/bin/)
  • 24.11 Release Notes (or backporting 23.11 and 24.05 Release notes)
    • (Package updates) Added a release notes entry if the change is major or breaking
    • (Module updates) Added a release notes entry if the change is significant
    • (Module addition) Added a release notes entry if adding a new NixOS module
  • Fits CONTRIBUTING.md.

Add a 👍 reaction to pull requests you find important.

nixos/modules/misc/assertions.nix Outdated Show resolved Hide resolved
Comment on lines +34 to +55
# This might be replaced when tagged submodules or similar are available,
# see https://github.com/NixOS/nixpkgs/pull/254790
checkedAssertionItemType = let
check = x: x ? assertion && x ? message;
in lib.types.addCheck assertionItemType check;
Copy link
Member Author

@h7x4 h7x4 Sep 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a limitation here, where the submodule won't be accepted unless it is in the form of an attrset. Function submodules and submodules by path will not be accepted by the check, and ignored. Not sure if there is much to do about it? I don't think anyone will write either of the following anytime soon:

{
  assertions.a.b.c = { ... }: {
    assertion = true;
    message = "....";
  };
  
  assertions.x.y.z = ./myAssertion.nix;
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You don't need an extra config fixpoint, or options introspection, or anything like that, so a record would be a better type constructor for this, instead of submodule. It does the thing you need, and basically no redundant or confusing features.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm currently using both config and options for the lazyness toggle in a submodule fixpoint though. I'll have a look at it, but I'm not sure how well it would work with the current solution.

nixos/modules/system/activation/top-level.nix Outdated Show resolved Hide resolved
@h7x4 h7x4 force-pushed the nixos-structured-assertions-warnings branch from cab0fe9 to b01fa5a Compare September 17, 2024 09:00
@roberth
Copy link
Member

roberth commented Sep 17, 2024

Can't review right now, but I already have some (incomplete) thoughts:

  • If you haven't already, please read / be aware of [Reverted] Module-builtin assertions, disabling assertions and submodule assertions #97023
    It was reverted because triggering the checks at the right time is tricky. I don't think this PR changes that, which is good, but it may have other design aspects that could be interesting.
  • Disabling whole trees of checks mkIf- or enable-style (or similar) will be good for laziness / performance and module DX
  • We may consider not to translate existing checks into the new structure, but just trigger two independent check systems
  • I've said checks to generalize assertions and warnings just now, but maybe we want to reserve that term for derivations that check things (for example, checks as in flakes, or system.checks in NixOS)

@h7x4 h7x4 marked this pull request as draft September 17, 2024 10:06
@h7x4
Copy link
Member Author

h7x4 commented Sep 21, 2024

ofborg seems to fail due to an assertion message that is now being evaluated eagerly. It is building nixos-vm, which causes the failure.

$ nix-instantiate --arg nixpkgs ./. ./pkgs/top-level/release.nix -A unstable

...

… while evaluating definitions from `/home/h7x4/nixpkgs/nixos/modules/misc/nixpkgs.nix':

… while evaluating the option `nixpkgs.localSystem':

… while evaluating the option `nixpkgs.system':

nixosExpectedSystem (due to config.nixpkgs.localSystem.config) here causes the failure.

# nixos/modules/misc/nixpkgs.nix:354
let
  nixosExpectedSystem =
    if config.nixpkgs.crossSystem != null
    then config.nixpkgs.crossSystem.system or (lib.systems.parse.doubleFromSystem (lib.systems.parse.mkSystemFromString config.nixpkgs.crossSystem.config))
    else config.nixpkgs.localSystem.system or (lib.systems.parse.doubleFromSystem (lib.systems.parse.mkSystemFromString config.nixpkgs.localSystem.config));
  nixosOption =
    if config.nixpkgs.crossSystem != null
    then "nixpkgs.crossSystem"
    else "nixpkgs.localSystem";
  pkgsSystem = finalPkgs.stdenv.targetPlatform.system;
in {
  assertion = constructedByMe -> !hasPlatform -> nixosExpectedSystem == pkgsSystem;
  message = "The NixOS nixpkgs.pkgs option was set to a Nixpkgs invocation that compiles to target system ${pkgsSystem} but NixOS was configured for system ${nixosExpectedSystem} via NixOS option ${nixosOption}. The NixOS system settings must match the Nixpkgs target system.";
}

Eventually leading to triggering the throw in nixpkgs.system

# nixos/modules/misc/nixpkgs.nix:283
system = lib.mkOption {
  type = lib.types.str;
  example = "i686-linux";
  default =
    if opt.hostPlatform.isDefined
    then
      throw ''
        Neither ${opt.system} nor any other option in nixpkgs.* is meant
        to be read by modules and configurations.
        Use pkgs.stdenv.hostPlatform instead.
      ''
    else
      throw ''
        Neither ${opt.hostPlatform} nor the legacy option ${opt.system} has been set.
        You can set ${opt.hostPlatform} in hardware-configuration.nix by re-running
        a recent version of nixos-generate-config.
        The option ${opt.system} is still fully supported for NixOS 22.05 interoperability,
        but will be deprecated in the future, so we recommend to set ${opt.hostPlatform}.
      '';
  defaultText = lib.literalMD ''
    Traditionally `builtins.currentSystem`, but unset when invoking NixOS through `lib.nixosSystem`.
  '';
  ...
};

Sorry to ping you about this @roberth, but I could really use some help to decide how to go about this. I currently see 3 possible solutions:

  • Rewrite the assertion message or somehow should rewrite nixosExpectedSystem to not refer to nixpkgs.system (not sure where to start, or whether this even inherently makes sense)
  • Add a lazy bool option to assertion/warning to allow for disabling type checking/eval of message
  • Disable this part of the eval test or somehow modify it to set nixpkgs.system so it doesn't throw (however, I'm not sure what implications this has, but it doesn't seem very desirable).

I've verified that the command (and then probably also ofborg) evaluates completely if I remove ${nixosExpectedSystem} from the message.

# This might be replaced when tagged submodules or similar are available,
# see https://github.com/NixOS/nixpkgs/pull/254790
checkedAssertionItemType = let
check = x: x ? assertion && x ? message;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe removing check is sufficient to make it lazy again?

The unfortunate thing about strictness is that it is not lazy, unless you're checking something that was going to be evaluated anyway, such as most check that are done in the merge function (though for lists or attrsets, merge can still evaluate too much if not careful).

@roberth
Copy link
Member

roberth commented Sep 22, 2024

I see that you're trying to do a good job with thorough checking, which as a fan of good DX and static checking, I can very much relate to. However, the module system is very different from most other languages or configuration systems, by virtue of its fairly unique architecture, doing runtime checks in a lazy language. This has profound implications for the implementation trade-offs.

The module system is very sensitive when it comes to the specifics of its behavior, especially regarding strictness (laziness). It is very important for it to be as lazy as possible, so that it avoids running into errors that shouldn't be triggered, or infinite recursions.

We also need to care for users who write their own modules with assertions and warnings. By forcing them to migrate their code immediately, you put them in a position where they have to choose whether to support NixOS stable or unstable. This isn't great for end users, but particularly problematic for maintainers who need their module to work on both releases. For this reason, we can't accept an immediate breaking change in the assertions/warnings type. However, this can be resolved by translating list-based definitions to attributes like "anon-${i}".
For this you need something like coercedTo but using imap instead of map in the merge function.

  • Rewrite the assertion message or somehow should rewrite nixosExpectedSystem to not refer to nixpkgs.system (not sure where to start, or whether this even inherently makes sense)

As said, we must not regress in laziness, so this should be solved in the module system; not the code that uses the module system.

  • Add a lazy bool option to assertion/warning to allow for disabling type checking/eval of message

Lazy and unchecked are fast. The module system will check the message because it's a submodule anyway; just later than you envisaged. This will make it run a bit faster, and a bit more reliably.
Ideally infrequently used values, like the message, are checked in a static way, such as with a test. Anything that's important is worth testing.

  • Disable this part of the eval test or somehow modify it to set nixpkgs.system so it doesn't throw (however, I'm not sure what implications this has, but it doesn't seem very desirable).

That module already needs to balance complicated compatibility and error reporting at the cost of basically no simplicity and very little room to spare in terms of making literally anything more strict. I would be happy to review or discuss changes that improve the end user experience of that module, which I believe is still possible, but I feel like this may be the wrong reason.
I don't know if you've studied it in detail; if not, it's probably best to that module untouched and solve the problem in the module system itself.

I hope my earlier suggestion helps to make it work again.

@h7x4
Copy link
Member Author

h7x4 commented Oct 3, 2024

Okay, so I've had another go at it. Turns out that the message was evaluated in 3 places:

  1. While being typechecked by the module system

This was expected, mitigated by adding the lazy option, which disables typechecking and preserves lazyness. This is not needed for warnings, because all legacy warnings already are behind a mkIf or similar. I've added a note recommending people not to enable it unless absolutely necessary, so we can typecheck as much as possible, as well as a warning stating that you should make sure to verify the message yourself if you do end up enabling it. I would count the nixpkgs.config assertion as a location where it's necessary.

All legacy options are lazy by default to ensure we don't break any out-of-tree modules.

  1. During coercion while being hashed before being added to assertions.legacy

This is the most problematic part. We can't hash the message without evaling the message. Currently, I've switched to using the lib.imap0 (i: ... "anon-${toString i}") strategy, but the indices will shift everytime someone adds/removes an anonymous assertion and so you cannot disable an assertion in a stable manner. I'm tempted to make assertions.legacy into a list and handle it separately, but that might just add more convoluted coercion handling with little added value. We could also run through all legacy assertions, check if the user has disabled them and warn them that this is unstable?

I am also not sure how much I should strive to make this common with warnings. For warnings, there shouldn't be any problem with using the hashString approach, and it would make for a stable interface to let users disable legacy warnings. Is this property more important than similarity to legacy assertions?

  1. While running the collect' before displaying warnings in nixos/modules/system/activation/top-level.nix

The collect' would further recurse into the attrset if the condition didn't trigger, and accidentally evaluate the message. I've made a custom version of collect' that has a stop condition when it finds something that looks like an assertion but is marked as lazy. The use for this function seems fairly niche, so I haven't added it to lib.attrsets, but I'd be happy to do so if someone asks. warnings still uses lib.collect'

@h7x4 h7x4 force-pushed the nixos-structured-assertions-warnings branch from 8a86a2b to 6fc2a83 Compare October 8, 2024 10:21
@h7x4 h7x4 marked this pull request as ready for review October 8, 2024 10:21
@h7x4 h7x4 requested a review from roberth October 8, 2024 10:30
warnings.networking.bonds.deprecatedOptions = lib.mapAttrs (bondName: bond:
lib.mergeAttrsList (map (optName: {
${optName} = {
condition = (cfg.bonds.${bondName}.${optName} or null) != null;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
condition = (cfg.bonds.${bondName}.${optName} or null) != null;
condition = (cfg.bond.${optName} or null) != null;

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually after 7 years it's probably time to remove these legacy options and not waste any more time on them.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you suggesting I remove the change, or do I just delete the deprecation warnings altogether?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants