Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

lib.path.difference: init #209375

Closed
wants to merge 9 commits into from
Closed

Conversation

infinisil
Copy link
Member

@infinisil infinisil commented Jan 6, 2023

Description of changes

Calculates the difference between a set of paths, meaning the common ancestor and the individual subpaths of them. Example:

path.difference { x = /a/b/c/d; y = /a/b/u/v; }

-> { commonPrefix = /a/b; suffix = { x = "./c/d"; y = "./u/v"; }; }

Notably this implementation doesn't rely on calling toString on paths, which makes the implementation rather tricky. This implementation also ensures that the given paths have a matching filesystem root, otherwise it throws an error. The lazy trees PR is the only known case where this could happen:

let
  lazyPath = (fetchTree { url = ./.; type = "git"; }).outPath;
in (import ./. {}).lib.path.difference {
  x = ./.;
  y = lazyPath;
}
error: lib.path.difference: Path x = /home/tweagysil/antithesis/nixpkgs (root /) has a different filesystem
  root than path y = /home/tweagysil/antithesis/nixpkgs/ (root /home/tweagysil/antithesis/nixpkgs/

Originally implemented in #200718. Relates to other work in the path library effort.

This work is sponsored by Antithesis

Things done
  • Documentation
  • Tests

@roberth
Copy link
Member

roberth commented Jan 8, 2023

The lazy trees PR is the only known case where this could happen:

How about a store path and a non-store path? I would usually consider these to have a sufficiently different meaning that / as a common ancestor doesn't make sense. However, forcing this restriction here may make the function less useful. It seems that the caller should validate ancestor themselves most of the time.

(not suggesting a change, but it may be relevant for docs)

lib/path/default.nix Outdated Show resolved Hide resolved
@infinisil
Copy link
Member Author

@roberth store paths can't be path value types (since those don't allow string context, but all store paths should have string context to be valid), so I don't think it's a concern here, since this function throws an error if you're passing anything other than a path value.

Copy link
Contributor

@fricklerhandwerk fricklerhandwerk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, so far. As always, notes on naming and wording.

lib/path/default.nix Outdated Show resolved Hide resolved
Comment on lines +110 to +118
# Takes a Nix path value and deconstructs it into the filesystem root
# (generally `/`) and a subpath
deconstructPath =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# Takes a Nix path value and deconstructs it into the filesystem root
# (generally `/`) and a subpath
deconstructPath =
# Takes a Nix path and splits it into the filesystem root (generally `/`) and a subpath
splitPath =

Simpler, shorter name?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

splitPath would also be a name for something like strings.split "/"

splitPathFromRoot?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not just splitting it from the root but also splitting the path itself. Ultimately this is an internal variable name, it doesn't matter much as long as we don't expose it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It sure matters less than an exported name, but for readability (= maintainability) it's relevant nonetheless. The suggestion is merely to (slightly) reduce cognitive load of reading a long word with many syllables. It arguably eases readability by only a small amount, but we have a long leverage on the future here. In any case, I'm not pushing this one at all, just trying to keep things at the amazing level of quality that you set out with.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rather not be held to this standard all the time, yes I'm a perfectionist, but at some level it's just not worth it. There will always be things that aren't immediately obvious from just the name, in which case programmers can read comments, the source code, run the code themselves, etc.

But it's also entirely subjective which name is best. In this case, I think deconstructPath is still the best name, and me being the original author of the code should give that opinion some weight.

In particular, I also think deconstructPath being non-obvious is even a good thing, because what the function does is in fact non-obvious and non-trivial, I'd rather people look at the definition and comment instead of assuming they know what it means.

lib/path/default.nix Outdated Show resolved Hide resolved
lib/path/default.nix Outdated Show resolved Hide resolved
lib/path/default.nix Outdated Show resolved Hide resolved
(isAttrs paths)
"lib.path.difference: The given argument is of type ${typeOf paths}, but an attribute set was expected";
{
inherit commonAncestor subpaths;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
inherit commonAncestor subpaths;
inherit commonPrefix suffix;

I still think this is the appropriate naming. Paths are sequences, they can have a prefix and a suffix. Directories have ancestors, but directories are trees. It's a completely different thing.

I actually prefer prefix for brevity even if it's less clear what it refers to, but I'm fine with making it explicit by calling it commonPrefix. I suggest indeed using singular suffix, because that will read nicely at the call site:

let
  paths = lib.path.difference { foo = /qux/foo; bar = /qux/bar; };
in paths.suffix.foo 

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's discuss this in #210423 (comment)

lib/path/default.nix Outdated Show resolved Hide resolved
lib/path/default.nix Outdated Show resolved Hide resolved
lib/path/default.nix Outdated Show resolved Hide resolved
lib/path/default.nix Outdated Show resolved Hide resolved
@infinisil infinisil marked this pull request as ready for review March 14, 2023 22:12
Comment on lines +199 to +202
- The _longest_ common prefix is returned

forall paths, result = difference paths.
! exists longerPrefix. hasProperPrefix result.commonPrefix longerPrefix && all (hasPrefix longerPrefix) (attrValues paths)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that this PR weakly depends on #210423 because of the docs here. The deconstructPath function is also duplicated/reusable between these two PRs

Copy link
Contributor

@fricklerhandwerk fricklerhandwerk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's such a joy to read your code.

lib/path/default.nix Outdated Show resolved Hide resolved
infinisil added a commit to tweag/nixpkgs that referenced this pull request Mar 22, 2023
@infinisil
Copy link
Member Author

While working on other parts I'm noticing the need for a way to only get the commonPrefix, without requiring to specify attribute names. So I'm thinking of also having a function like

lib.path.commonPrefix :: [ Path ] -> Path

The meat of the code could be shared between the two functions, more thought needed

@infinisil
Copy link
Member Author

Or alternatively, have a version of difference that supports lists

@fricklerhandwerk
Copy link
Contributor

I remember we discussed this in the design phase, and both a list-based difference as well as that one here can be trivial wrappers around the original one.

So trivial, I'm not sure the second one is even needed.

@infinisil
Copy link
Member Author

It's non-trivial to use an attribute-set based difference if you have a list of paths and only need commonPrefix:

(difference (listToAttrs (imap0 (index: path: nameValuePair (toString index) path) list))).commonPrefix

@fricklerhandwerk
Copy link
Contributor

Yeah right, I agree we should add a list-based wrapper for that. But another wrapper just for accessing an attribute from the return value might not be worth it.

@infinisil infinisil marked this pull request as draft June 13, 2023 18:44
@github-actions github-actions bot added the 6.topic: lib The Nixpkgs function library label Jul 19, 2023
@infinisil
Copy link
Member Author

I'll close this for now, while the function is neat, there aren't any motivating use cases for needing this as far as I can see, and the the design of it doesn't seem trivial.

@infinisil infinisil closed this Aug 14, 2023
@infinisil infinisil deleted the lib.path.difference branch August 14, 2023 19:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants