-
-
Notifications
You must be signed in to change notification settings - Fork 631
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add new version of LockfileMetadata
to support checking for identically specified requirements
#12782
Add new version of LockfileMetadata
to support checking for identically specified requirements
#12782
Conversation
metadata = LockfileMetadata.new( | ||
req.requirements_hex_digest, | ||
req.interpreter_constraints, | ||
{Requirement.parse(i) for i in req.requirements}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not absolutely convinced that this is the best place to put this parsing logic. It may be worth moving it over to new
at some point, but for now, this works.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can fail and lead to confusing error messages. We probably want to catch those and enrich the error, but not clear to me the best way to do that. Mind adding a TODO(#12314)
please? (No need to fix now)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding an error a TODO in here. As mentioned on the other location where I'm using .parse
, the exception that gets raised doesn't play nice with mypy, and that's going to get in the way of improving this. I'm going to return to this in the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, Chris.
As discussed in #12683, I originally thought we might not want to put version numbers in the lockfile metadata, even it makes total sense to do internally. My main thought is it might suggest we support more versions than the current and latest - but I realize now that's not a good reason. It's valid for Pants 2.11 to only support lockfile versions 3 and 4, for example. And I think that is more clear. So, I think I now agree with your decision to surface version
.
If we do add version
, I think we want to update 2.7 to set version = 1
before the stable release goes out. We still have time to make that breaking change. Could you please split out this PR into that prework—i.e. not do the "compatible" requirements change yet—so we can cherry-pick?
# Rust tests and lints will be skipped. Delete if not intended. [ci skip-rust] # Building wheels and fs_util will be skipped. Delete if not intended. [ci skip-build-wheels]
# Rust tests and lints will be skipped. Delete if not intended. [ci skip-rust] # Building wheels and fs_util will be skipped. Delete if not intended. [ci skip-build-wheels]
# Rust tests and lints will be skipped. Delete if not intended. [ci skip-rust] # Building wheels and fs_util will be skipped. Delete if not intended. [ci skip-build-wheels]
# Rust tests and lints will be skipped. Delete if not intended. [ci skip-rust] # Building wheels and fs_util will be skipped. Delete if not intended. [ci skip-build-wheels]
# Rust tests and lints will be skipped. Delete if not intended. [ci skip-rust] # Building wheels and fs_util will be skipped. Delete if not intended. [ci skip-build-wheels]
# Rust tests and lints will be skipped. Delete if not intended. [ci skip-rust] # Building wheels and fs_util will be skipped. Delete if not intended. [ci skip-build-wheels]
This permits creating new versions of `LockfileMetadata` with different data requirements with minimal changes to the external API. Generally speaking, end-users should interact with `LockfileMetadata` and not the concrete subclasses. [ci skip-build-wheels] [ci skip-rust]
[ci skip-rust] [ci skip-build-wheels]
[ci skip-rust] [ci skip-build-wheels]
# Rust tests and lints will be skipped. Delete if not intended. [ci skip-rust] # Building wheels and fs_util will be skipped. Delete if not intended. [ci skip-build-wheels]
93bbf5b
to
1ea583a
Compare
Updated description to refer to the fact that the original version is now called "V1" rather than "V0", and the former V1 is now V2. |
# Rust tests and lints will be skipped. Delete if not intended. [ci skip-rust] # Building wheels and fs_util will be skipped. Delete if not intended. [ci skip-build-wheels]
…ible-requirements-only [ci skip-rust] [ci skip-build-wheels]
# Rust tests and lints will be skipped. Delete if not intended. [ci skip-rust] # Building wheels and fs_util will be skipped. Delete if not intended. [ci skip-build-wheels]
# Rust tests and lints will be skipped. Delete if not intended. [ci skip-rust] # Building wheels and fs_util will be skipped. Delete if not intended. [ci skip-build-wheels]
# Rust tests and lints will be skipped. Delete if not intended. [ci skip-rust] # Building wheels and fs_util will be skipped. Delete if not intended. [ci skip-build-wheels]
@@ -612,6 +612,7 @@ def _validate_metadata( | |||
requirements.lockfile_hex_digest, | |||
request.interpreter_constraints, | |||
python_setup.interpreter_universe, | |||
{}, # TODO(chrisjrn): include requirements strings here |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At the moment I'm not passing the actual requirements strings in, because we're only validating for tool lockfiles. This will change once we start using user lockfiles (which themselves send down requirements strings)
Thanks Chris! Before I look more closely at the code, I wanted to revisit the metadata format discussion. We've talked in the past about how the digest is only useful for tool lockfiles, whereas the list of requirements is only necessary for user lockfiles (but maybe still insightful for tool lockfiles?) Now that you've landed the This means that we could do a scheme like: // Tool lockfile
{
"invalidation_hash": "abc",
"interpreter_constraints": ["==3.9.*"],
"version": 2
}
// User lockfile
{
"requirements": ["foo"],
"interpreter_constraints": ["==3.9.*"],
"version": 2
} Or, to be more verbose: // Tool lockfile
{
"invalidation_hash": "abc",
"requirements": null,
"interpreter_constraints": ["==3.9.*"],
"version": 2
}
// User lockfile
{
"invalidation_hash": null,
"requirements": ["foo"],
"interpreter_constraints": ["==3.9.*"],
"version": 2
} Or, your format where you set all the metadata even if it's not applicable. With this discussion, it's worth considering how we'll handle What are your thoughts on the format? I personally think I prefer the simplest: only put metadata for what the caller actually needs. Keep our Pants boilerplate to a minimum. |
…eing more verbose # Rust tests and lints will be skipped. Delete if not intended. [ci skip-rust] # Building wheels and fs_util will be skipped. Delete if not intended. [ci skip-build-wheels]
My previous suggestion on the original ticket was to include a I think there are two pertinent questions here:
|
Makes sense. At that point, I suggest getting rid of the digest. It's no longer necessary - you can do exact string comparisons for tool lockfiles, whereas compatible check for user lockfiles. The digest maybe improves perf, but feels like a micro-optimization and we should go with a simpler lockfile format. Wdyt? |
That's a fair point. If we do go with that approach, then we're going to need to add a new field to specify whether we want an exact requirements match, or whether we want to allow only compatible requirements. I considered whether Having either the digest or the requirements strings makes it clear which behaviour is going to be applied; having only strings is going to require that behaviour flag on the side. Also, did you have any thoughts about platforms vs ICs? |
…ents strings instead # Rust tests and lints will be skipped. Delete if not intended. [ci skip-rust] # Building wheels and fs_util will be skipped. Delete if not intended. [ci skip-build-wheels]
# Rust tests and lints will be skipped. Delete if not intended. [ci skip-rust] # Building wheels and fs_util will be skipped. Delete if not intended. [ci skip-build-wheels]
For paper trail purposes -- we've decided to remove the |
# Rust tests and lints will be skipped. Delete if not intended. [ci skip-rust] # Building wheels and fs_util will be skipped. Delete if not intended. [ci skip-build-wheels]
…ible-requirements-only # Conflicts: # 3rdparty/python/lockfiles/user_reqs.txt
@@ -160,6 +161,7 @@ def pex_requirements( | |||
importlib.resources.read_binary(*self.default_lockfile_resource), | |||
), | |||
lockfile_hex_digest=hex_digest, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
n.b. We still calculate the hex digest while V1 lockfiles are still supported.
# Rust tests and lints will be skipped. Delete if not intended. [ci skip-rust] # Building wheels and fs_util will be skipped. Delete if not intended. [ci skip-build-wheels]
…r requirements mismatches # Rust tests and lints will be skipped. Delete if not intended. [ci skip-rust] # Building wheels and fs_util will be skipped. Delete if not intended. [ci skip-build-wheels]
# Rust tests and lints will be skipped. Delete if not intended. [ci skip-rust] # Building wheels and fs_util will be skipped. Delete if not intended. [ci skip-build-wheels]
LockfileMetadata
to support checking for compatible requirementsLockfileMetadata
to support checking for ~compatible~ identically specified requirements
LockfileMetadata
to support checking for ~compatible~ identically specified requirementsLockfileMetadata
to support checking for identically specified requirements
@Eric-Arellano This implements the lockfile metadata per our discussion yesterday and is probably ready for review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! Could you please update the PR description to explain how we reached this decision about removing the hash and using a list of requirements?
@@ -0,0 +1,142 @@ | |||
# Copyright 2021 Pants project contributors (see CONTRIBUTORS.md). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Leftover?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wait I'm confused still - what is this being used for?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I could swear I deleted this. FIXING.
@@ -527,10 +530,17 @@ def _validate_metadata( | |||
python_setup: PythonSetup, | |||
) -> None: | |||
|
|||
req_strings = ( | |||
{Requirement.parse(i) for i in requirements.req_strings} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ditto that this can fail, and please adding a TODO to improve the error message.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The error that comes up here is pkg_resources.extern.packaging.requirements.InvalidRequirement
, which mypy balks at:
src/python/pants/backend/python/util_rules/pex.py:19:1: error: Skipping
analyzing "pkg_resources.extern.packaging.requirements": found module but no
type hints or library stubs [import]
I'll add the TODO, but it may end up taking some work to actually improve this error message.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, we handle it like this:
pants/src/python/pants/backend/python/target_types.py
Lines 807 to 824 in fd71177
def parse_requirements_file(content: str, *, rel_path: str) -> Iterator[Requirement]: | |
"""Parse all `Requirement` objects from a requirements.txt-style file. | |
This will safely ignore any options starting with `--` and will ignore comments. Any pip-style | |
VCS requirements will fail, with a helpful error message describing how to use PEP 440. | |
""" | |
for i, line in enumerate(content.splitlines()): | |
line = line.strip() | |
if not line or line.startswith("#") or line.startswith("-"): | |
continue | |
try: | |
yield Requirement.parse(line) | |
except Exception as e: | |
raise ValueError( | |
format_invalid_requirement_string_error( | |
line, e, description_of_origin=f"{rel_path} at line {i + 1}" | |
) | |
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, so we're just handling the generic exception? That makes sense.
src/python/pants/backend/python/util_rules/lockfile_metadata.py
Outdated
Show resolved
Hide resolved
src/python/pants/backend/python/util_rules/lockfile_metadata.py
Outdated
Show resolved
Hide resolved
src/python/pants/backend/python/util_rules/lockfile_metadata.py
Outdated
Show resolved
Hide resolved
src/python/pants/backend/python/util_rules/lockfile_metadata.py
Outdated
Show resolved
Hide resolved
This static method should be used in place of the `LockfileMetadata` constructor. | ||
""" | ||
|
||
return LockfileMetadataV2(valid_for_interpreter_constraints, requirements) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a little confused on this part. Why is it hardcoded to V2?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
new
should produce the current version of the metadata header, but this means that we only have to update which constructor is called once.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To clarify: This keeps the API to places that need to create a new LockfileMetadata
object reasonably consistent. We don't want to support creating older versions, but we do want to support deserialising them until the deprecation process is complete. Having the new
method in place means the factory method has a consistent name, and consumers will get directed to new parameters that need to be added by the type checker. I suspect it's a lower support burden than trying to keep track of which version of the class is the current one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh! Okay that last paragraph makes way more sense to me. Thanks for explaining. Could you please add that explanation to the docstring: when writing lockfiles, we always use the latest metadata version, even if we support reading older versions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, I guess you do already say "the most recent version". I'm trying to figure out why this was still confusing me. Maybeee something like this:
Create an instance the most up-to-date subclass of
LockfileMetadata
.Note that we always use the most recent lockfile version when writing lockfiles, even if we support reading older versions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This static method should be used in place of the `LockfileMetadata` constructor. This gives calling sites a predictable method to call to construct a new `LockfileMetadata` for writing, while still allowing us to support _reading_ older, deprecated metadata versions.
# Rust tests and lints will be skipped. Delete if not intended. [ci skip-rust] # Building wheels and fs_util will be skipped. Delete if not intended. [ci skip-build-wheels]
…quirements` # Rust tests and lints will be skipped. Delete if not intended. [ci skip-rust] # Building wheels and fs_util will be skipped. Delete if not intended. [ci skip-build-wheels]
…ible-requirements-only # Conflicts: # 3rdparty/python/lockfiles/user_reqs.txt # Building wheels and fs_util will be skipped. Delete if not intended. [ci skip-build-wheels]
# Rust tests and lints will be skipped. Delete if not intended. [ci skip-rust] # Building wheels and fs_util will be skipped. Delete if not intended. [ci skip-build-wheels]
@@ -4,10 +4,14 @@ | |||
# | |||
# --- BEGIN PANTS LOCKFILE METADATA: DO NOT EDIT OR REMOVE --- | |||
# { | |||
# "version": 1, | |||
# "requirements_invalidation_digest": "815457a1baf6226c993e5468ccdf64c69fe7214d3d9237911c762733e0130526", | |||
# "version": 2, | |||
# "valid_for_interpreter_constraints": [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This name has grown on me btw :) especially when paired with generated_with_requirements
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The only reason why naming things is a hard problem in computer science is because people insist on being concise 😆
This static method should be used in place of the `LockfileMetadata` constructor. | ||
""" | ||
|
||
return LockfileMetadataV2(valid_for_interpreter_constraints, requirements) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, I guess you do already say "the most recent version". I'm trying to figure out why this was still confusing me. Maybeee something like this:
Create an instance the most up-to-date subclass of
LockfileMetadata
.Note that we always use the most recent lockfile version when writing lockfiles, even if we support reading older versions.
) -> _T: | ||
"""Construct a `LockfileMetadata` subclass from the supplied JSON dict. | ||
|
||
*** Not implemented. Subclasses should override. *** |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, imo better is to set @abstractmethod
and subclass ABC
? MyPy doesn't like the interaction of ABC and dataclasses - if it complains, use # type: ignore[error-code]
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's complaining, and it breaks the type checking in subclasses too. That's a bit too much complaining for me to be comfortable with suppressing.
def _header_dict(self) -> dict[Any, Any]: | ||
"""Produce a dictionary to be serialized into the lockfile header. | ||
|
||
Subclasses should call `super` and update the resulting dictionary |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Subclasses should call `super` and update the resulting dictionary | |
Subclasses should call `super` and update the resulting dictionary. |
@@ -0,0 +1,142 @@ | |||
# Copyright 2021 Pants project contributors (see CONTRIBUTORS.md). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wait I'm confused still - what is this being used for?
# Rust tests and lints will be skipped. Delete if not intended. [ci skip-rust] # Building wheels and fs_util will be skipped. Delete if not intended. [ci skip-build-wheels]
Thanks! Note that we will not cherry-pick this to Pants 2.7 @chrisjrn |
This factors out versioning capabilities into
LockfileMetadata
so that it's possible to easily change the set of validation requirements for a lockfile. V1 represents the original lockfile version (where constraints and an invalidation digest are set). V2 allows for the old behaviour, but also allows specifying the input requirements for a lockfile, and verifying that the user requirements area non-strict subset ofidentical to the input requirements.We decided to replace the requirements hex digest with requirements strings to allow us to test whether the lockfile produces a compatible environment rather than an identical environment, which will be useful for user lockfile support when we eventually enable that. In the meantime, tool lockfiles still test for an identical environment, but the extra data in the lockfile will allow for more fine-grained error messages in a future version.
The implementations of
_from_json_dict
andis_valid_for
are a bit repetitive; I can factor out the common behaviour with a bit of work, but given we expect to delete the V1 implementation before too long.Currently this does not add the
platforms
capability to the header, but now it's going to be easy enough to bump the version number if we want to add more fields.Closes #12610