Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add [python-setup].disable_mixed_interpreter_constraints for repositories not using the interpreter_constraints field #12495

Closed

Conversation

Eric-Arellano
Copy link
Contributor

@Eric-Arellano Eric-Arellano commented Aug 4, 2021

The interpreter_constraints field allows your repo to support multiple distinct ICs, e.g. some Py2 only code and Py3 only code. Pants has code to then robustly support this mixing, e.g. partitioning Flake8 and Bandit based on interpreter constraints. That's a really neat feature, but not used by many orgs.

At the same time, we have the pants.backend.python.mixed_interpreter_constraints backend already, which adds the py-constraints goal.

This PR is prework to start requiring activating pants.backend.python.mixed_interpreter_constraints for the interpreter_constraints feature to be registered. Two main reasons:

  1. A good default is to not use mixed interpreter constraints.
    • It's a best practice to use a single Python version for your whole project. Keeps things consistent and simple.
    • With interpreter_constraints activated by default, it's easy for an engineer to accidentally use this advanced feature, resulting in functionality the build engineers may have intentionally not wanted to use.
  2. Regardless of how much we optimize the multiple-IC code paths, fundamentally, we are doing more work than is necessary.

Deprecation plan

  1. (In a followup PR), deprecate using interpreter_constraints field if the backend pants.backend.python.mixed_interpreter_constraints is not activated.
  2. (Here), add --disable-mixed-interpreter-constraints to allow getting the performance boosts right away.

--disable-mixed-interpreter-constraints is temporary and will be removed once we switch the interpreter_constraints field to only be registered if the backend is activated.

[ci skip-build-wheels]

… performance in repos with a single Python constraint

# Building wheels and fs_util will be skipped. Delete if not intended.
[ci skip-build-wheels]
# Rust tests and lints will be skipped. Delete if not intended.
[ci skip-rust]

# Building wheels and fs_util will be skipped. Delete if not intended.
[ci skip-build-wheels]
# Rust tests and lints will be skipped. Delete if not intended.
[ci skip-rust]

# Building wheels and fs_util will be skipped. Delete if not intended.
[ci skip-build-wheels]
@Eric-Arellano Eric-Arellano changed the title [WIP] Add [python-setup].disable_mixed_interpreter_constraints to improve performance in repos with a single Python constraint Add [python-setup].disable_mixed_interpreter_constraints to improve performance in repos with a single Python constraint Aug 4, 2021
@Eric-Arellano Eric-Arellano marked this pull request as ready for review August 4, 2021 17:40
# Rust tests and lints will be skipped. Delete if not intended.
[ci skip-rust]

# Building wheels and fs_util will be skipped. Delete if not intended.
[ci skip-build-wheels]
# Rust tests and lints will be skipped. Delete if not intended.
[ci skip-rust]

# Building wheels and fs_util will be skipped. Delete if not intended.
[ci skip-build-wheels]
@stuhood
Copy link
Sponsor Member

stuhood commented Aug 5, 2021

As discussed in the thread, I think that this is a lower priority than fixing the actual performance issue, especially if they're both going to require a deprecation cycle.

Eric-Arellano added a commit that referenced this pull request Aug 5, 2021
…es` (#12498)

I've been struggling to get the MyPy tests working properly in #12495. This is a prefactor to make the codes more readable by using the preferred `.write_files()` API.

[ci skip-rust]
[ci skip-build-wheels]
# Rust tests and lints will be skipped. Delete if not intended.
[ci skip-rust]

# Building wheels and fs_util will be skipped. Delete if not intended.
[ci skip-build-wheels]
@Eric-Arellano
Copy link
Contributor Author

I think that this is a lower priority than fixing the actual performance issue

To fully fix all of the performance issues—particularly how slow MyPy is to partition because it has to look at transitive closures—we need to fix #11270, which is blocked on #11269. This is all before even considering lockfiles, I'm purely talking about status quo perf of MyPy.

Certainly, we should fix these underlying performance issues. But in the meantime, a) we can get real wins for users now, and b) even with the best optimizations possible, we would still be doing wasted work for a niche feature most orgs don't use (mixed ICs). I do think this is valuable to land now.

Fwit, I'm not turning on this option for pantsbuild.pants so that we still feel the pain of these performance issues and are more motivated to fix them.

@stuhood
Copy link
Sponsor Member

stuhood commented Aug 5, 2021

To fully fix all of the performance issues—particularly how slow MyPy is to partition because it has to look at transitive closures—we need to fix #11270, which is blocked on #11269. This is all before even considering lockfiles, I'm purely talking about status quo perf of MyPy.

The proposal I made in the thread makes the transitive lookups per-root unnecessary: you'd only do a transitive lookup after partitioning, per-partition. So it makes #11270 much less necessary.

Eric-Arellano added a commit that referenced this pull request Aug 6, 2021
)

### Shading

These tools all run with user requirements. As before, we combine the "tool pex" with the user requirements pex via `--pex-path`.

I thought that tool lockfiles would cause issues that it's likely for the tool requirements to conflict with the user requirements. But turns out, this has always been the case in Python land. Before, the tool would still have pinned deps in its PEX file, only, those pins might float over time without a lockfile. This PR changes nothing in terms of the need for shading (#9206).

### Performance concerns

As with Flake8, Bandit, and Setuptools, we must consult the codebase to determine which interpreter constraints the tool will be run with. This does have a performance hit, especially for generating Pytest's lockfile because we must consider the transitive deps of each `python_tests` target. 

There are four proposals floating to improve the performance. All are compatible with the others:

1. When generating the lockfile via `./pants tool-lock`, continue to inspect the whole repo. But when checking at call sites if the lockfile is stale, only inspect the code in question: create what the `PythonToolLockfile` would be if it were just for that code, and ensure the ICs are compatible w/ the checked-in file.
2. Stop treating the IC field as the constraints for the source itself, and start treating it as constraints for the whole transitive closure. This would allow us to look only at the target roots, w/o needing to resolve transitive dependencies.
3. Allow repos to opt-out of mixed interpreter constraints and only set ICs via `[python-setup].interpreter_constraints` (#12495). This has no impact if you are using mixed ICs, but greatly speeds things up for other repos not using the mechanism.
4. Add an ignore option to our check for stale lockfiles. Users could risk ignoring for desktop builds, and then error in CI.

For now, this PR provides a common denominator.
@Eric-Arellano Eric-Arellano changed the title Add [python-setup].disable_mixed_interpreter_constraints to improve performance in repos with a single Python constraint Add [python-setup].disable_mixed_interpreter_constraints for repositories not using the interpreter_constraints field Aug 19, 2021
@Eric-Arellano
Copy link
Contributor Author

I've updated the PR title and description to no longer focus exclusively on performance. Still relevant, but even if perf were not considered, I think activating interpreter_constraints by default is a mistake. PTAL @stuhood @benjyw (and cc @chrisjrn)

@Eric-Arellano
Copy link
Contributor Author

Also, I'm okay with not landing this PR in particular if we decide the temporary option isn't worth the churn. The more important change I care about is no longer activating interpreter_constraints field by default. This PR is only meant to facilitate the transition, but it's not necessary.

@benjyw
Copy link
Sponsor Contributor

benjyw commented Aug 19, 2021

To chime in, I think on balance that it's a good idea to reduce niche complexity unless needed. As long as we properly document how to use Pants with mixed interpreters.

@Eric-Arellano
Copy link
Contributor Author

As long as we properly document how to use Pants with mixed interpreters.

Agreed. Which will be nothing more than moving the snippet Tip: activate pants.backend.python.mixed_interpreter_constraints to now be in the first paragraph of the section Using multiple Python versions in the same project. https://www.pantsbuild.org/docs/python-interpreter-compatibility#using-multiple-python-versions-in-the-same-project.

It will be really easy for users to activate this niche feature: it's a one-line config change.

@stuhood
Copy link
Sponsor Member

stuhood commented Aug 19, 2021

To chime in, I think on balance that it's a good idea to reduce niche complexity unless needed. As long as we properly document how to use Pants with mixed interpreters.

Doesn't this do the opposite thing? To allow for disabling a feature (which we have other pathways to make faster), we're increasing complexity by 200+ lines, and adding more steps for developers.

@Eric-Arellano
Copy link
Contributor Author

Eric-Arellano commented Aug 19, 2021

we're increasing complexity by 200+ lines, and adding more steps for developers.

Bump on the part below. If I could go back, I would not have put up this PR first and I would have gone straight for the PR to deprecate using interpreter_constraints w/o activating mixed_interpreter_constraints backend.

Also, I'm okay with not landing this PR in particular if we decide the temporary option isn't worth the churn. The more important change I care about is no longer activating interpreter_constraints field by default. This PR is only meant to facilitate the transition, but it's not necessary.

Does that make sense? This PR's 200+ lines and new option are in no way necessary for the proposed change.

@benjyw
Copy link
Sponsor Contributor

benjyw commented Aug 19, 2021

Ah, I meant complexity for end users, not for us.

@stuhood
Copy link
Sponsor Member

stuhood commented Aug 19, 2021

Does that make sense? This PR's 200+ lines and new option are in no way necessary for the proposed change.

It doesn't make sense. Most of the new lines here are if statements of the form: if python_setup.disable_mixed_interpreter_constraints... even if those are replaced with if $mixed_interpreter_constraints_backend_enabled, it's extra complexity.

And the new option I'm referring to is the enabling of the backend itself: you have to figure out that you need to enable the backend to get the feature (and we'd likely need a pointer if someone tries using the field).

Ah, I meant complexity for end users, not for us.

This is more complexity for end users, and for us. For example: on the docsite (and probably also via an assist in the code) we will need to call out that the interpreter constraints field only exists if you've enabled a particular backend, and how to do that.

@Eric-Arellano
Copy link
Contributor Author

For example: on the docsite (and probably also via an assist in the code) we will need to call out that the interpreter constraints field only exists if you've enabled a particular backend, and how to do that.

Sure, although that is already something we need to solve with #12523. For protobuf_library, the python_source_root field is only added if you activate pants.backend.codegen.protobuf.python, which we'll need to highlight as soon as we add pants.backend.codegen.protobuf.java for example. Once that feature is implemented for plugin fields generically, I think that concern isn't a big deal.

you have to figure out that you need to enable the backend to get the feature

I expect the main entry point will be a codebase admin deciding if they want this niche feature while reading https://www.pantsbuild.org/docs/python-interpreter-compatibility#using-multiple-python-versions-in-the-same-project. Once decided, others don't need to think about it. It's one time and done by an admin.

(and we'd likely need a pointer if someone tries using the field).

This is the same for using any of our backends like Flake8. Liam already added an error message linking to https://www.pantsbuild.org/docs/enabling-backends#available-backends when you trying using an unrecognized symbol, which users would get when using the interpreter_constraints field if not activated.

And we can continue to improve discoverability with ./pants help backends.

This is more complexity for end users

Counterpoint is that we're imo removing complexity for the organizations that don't want to use this advanced and niche feature. It won't show up in ./pants help anymore and it won't be something an engineer unintentionally uses because the org wanted the rigor of only having single constraints for the whole project.

@benjyw
Copy link
Sponsor Contributor

benjyw commented Aug 21, 2021

Does that make sense? This PR's 200+ lines and new option are in no way necessary for the proposed change.

It doesn't make sense. Most of the new lines here are if statements of the form: if python_setup.disable_mixed_interpreter_constraints... even if those are replaced with if $mixed_interpreter_constraints_backend_enabled, it's extra complexity.

And the new option I'm referring to is the enabling of the backend itself: you have to figure out that you need to enable the backend to get the feature (and we'd likely need a pointer if someone tries using the field).

Ah, I meant complexity for end users, not for us.

This is more complexity for end users, and for us. For example: on the docsite (and probably also via an assist in the code) we will need to call out that the interpreter constraints field only exists if you've enabled a particular backend, and how to do that.

The docs question is interesting, but isn't it the case that we need to do this sort of thing in general?

@Eric-Arellano
Copy link
Contributor Author

Closing in favor of #12652.

@Eric-Arellano Eric-Arellano deleted the disable-mixed-ics branch August 25, 2021 20:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants