Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pants should shade Python tool binaries. #9206

Open
jsirois opened this issue Feb 28, 2020 · 3 comments
Open

Pants should shade Python tool binaries. #9206

jsirois opened this issue Feb 28, 2020 · 3 comments
Labels
backend: Python Python backend-related issues enhancement

Comments

@jsirois
Copy link
Contributor

jsirois commented Feb 28, 2020

We have all the same problems with running python tools against python code that we did with running jvm tools against jvm code.

@jsirois
Copy link
Contributor Author

jsirois commented Feb 28, 2020

Pex already uses shading technology for vendoring its own dependencies into the pex distribution and shipped PEX files (see: https://github.com/pantsbuild/pex/blob/master/pex/vendor/README.md). We should be able to leverage some version of that, perhaps breaking out a pantsbuild or repo for the library / tool for Pants and Pex to share.

Eric-Arellano added a commit that referenced this issue Aug 6, 2021
)

### Shading

These tools all run with user requirements. As before, we combine the "tool pex" with the user requirements pex via `--pex-path`.

I thought that tool lockfiles would cause issues that it's likely for the tool requirements to conflict with the user requirements. But turns out, this has always been the case in Python land. Before, the tool would still have pinned deps in its PEX file, only, those pins might float over time without a lockfile. This PR changes nothing in terms of the need for shading (#9206).

### Performance concerns

As with Flake8, Bandit, and Setuptools, we must consult the codebase to determine which interpreter constraints the tool will be run with. This does have a performance hit, especially for generating Pytest's lockfile because we must consider the transitive deps of each `python_tests` target. 

There are four proposals floating to improve the performance. All are compatible with the others:

1. When generating the lockfile via `./pants tool-lock`, continue to inspect the whole repo. But when checking at call sites if the lockfile is stale, only inspect the code in question: create what the `PythonToolLockfile` would be if it were just for that code, and ensure the ICs are compatible w/ the checked-in file.
2. Stop treating the IC field as the constraints for the source itself, and start treating it as constraints for the whole transitive closure. This would allow us to look only at the target roots, w/o needing to resolve transitive dependencies.
3. Allow repos to opt-out of mixed interpreter constraints and only set ICs via `[python-setup].interpreter_constraints` (#12495). This has no impact if you are using mixed ICs, but greatly speeds things up for other repos not using the mechanism.
4. Add an ignore option to our check for stale lockfiles. Users could risk ignoring for desktop builds, and then error in CI.

For now, this PR provides a common denominator.
@benjyw benjyw removed the python label Sep 9, 2021
@cognifloyd cognifloyd added the backend: Python Python backend-related issues label Mar 13, 2023
@cognifloyd
Copy link
Member

What do you mean by "shade python tool binaries"?

We have all the same problems with running python tools against python code that we did with running jvm tools against jvm code.

I'm not familiar with the jvm problems. Could you summarize them please?

@benjyw
Copy link
Sponsor Contributor

benjyw commented Sep 16, 2023

@cognifloyd "shading a package" means putting it under some obfuscated parent package, e.g., foo.bar -> _obfuscated_1234.foo.bar and rewriting all the internal imports appropriately. If you control the callers (which we do in the tool case) then you invoke the package via the new location. If you don't, you leave just the entry points in the original locations.

The problem this solves is that often the tool code and the user code it runs against must coexist in the same runtime and thus share a namespace (the sys.path for Python, the classpath for JVM). So if the tool shares a dep with user code (or if the user code is the tool, imagine if mypy used Pants to build itself), and the two versions are incompatible, things may go boom. This is not a problem if the tool treats the code simply as text input (e.g., a simple linter), but is a problem when the user code must be classloaded in the same runtime as the tool.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend: Python Python backend-related issues enhancement
Projects
None yet
Development

No branches or pull requests

3 participants