Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

With Enum34 installed on python3.6+, pip picks up enum34 instead of the stdlib enum package #8214

Closed
abadger opened this issue May 9, 2020 · 34 comments · Fixed by #9689
Closed
Labels
state: needs discussion This needs some more discussion type: enhancement Improvements to functionality

Comments

@abadger
Copy link
Contributor

abadger commented May 9, 2020

Environment

  • pip version: pip 20.1
  • Python version: Python 3.6.10 [This was reported to me against python3.6 so I tested and fixed it there but it should affect later versions of python as well]
  • OS: Fedora 31

Description

When the enum34 package is installed on a system, pip will no longer install a project that uses pyproject.toml instead of setup.py. Instead, it tracebacks inside of the stdlib because pip is finding the enum34 package's version of enum.py instead of the stdlib version. This happens in the re-invocation of pip from pip/_internal/commands/install.pyhttps://github.com/abadger/pip/blob/master/src/pip/_internal/build_env.py#L171 although I think the bug should be fixed in __main__.py instead.

(edited: Put the correct file into the description)

The reason is that __main__.py has code which changes sys.path in case we are running pip from a wheel. That code appears to be changing sys.path in certain circumstances when we aren't running from a wheel by mistake. So that code is where the bug lies, not in install. It may be correct to change install.py as well but I am leaning towards not doing that... It looks like the code in install is making sure that it reinvokes itself by specifying the actual path to pip which it was invoked with. If we tried to use something different like python -m there, then we wouldn't be assured of executing the code from the same install of pip.

I'll add an easy reproducer using pip's repository in the reproducer section.

Fixed by #8213

Expected behavior

Even if enum34 is installed, pip should still be able to install other packages with a pyproject.toml file.

How to Reproduce

  1. python3.6 -m pip install --user enum34
  2. python3.6 -m pip install --user git+git://github.com/pypa/pip
  3. A traceback occurs:

Output


[pts/16@peru /srv/git/python-pip/pip/news]$ python3.6 -m pip install --user enum34
/usr/lib64/python3.6/enum.py
Requirement already satisfied: enum34 in /home/badger/.local/lib/python3.6/site-packages (1.1.10)
Could not build wheels for enum34, since package 'wheel' is not installed.
[pts/16@peru /srv/git/python-pip/pip/news]$ python3.6 -m pip --version
/usr/lib64/python3.6/enum.py
pip 20.1 from /home/badger/.local/lib/python3.6/site-packages/pip (python 3.6)
[pts/16@peru /srv/git/python-pip/pip/news]$ python3.6 --version *[fix-pip-to-work-in-the-face-of-enum34]  (08:47:21)
Python 3.6.10
[pts/16@peru /srv/git/python-pip/pip/news]$ python3.6 -m pip install --user git+git://github.com/pypa/pip
Collecting git+git://github.com/pypa/pip
  Cloning git://github.com/pypa/pip to /var/tmp/pip-req-build-edpkq5kv
  Running command git clone -q git://github.com/pypa/pip /var/tmp/pip-req-build-edpkq5kv
  Installing build dependencies ... error
  ERROR: Command errored out with exit status 1:
   command: /usr/bin/python3.6 /home/badger/.local/lib/python3.6/site-packages/pip install --ignore-installed --no-user --prefix /var/tmp/pip-build-env-aap_1xn0/overlay --no-warn-script-location --no-binary :none: --only-binary :none: -i https://pypi.org/simple -- setuptools wheel
       cwd: None
  Complete output (14 lines):
  Traceback (most recent call last):
    File "/usr/lib64/python3.6/runpy.py", line 193, in _run_module_as_main
      "__main__", mod_spec)
    File "/usr/lib64/python3.6/runpy.py", line 85, in _run_code
      exec(code, run_globals)
    File "/home/badger/.local/lib/python3.6/site-packages/pip/__main__.py", line 23, in <module>
      from pip._internal.cli.main import main as _main  # isort:skip # noqa
    File "/home/badger/.local/lib/python3.6/site-packages/pip/_internal/cli/main.py", line 5, in <module>
      import locale
    File "/usr/lib64/python3.6/locale.py", line 16, in <module>
      import re
    File "/usr/lib64/python3.6/re.py", line 142, in <module>
      class RegexFlag(enum.IntFlag):
  AttributeError: module 'enum' has no attribute 'IntFlag'
  ----------------------------------------
ERROR: Command errored out with exit status 1: /usr/bin/python3.6 /home/badger/.local/lib/python3.6/site-packages/pip install --ignore-installed --no-user --prefix /var/tmp/pip-build-env-aap_1xn0/overlay --no-warn-script-location --no-binary :none: --only-binary :none: -i https://pypi.org/simple -- setuptools wheel Check the logs for full command output.

@triage-new-issues triage-new-issues bot added the S: needs triage Issues/PRs that need to be triaged label May 9, 2020
abadger added a commit to abadger/pip that referenced this issue May 9, 2020
Because pip ships with an __main__.py file, one valid way to invoke it
is: python /usr/lib/python3.6/site-packages/pip/.  This results in
__package__ being the empty string   This, in turn, triggers the code to
add the pip wheel to sys.path.  Unfortunately, while this is fine to do
when it's restricted to a wheel just containing pip, it isn't okay to do
when it is site-packages as that will mean that other things in
site-packages could end up overriding things in the stdlib
(*cough*enum34*cough*).

Check file extension to limit when we end up adding the extra path to
sys.path.

Fixes pypa#8214
@pfmoore
Copy link
Member

pfmoore commented May 9, 2020

This happens in the re-invocation of pip from pip/_internal/commands/install.py

Can you provide a link to the relevant line of code? I can't find what you're referring to here 🙁

Also, I just tried this and it worked fine for me:

[root@696e6f345c66 /]# python3.6 -m pip install --user enum34
WARNING: Running pip install with root privileges is generally not a good idea. Try `python3.6 -m pip install --user` instead.
Collecting enum34
  Downloading https://files.pythonhosted.org/packages/63/f6/ccb1c83687756aeabbf3ca0f213508fcfb03883ff200d201b3a4c60cedcc/enum34-1.1.10-py3-none-any.whl
Installing collected packages: enum34
Successfully installed enum34-1.1.10
WARNING: You are using pip version 19.1.1, however version 20.1 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.
[root@696e6f345c66 /]# python3.6 -m pip install --user git+git://github.com/pypa/pip
WARNING: Running pip install with root privileges is generally not a good idea. Try `python3.6 -m pip install --user` instead.
Collecting git+git://github.com/pypa/pip
  Cloning git://github.com/pypa/pip to /tmp/pip-req-build-st28fks3
  Running command git clone -q git://github.com/pypa/pip /tmp/pip-req-build-st28fks3
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
    Preparing wheel metadata ... done
Building wheels for collected packages: pip
  Building wheel for pip (PEP 517) ... done
  Stored in directory: /tmp/pip-ephem-wheel-cache-fbj8jirm/wheels/1f/11/b0/5bd125f64cf9eedf300124c783996f30957e0d1b4fac0c38d3
Successfully built pip
Installing collected packages: pip
Successfully installed pip-20.2.dev0

@abadger
Copy link
Contributor Author

abadger commented May 9, 2020

@pfmoore It's this line here: https://github.com/abadger/pip/blob/master/src/pip/_internal/build_env.py#L171

/me looks through your output to see if he can spot what is different between our two environments.

@abadger
Copy link
Contributor Author

abadger commented May 9, 2020

pip_location is defined at the top of the file like this: from pip import __file__ as pip_location

@abadger
Copy link
Contributor Author

abadger commented May 9, 2020

Oh! I think I might know why you couldn't reproduce. Where is your initial pip located? My initial pip is also in ~/.local. That would come into play because the line in build_env.py is going to putresult in a directory relative to the pip installation you are running into sys.path. So if you were running the initial pip from /usr/lib/[...]/site-packages but installed enum34 into ~/.local/[...]/site-packages, then you wouldn't trigger the issue where enum from enum34 is used in preference to the one from the stdlib.

/me goes forth to verify this is the difference now that he's posted the theory here.

EDITED: My language was unclear iniitally. The code in build_env re-invokes pip as sys.executable /path/to/the/pip/install/directory/. This results in a directory relative to the pip installation to end up in the re-invoked pip's sys.path. However, the code in pip/__main__.py is what actually puts that directory into sys.path.

@abadger
Copy link
Contributor Author

abadger commented May 9, 2020

Yep, that's what it was. So if you want to reproduce from where you said it worked for you, you can simply run the last pip command again. Since you just installed pip from git into the user home dir, the second time the command runs, it will use that pip and then fail.

If you want to start over, here are all the steps I took with a container to reproduce (Leaving off output but adding comments)

# Host computer
podman run -it fedora /bin/bash
# Now inside the container
# Running ensurepip with --user puts pip into the same site-packages as enum34
# will be installed into
python3.8 -m ensurepip --user
python3.8 -m pip install --user enum34
python3.8 -m pip install --user git+git://github.com/pypa/pip
# Traceback from the initial bug report here

@abadger
Copy link
Contributor Author

abadger commented May 9, 2020

Oops, and I see why you couldn't find the line I was talking about earlier too... I posted the wrong file at the top of the issue. Sorry. I was greping the code for the command line strings to see if I could find any other occurrences that would have the same problem and I cut and pasted the wrong filename into the issue report.

@felixfontein
Copy link

In python-poetry/poetry#1122 this happened with Python 3.7.3.

@pfmoore
Copy link
Member

pfmoore commented May 9, 2020

So this is related to running pip for the isolated build environment. That makes sense, and was what I initially assumed you might be thinking about, but your reference to the wrong file put me off track. Thanks for clarifying.

So I think there probably is something to address here, but I'm not sure your proposed fix is the right approach. I still think that running pip as python /path/to/pip isn't something we should support - and if that means we need to consider another way of running pip in build_env.py then let's look at that.

@abadger
Copy link
Contributor Author

abadger commented May 9, 2020

. Yeah, that's fair. So there's several things I am thinking....

  • From the comment in the source code of __main__.py I interpret that the current code in __main__ is overzealous in what it adds. If you don't think the scope of that code should be decreased, what should the comment actually say? What use case in addition to a pip wheel is it meant to address?
  • It seems like there should be a different way to address this particular issue inside of build_env. The strand that has to be pulled apart for any other solution to work, though, is that it needs to load the correct library for pip without affecting the other libraries that might be pulled in. Python has all sorts of ways to invoke pip but half of them will end up affecting sys.path in a global manner. I think you could separate out the pip library that you want to use (turn it into a wheel or copy the directory structure to its own directory or etc.) Then place that onto PYTHONPATH before re-invoking pip using sys.executable -m That way only the library that you separated out would override everything else on the PYTHONPATH.

@abadger
Copy link
Contributor Author

abadger commented May 9, 2020

I've figured out another way to change build_env.py that looks like it works.... but I'm afraid it might be an undocumented implementation detail of the way python works so I'm not sure if it's safe to rely on. Instead of executing the directory where pip lives, execute pip's __main__.py file instead. Executing the directory will end up setting __package__ to the empty string. Executing the __main__.py file will set __package__ to None. __main__.py only adds to sys.path if __package__ is the empty string so it will work for this as well (Probably needs to have a comment if you go this route, though, so that everyone knows that the condition needs to be True for empty string and False for None).

If you want to go that route, it would look something like this:

diff --git a/src/pip/__main__.py b/src/pip/__main__.py
index a529758b..2e9ff7e7 100644
--- a/src/pip/__main__.py
+++ b/src/pip/__main__.py
@@ -12,16 +12,16 @@ if sys.path[0] in ('', os.getcwd()):
 
 # If we are running from a wheel, add the wheel to sys.path
 # This allows the usage python pip-*.whl/pip install pip-*.whl
+# Do not enter this codepath if __package__ is None as that
+# means that pip/__main__.py was invoked directly, which is what
+# build_env.py does (issue #8214)
 if __package__ == '':
     # __file__ is pip-*.whl/pip/__main__.py
     # first dirname call strips of '/__main__.py', second strips off '/pip'
 
diff --git a/src/pip/_internal/build_env.py b/src/pip/_internal/build_env.py
index b8f005f5..77e192f2 100644
--- a/src/pip/_internal/build_env.py
+++ b/src/pip/_internal/build_env.py
@@ -167,8 +167,9 @@ class BuildEnvironment(object):
         prefix.setup = True
         if not requirements:
             return
+        command = os.path.join(os.path.dirname(pip_location), '__main__.py')
         args = [
-            sys.executable, os.path.dirname(pip_location), 'install',
+            sys.executable, command, 'install',
             '--ignore-installed', '--no-user', '--prefix', prefix.path,
             '--no-warn-script-location',
         ]  # type: List[str]

@abadger
Copy link
Contributor Author

abadger commented May 9, 2020

Documentation on __package__ that I can find:

If you want to go this route, perhaps we should ping ncoghlan at that time to see if he considers the None/empty string split to be something that we can depend upon here.

@pradyunsg pradyunsg added state: needs discussion This needs some more discussion type: bugfix type: enhancement Improvements to functionality labels May 9, 2020
@triage-new-issues triage-new-issues bot removed the S: needs triage Issues/PRs that need to be triaged label May 9, 2020
@pradyunsg
Copy link
Member

pradyunsg commented May 9, 2020

I'd prefer that we switch our BuildEnvironment logic to be not dependent on the path/to/pip hack, instead of actively supporting python path/to/pip invocations in general.

@pradyunsg
Copy link
Member

pradyunsg commented May 9, 2020

Directly invoking __main__.py is perfectly reasonable!

abadger added a commit to abadger/pip that referenced this issue May 10, 2020
With the prior code, a python library installed into the same directory
as pip will be promoted to be found before other libraries of the same
name when an isolated build is performed.  This is because
pip/__main__.py adds the directory holding pip to the front of sys.path
where both the pip library and the other libraries will be found before
the other libraries.

This is especially problematic for libraries which shadow the names of
stdlib libraries.  The enum34 library, for instance, will shadow the
stdlib enum library.  When pip runs an isolated build and the enum34
library is installed, enum.py from the enum34 library will be found and
imported instead of enum.py from the stdlib.  That, in turn, breaks the
stdlib re module which then breaks pip.

This commit fixes that by changing build_env to invoke pip a different
way (by calling python on pip/__main__.py instead of calling python on
pip/).  Using __main__.py changes what python sets the value of
__package__ to which, in turn, makes it so the code in __main__.py
does not add the directory holding pip to sys.path.

Fixes pypa#8214
@abadger
Copy link
Contributor Author

abadger commented May 10, 2020

Okay, two of the three ways I could see this working are easy to implement so I implemented them in two separate PRs. I'm not going to implement turning the pip directory into a wheel and then running that in build_env because there's probably going to be design decisions that would require some back and forth with committers in order to come up with the right way to do it and I don't have the time to commit to that. Here's my summary of the three methods:

  • Only deal with wheels in __main__.py: Fix pip's machinery for creating an isolated build environment so that user installed packages do not override the stdlib #8213

    • Breaks the use case of running python path/to/pip/ in a checkout unless PYTHONPATH is set (ie: this will work: PYTHONPATH=$(pwd)/src/ python src/pip.
    • Makes the code in __main__.py conform to the comment which says that adding to the front of sys.path is to make wheel files work.
  • Separate the build_env invocation from the wheel and development invocation: Fix isolated builds when enum34 is installed #8217

    • Makes the code in build_env.py use a different method of invoking pip so that we can handle it differently (ie: not at all) in pip/__main__.py.
    • Might be depending on Python implementation details. Should ask ncoghlan, the PEP366 author whether we should be depending on empty string vs None if this is how we want to proceed.
    • Makes comments in __main__.py conform to expectations set forth in these PRs and bug reports.
    • python path/to/pip/ continues to work as well as it did before
    • EDIT: I've been thinking about this this week and, although I haven't tested it, I think this means that pip might not execute the same library version that as python path/to/pip/__main__.py is a part of. I think __main__.py might be treated as a script in this case (not a module whose __main__ is being executed) and therefore, python won't treat the directory that __main__.py is in specially for looking up additional paths. This will mean that when pip reinvokes itself, it could end up running a different version of pip.
  • EDIT (Added this solution: When adding the path to pip to sys.path, make sure we add the path after the stdlib paths: More intelligently insert the pip path into sys.path #8241

  • Build a pip wheel or copy of the pip dir for the isolated build: No sample implementation.

    • Might be a good fit for the model of an isolated build.
    • More intrusive than either of the other solutions.
    • Should be doable without breaking python path/to/pip/

Note that even though any one of these will fix this issue, none of them are mutually exclusive. ie: you could implement all of them if you felt that together, they made the code make more sense as a whole.

abadger added a commit to abadger/pip that referenced this issue May 10, 2020
With the prior code, a python library installed into the same directory
as pip will be promoted to be found before other libraries of the same
name when an isolated build is performed.  This is because
pip/__main__.py adds the directory holding pip to the front of sys.path
where both the pip library and the other libraries will be found before
the other libraries.

This is especially problematic for libraries which shadow the names of
stdlib libraries.  The enum34 library, for instance, will shadow the
stdlib enum library.  When pip runs an isolated build and the enum34
library is installed, enum.py from the enum34 library will be found and
imported instead of enum.py from the stdlib.  That, in turn, breaks the
stdlib re module which then breaks pip.

This commit fixes that by changing build_env to invoke pip a different
way (by calling python on pip/__main__.py instead of calling python on
pip/).  Using __main__.py changes what python sets the value of
__package__ to which, in turn, makes it so the code in __main__.py
does not add the directory holding pip to sys.path.

Fixes pypa#8214
@pfmoore
Copy link
Member

pfmoore commented May 10, 2020

Catching up on the comments, but I wanted to respond to this (from the other thread):

Paul seemed to be arguing that invoking pip like that from the CLI is not supported. That it's only supported from within BuildEnvironment right now. @pfmoore is that a method of invocation that you do want to support?

No, you misunderstood me badly. I do not think we should support python /path/to/pip as a valid invocation method. I think how build_env.py invokes pip should be changed (I see you've now submitted a PR doing this).

Maybe you misunderstood because I noted that python /path/to/pip.whl (note this is the wheel) is IIUC used by get-pip.py (I was still confused by your report at that point, and thought that was related to what you were saying). But that's still not supported outside of get-pip.py, and if it broke we'd look at changing get-pip.py, not making it supported.

@abadger
Copy link
Contributor Author

abadger commented May 10, 2020

Thanks for responding. Unless I'm misunderstanding you now, I think I understood you just fine. Here's what I said in the portion you just quoted with some emphasis added:

Paul seemed to be arguing that invoking pip like that from the CLI is not supported.

which seems to match what you just said, yes?

I do not think we should support python /path/to/pip as a valid invocation method.

I was asking for confirmation of that because @uranusjr was using a user invoking python /path/to/pip (directory) as his example of what did not work with #8213: #8213 (comment) If python /path/to/pip/ is not a supported use case, then the fact that python /path/to/pip stops working doesn't seem like it is a valid argument against #8213.

@pfmoore
Copy link
Member

pfmoore commented May 10, 2020

which seems to match what you just said, yes?

Yes it does. Sorry, this thread got massively confusing and I misread 🙁

If python /path/to/pip/ is not a supported use case, then the fact that python /path/to/pip stops working doesn't seem like it is a valid argument against #8213.

The various misinterpretations, incorrect statements, and general confusion has rendered this discussion pretty near to impossible to follow, TBH. Some of that is my fault, for which I apologise.

#8123 is entitled "fix when pip is invoked as python /path/to/pip", and I think most of us have been responding to the implication from that title, that it should be allowed for pip to be invoked that way.

In actual fact, the issue now appears to be something more like "Pip's machinery for creating an isolated build environment doesn't correctly ignore packages installed in the user's environment". And we seem to have jumped straight from there into debating solutions, rather than understanding what's actually going on here. For example, why is enum34 being picked up at all, given that the invocation you pointed to specifies --ignore-installed?

I'm going to ignore this thread for a few days, to give things a chance to settle down, and a clearer picture to emerge (in my own head, if nowhere else!) I also intend to ignore all of the proposed fixes, and focus solely on understanding what's going on with the issue when I come back to it - and frankly, I'd suggest others do the same. We can evaluate possible fixes once we better understand what we're fixing...

@abadger
Copy link
Contributor Author

abadger commented May 10, 2020

Cool, I'm available on IRC when you are ready to talk about it from a deeper level. I can talk there and post summaries here or someone can ping me to take a look at updates here. (If I'm busy, once something goes out of my mental buffer, it can be hard to get it back in without pinging me there to get me to look at something ;-)

When you get back to this, regarding --ignore-installed, I believe that --ignore-installed is for pip when it is deciding what packages need to be installed. This issue doesn't have to do with the packages pip is depsolving for. It has to do with what packages the python interpreter is locating for satisfying imports as it runs pip itself. So --ignored-installed won't have any effect on that.

@abadger
Copy link
Contributor Author

abadger commented May 10, 2020

I'm going to post this here, too because I can see a misunderstanding but I have no expectation that you'll read it and get back to me for a while:

I changed the title on #8213. Your suggested title was slightly different than what the problem actually is. It's not that pip should ignore the user's environment when it reinvokes itself for an isolated build; it's that pip should not place any directories with libraries other than pip before the stdlib in sys.path.

Here's a step through of how sys.path is being modified:

  • I have pip and enum34 installed into /usr/lib/python3.7/site-packages/
  • I invoke /usr/bin/pip install git+git://github.com/pypa/pip
  • At this point, the relevant sys.path entries look like this: ['/usr/lib64/python3.7', '/usr/lib64/python3.7/lib-dynload', '/usr/lib/python3.7/site-packages']
    • The first two entries are for the stdlib. The last entry is for site-packages. The stdlib entries come first so python is going to find the stdlib's enum.py. These are just standard sys.path entries; unmodified by anything the user has done.
  • pip does its work and decides it needs to build in an isolated environment. So it ends up in the build_env.py file and re-invokes itself via: /usr/bin/python3.7 /usr/lib/python3.7/site-packages/pip/
  • Python decides this is a python module and it can invoke it. It finds the __main__.py and starts executing it.
  • sys.path should be the same as the outer invocation of pip as PYTHONPATH hasn't changed and build_env didn't try to make any changes to what libraries are available to what pip sees. So the relevant portions would once again be: ['/usr/lib64/python3.7', '/usr/lib64/python3.7/lib-dynload', '/usr/lib/python3.7/site-packages']
  • When Python gets here: https://github.com/pypa/pip/blob/master/src/pip/__main__.py#L15 __package__ is the empty string because that's what it is set to when you invoke a module this way in python. So the conditional is True and at this point it modifies sys.path.
  • Since pip is in /usr/lib/python3.7/site-packages/pip, the effect of the path modification is to re-add site-packages to sys.path, but it gets added at the front of the list. So the relevant entries in sys.path now look like this: ['/usr/lib/python3.7/site-packages', '/usr/lib64/python3.7', '/usr/lib64/python3.7/lib-dynload', '/usr/lib/python3.7/site-packages']
    • site-packages now has two entries in sys.path. The one pip.__main__ just added is the first one.
  • The stage is now set for the bug to occur. As more code is executed, Python eventually gets to a chain of imports which yield the Python-3.6 (or later) version of re.py. That version of re.py needs to import enum. It does so and because the re-invoked pip has modified sys.path, it finds and imports /usr/lib/python3.7/site-packages/enum.py which is the enum.py from the enum34 package.
  • re.py in Python-3.6+ expects enum.py to have an IntFlag member. The old version of enum in the enum34 package does not have this member. So when that code is executed, it fails with a traceback.

So the breakage is independent of the usage of the user's site-packages directory. The breakage can occur when pip puts any directory which contains more modules than just pip into sys.path before the stdlib entries there. The changes in any of the proposed fixes are aimed at limiting the scope of the modification to sys.path. #8213 and #8217 work by modifying sys.path in fewer cases. The fix I did not implement works by creating a different sys.path entry which would only contain pip.

I'm sorry for any confusion; I think I tend to think things through in a different way than most people. I read books backwards and get the most out of movies when I've already read the plot :-) So when I lay out how something works, perhaps I lay out my thoughts and analysis in a way that is backwards for other people.

@pfmoore
Copy link
Member

pfmoore commented May 10, 2020

(This popped up in my email, so I ended up reading it - thanks for the detailed post, but please be aware that I haven't thought it through yet. This note is mostly just something that came to mind now, and I want to record it so I don't forget. I'll follow up with a better analysis later).

So this is basically related to the ages-old debate as to whether it's acceptable to "shadow" stdlib modules. It seems to me there are two main points here:

  1. Why does pip need to insert its directory at the front of sys.path (as that opens up the possibility of shadowing)?
  2. Why does enum34 use the import name enum but not present a compatible API to the stdlib enum library that it will inevitably shadow if it ends up ahead of the stdlib on sys.path?

IMO, enum34 isn't correct - it should either be 100% compatible, or have a different name (and require users to try to import enum, and if that fails, import enum34 as enum). But there's not much we can do about that bug, except report it to them and see what they say.

So we're left with working around the issue in pip. Putting the new entry at the end of sys.path would solve the issue, but I've no idea at this point what other issues might be introduced by doing that. Trying to only put our directory at the front of sys.path "when it's safe" probably only adds more fragility to the whole mechanism (what would be safe, after all?)

As I say, I'm not going to think through this now, though.

@pfmoore
Copy link
Member

pfmoore commented Jul 24, 2020

A lot of the code in pip (around build isolation) is "defensive" -- meant to allow us to use recursive subprocess calls while preventing fork bombing OR to make sure we pass any user-provided options down into the process that's populating the build environment.

But surely (?) both of those are needed even in a general library? Protecting against fork-bombs is a general need, and a library needs to allow the user how to invoke pip.

And, within pip itself, we should stop having the recursive nature of build environment population, and transition to doing it in a single process now that we have a proper resolver and better understanding of the expectations around this area.

+1 on this

@pradyunsg
Copy link
Member

But surely (?) both of those are needed even in a general library? Protecting against fork-bombs is a general need, and a library needs to allow the user how to invoke pip.

Yea, but we'd need to disentangle that logic which is what I was trying to flag. :)

Also, it's only needed somewhere in the recursive chain, and pep517 doesn't come in the recursive loop. IOW, pip has these protections and since pep517 is currently just straight up using pip, pep517 doesn't need to have any protections and can "reuse" that pip does it.

Keeping the expectations the same or making pep517 keep track of the recursion might be a design decision to make during this refactor. :)

basnijholt added a commit to basnijholt/aiokef that referenced this issue Sep 16, 2020
basnijholt added a commit to basnijholt/aiokef that referenced this issue Sep 16, 2020
millerdev added a commit to dimagi/commcare-hq that referenced this issue Sep 28, 2020
This is a holdover from long since passed Python 2.7 days. It probably should have been removed long ago. `enum34` is a backport of the stdlib enum module for Python versions < 3.4, so we no longer need it since we require 3.6.

Error seen attempting to `pip install psycogreen==1.0.2` when enum34 is installed:

AttributeError: module 'enum' has no attribute 'IntFlag'

Relevant links:
https://stackoverflow.com/q/43124775/10840
pypa/pip#8214
@uranusjr
Copy link
Member

I’ve been thinking about this recently. Maybe we shouldn’t really run the pip installation to populate the isolated environment in the first place? pip can (for example) build itself into a zip somewhere, and run that instead of the installed source. This zip-building process can be done on-demand when an isolated environment is needed, and cached for the entire session to minimise performance impact (although honestly the delay is probably minor compared to the wheel-building process itself).

vepadulano added a commit to vepadulano/PyRDF that referenced this issue Dec 8, 2020
The enum34 package is included in the standard library since Python 3.4, so there's no need to add it as a requirement when installing PyRDF in a Python 3.4+ environment.
In fact, this requirement is messing with pip (e.g. pypa/pip#8214)
vepadulano added a commit to vepadulano/PyRDF that referenced this issue Dec 8, 2020
The enum34 package is included in the standard library since Python 3.4, so there's no need to add it as a requirement when installing PyRDF in a Python 3.4+ environment.
In fact, this requirement is messing with pip (e.g. pypa/pip#8214)
vepadulano added a commit to vepadulano/PyRDF that referenced this issue Dec 8, 2020
The enum34 package is included in the standard library since Python 3.4, so there's no need to add it as a requirement when installing PyRDF in a Python 3.4+ environment.
In fact, this requirement is messing with pip (e.g. pypa/pip#8214)
vepadulano added a commit to vepadulano/PyRDF that referenced this issue Dec 8, 2020
The enum34 package is included in the standard library since Python 3.4, so there's no need to add it as a requirement when installing PyRDF in a Python 3.4+ environment.
In fact, this requirement is messing with pip (e.g. pypa/pip#8214)
flyte added a commit to flyte/mqtt-io that referenced this issue Jan 18, 2021
@uranusjr
Copy link
Member

@abadger since you’ve done the most work, and likely thought about this a lot more than anyone else, I wonder what do you feel are the caveats around this issue. There were problems raised against the two previous solutions, and maybe we’d need to rethink what we need from the logic, in order to come up with a solution to the right problem.

@uranusjr
Copy link
Member

(I can probably provide time to figure out the technical details about how we can achive what we want, but first we need to understand exactly what we want.)

@abadger
Copy link
Contributor Author

abadger commented Mar 4, 2021

@uranusjr Reading your proposed fix from October, #8214 (comment) I think that would work. If I remember the problem right and understand your solution:

  • Currently, when creating the isolated build environment, pip is munging sys.path, giving the directory that it is installed into precedence over every other location, including the standard library. It does this because it wants to ensure that the pip that is running is used and it can't depend that it comes first in the library lookup without this step.

  • The problem with the way sys.path is currently munged is that there may be other libraries in the directory pip is installed into. If one of those libraries overrides something that generally occurs earlier in the path (in this case enum34 overriding enum from the stdlib) and the library has an api incompatibility, then it can cause a failure.

Your solution should fix this by isolating just the pip code which is needed into a wheel. That code will then be used in preference to any other installation of pip in sys.path which satisfies the first requirement. Since there won't be any other libraries placed earlier in the path, just the pip library via the wheel, this will also satisfy the second problem.

You probably don't need to build pip into a wheel if it's extra steps. If you can isolate the pip code into its own directory and add that directory to sys.path, that would work as well.

@uranusjr
Copy link
Member

uranusjr commented Mar 4, 2021

I did a quick check, creating a zipapp for the pip code base takes <1s on my various machines. There is an overhead running pip install in a zip file (about 0.5s), so in total the overhead is about one second. Pretty acceptable in the grand scheme of things IMO.

This might be the way to go. I’ll try to find some time working on this.

@abadger
Copy link
Contributor Author

abadger commented Apr 3, 2021 via email

inmantaci added a commit to inmanta/inmanta-core that referenced this issue Apr 27, 2021
Bumps [pip](https://github.com/pypa/pip) from 21.0.1 to 21.1.
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a href="https://github.com/pypa/pip/blob/main/NEWS.rst">pip's changelog</a>.</em></p>
<blockquote>
<h1>21.1 (2021-04-24)</h1>
<h2>Process</h2>
<ul>
<li>Start installation scheme migration from <code>distutils</code> to <code>sysconfig</code>. A
warning is implemented to detect differences between the two implementations to
encourage user reports, so we can avoid breakages before they happen.</li>
</ul>
<h2>Features</h2>
<ul>
<li>Add the ability for the new resolver to process URL constraints. (<code>[#8253](pypa/pip#8253) &lt;https://github.com/pypa/pip/issues/8253&gt;</code>_)</li>
<li>Add a feature <code>--use-feature=in-tree-build</code> to build local projects in-place
when installing. This is expected to become the default behavior in pip 21.3;
see <code>Installing from local packages &lt;https://pip.pypa.io/en/stable/user_guide/#installing-from-local-packages&gt;</code>_
for more information. (<code>[#9091](pypa/pip#9091) &lt;https://github.com/pypa/pip/issues/9091&gt;</code>_)</li>
<li>Bring back the &quot;(from versions: ...)&quot; message, that was shown on resolution failures. (<code>[#9139](pypa/pip#9139) &lt;https://github.com/pypa/pip/issues/9139&gt;</code>_)</li>
<li>Add support for editable installs for project with only setup.cfg files. (<code>[#9547](pypa/pip#9547) &lt;https://github.com/pypa/pip/issues/9547&gt;</code>_)</li>
<li>Improve performance when picking the best file from indexes during <code>pip install</code>. (<code>[#9748](pypa/pip#9748) &lt;https://github.com/pypa/pip/issues/9748&gt;</code>_)</li>
<li>Warn instead of erroring out when doing a PEP 517 build in presence of
<code>--build-option</code>. Warn when doing a PEP 517 build in presence of
<code>--global-option</code>. (<code>[#9774](pypa/pip#9774) &lt;https://github.com/pypa/pip/issues/9774&gt;</code>_)</li>
</ul>
<h2>Bug Fixes</h2>
<ul>
<li>Fixed <code>--target</code> to work with <code>--editable</code> installs. (<code>[#4390](pypa/pip#4390) &lt;https://github.com/pypa/pip/issues/4390&gt;</code>_)</li>
<li>Add a warning, discouraging the usage of pip as root, outside a virtual environment. (<code>[#6409](pypa/pip#6409) &lt;https://github.com/pypa/pip/issues/6409&gt;</code>_)</li>
<li>Ignore <code>.dist-info</code> directories if the stem is not a valid Python distribution
name, so they don't show up in e.g. <code>pip freeze</code>. (<code>[#7269](pypa/pip#7269) &lt;https://github.com/pypa/pip/issues/7269&gt;</code>_)</li>
<li>Only query the keyring for URLs that actually trigger error 401.
This prevents an unnecessary keyring unlock prompt on every pip install
invocation (even with default index URL which is not password protected). (<code>[#8090](pypa/pip#8090) &lt;https://github.com/pypa/pip/issues/8090&gt;</code>_)</li>
<li>Prevent packages already-installed alongside with pip to be injected into an
isolated build environment during build-time dependency population. (<code>[#8214](pypa/pip#8214) &lt;https://github.com/pypa/pip/issues/8214&gt;</code>_)</li>
<li>Fix <code>pip freeze</code> permission denied error in order to display an understandable error message and offer solutions. (<code>[#8418](pypa/pip#8418) &lt;https://github.com/pypa/pip/issues/8418&gt;</code>_)</li>
<li>Correctly uninstall script files (from setuptools' <code>scripts</code> argument), when installed with <code>--user</code>. (<code>[#8733](pypa/pip#8733) &lt;https://github.com/pypa/pip/issues/8733&gt;</code>_)</li>
<li>New resolver: When a requirement is requested both via a direct URL
(<code>req @ URL</code>) and via version specifier with extras (<code>req[extra]</code>), the
resolver will now be able to use the URL to correctly resolve the requirement
with extras. (<code>[#8785](pypa/pip#8785) &lt;https://github.com/pypa/pip/issues/8785&gt;</code>_)</li>
<li>New resolver: Show relevant entries from user-supplied constraint files in the
error message to improve debuggability. (<code>[#9300](pypa/pip#9300) &lt;https://github.com/pypa/pip/issues/9300&gt;</code>_)</li>
<li>Avoid parsing version to make the version check more robust against lousily
debundled downstream distributions. (<code>[#9348](pypa/pip#9348) &lt;https://github.com/pypa/pip/issues/9348&gt;</code>_)</li>
<li><code>--user</code> is no longer suggested incorrectly when pip fails with a permission
error in a virtual environment. (<code>[#9409](pypa/pip#9409) &lt;https://github.com/pypa/pip/issues/9409&gt;</code>_)</li>
<li>Fix incorrect reporting on <code>Requires-Python</code> conflicts. (<code>[#9541](pypa/pip#9541) &lt;https://github.com/pypa/pip/issues/9541&gt;</code>_)</li>
</ul>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a href="https://github.com/pypa/pip/commit/2b2a268d25963727c2a1c805de8f0246b9cd63f6"><code>2b2a268</code></a> Bump for release</li>
<li><a href="https://github.com/pypa/pip/commit/ea761a6575f37b90cf89035ee8be3808cf872184"><code>ea761a6</code></a> Update AUTHORS.txt</li>
<li><a href="https://github.com/pypa/pip/commit/2edd3fdf2af2f09dce5085ef0eb54684b4f9bc04"><code>2edd3fd</code></a> Postpone a deprecation to 21.2</li>
<li><a href="https://github.com/pypa/pip/commit/3cccfbf169bd35133ee25d2543659b9c1e262f8c"><code>3cccfbf</code></a> Rename mislabeled news fragment</li>
<li><a href="https://github.com/pypa/pip/commit/21cd124b5d40b510295c201b9152a65ac3337a37"><code>21cd124</code></a> Fix NEWS.rst placeholder position</li>
<li><a href="https://github.com/pypa/pip/commit/e46bdda9711392fec0c45c1175bae6db847cb30b"><code>e46bdda</code></a> Merge pull request <a href="https://github-redirect.dependabot.com/pypa/pip/issues/9827">#9827</a> from pradyunsg/fix-git-improper-tag-handling</li>
<li><a href="https://github.com/pypa/pip/commit/0e4938d269815a5bf1dd8c16e851cb1199fc5317"><code>0e4938d</code></a> 📰</li>
<li><a href="https://github.com/pypa/pip/commit/ca832b2836e0bffa7cf95589acdcd71230f5834e"><code>ca832b2</code></a> Don't split git references on unicode separators</li>
<li><a href="https://github.com/pypa/pip/commit/1320bac4ff80d76b8fba2c8b4b4614a40fb9c6c3"><code>1320bac</code></a> Merge pull request <a href="https://github-redirect.dependabot.com/pypa/pip/issues/9814">#9814</a> from pradyunsg/revamp-ci-apr-2021-v2</li>
<li><a href="https://github.com/pypa/pip/commit/e9cc23ffd97cb6d66d32dc3ec27cf832524bb33d"><code>e9cc23f</code></a> Skip checks on PRs only</li>
<li>Additional commits viewable in <a href="https://github.com/pypa/pip/compare/21.0.1...21.1">compare view</a></li>
</ul>
</details>
<br />

[![Dependabot compatibility score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=pip&package-manager=pip&previous-version=21.0.1&new-version=21.1)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually

</details>
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 30, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.