Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Upgrade vendored requests to 2.26.0 #10174

Merged
merged 5 commits into from
Jul 23, 2021
Merged

Conversation

illia-v
Copy link
Contributor

@illia-v illia-v commented Jul 19, 2021

https://github.com/psf/requests/blob/master/HISTORY.md#2260-2021-07-13

psf/requests@v2.25.1...v2.26.0

requests 2.26.0 allows using charset_normalizer instead of chardet, but I did not vendor the former because chardet is used by other parts of pip.

https://github.com/psf/requests/blob/master/HISTORY.md#2260-2021-07-13
psf/requests@v2.25.1...v2.26.0

requests 2.26.0 allows using charset_normalizer instead of chardet, but I did not vendor the former because chardet is used by other parts of pip.
@pfmoore
Copy link
Member

pfmoore commented Jul 19, 2021

I did not vendor the former because chardet is used by other parts of pip.

I just did a grep and I don't see where else pip uses chardet (except in the vendored html5lib, where it seems to be optional). Can you clarify where you think it's used?

@illia-v
Copy link
Contributor Author

illia-v commented Jul 19, 2021

I did not vendor the former because chardet is used by other parts of pip.

I just did a grep and I don't see where else pip uses chardet (except in the vendored html5lib, where it seems to be optional). Can you clarify where you think it's used?

I meant the use in html5lib

@uranusjr
Copy link
Member

How hard would it be to patch html5lib to use charset_normalizer instead? I took a quick look and it seems that html5lib is using an interface not present in charset_normalizer, but I’m not faimilar with the internals of either to understand whether the patch would be difficult.

@pfmoore
Copy link
Member

pfmoore commented Jul 20, 2021

I don't want to have this flare up into another licensing discussion, so I won't link to the original thread here, but upgrading requests to 2.26.0 is fine with me as a "normal" upgrade of a vendored dependency. If html5lib can't use charset_normalizer, then IMO it's up to the people for whom our vendoring of chardet is an issue to raise that with html5lib. I'm -1 on carrying patches to html5lib to do that, as it sucks us into that whole licensing argument (which is IMO rather toxic, if the requests and pip threads I've read are anything to go by).

If simply dropping chardet as a dependency, and having html5lib fall back to whatever it does when chardet isn't available, is acceptable, then I'm fine with doing that as well (but it should probably be in a separate PR, as it introduces a potential behaviour change).

@uranusjr uranusjr added this to the 21.2 milestone Jul 22, 2021
@uranusjr
Copy link
Member

It seems like this PR is not updating the vendoring metadata correctly. Could you take a look at the CI error and the documentaion on Vendoring Policy and see if you could fix the issue?

@illia-v
Copy link
Contributor Author

illia-v commented Jul 22, 2021

It seems like this PR is not updating the vendoring metadata correctly. Could you take a look at the CI error and the documentaion on Vendoring Policy and see if you could fix the issue?

Sorry, I fixed the problem

Comment on lines 11 to 14
from pip._vendor import chardet
try:
from pip._vendor import chardet
except ImportError:
import charset_normalizer as chardet
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we patch this to remove the try-except block? I feel it's more predictable if we always import the vendored chardet.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've just patched it.

I thought about an alternative: substituting all charset_normalizer imports with pip._vendor.charset_normalizer using vendoring capabilities. This way we won't have to maintain the additional patches, and switching to charset_normalizer will be smoother.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That sounds like a good idea. Would it be easier if I merge this first or can you do that in this PR?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On second thought, let’s have this in main so we can more easily evaluate this alternative solution.

@uranusjr uranusjr merged commit 0fb0e3b into pypa:main Jul 23, 2021
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 27, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants