
Run auditwheel on new manylinux uploads, reject if it fails #5420

Open
dstufft opened this issue Feb 12, 2019 · 18 comments

Comments

@dstufft
Member

dstufft commented Feb 12, 2019

What's the problem this feature will solve?

Projects are deliberately uploading invalid manylinux1 wheels to PyPI, which cause crashes and other breakage when end users erroneously install them.

Describe the solution you'd like

PyPI should be as strict as possible about the wheels it allows to be uploaded, and thus should do something like run auditwheel when the upload is a manylinux wheel, and reject it if it's not compliant.

Additional context

See tensorflow/tensorflow#8802 as an example. We will likely need to figure out if auditwheel is safe to run or if we need to do something like farm this out to a worker process that does it in isolation (and if so, we might need to adjust the upload API to allow it to be async).
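A rough sketch of what such an upload gate might look like (all names here are hypothetical, and the exact format of `auditwheel show` output is an assumption; a real implementation would parse it properly rather than substring-match):

```python
# Hypothetical sketch of the proposed gate: if the uploaded filename
# claims a manylinux platform tag, run "auditwheel show" on the file
# and reject the upload when the claimed tag is not among the tags
# auditwheel reports the wheel as actually eligible for.
import re
import subprocess

MANYLINUX_RE = re.compile(r"manylinux[0-9_]+_(x86_64|i686|aarch64)")

def claimed_manylinux_tag(filename: str):
    """Return the manylinux platform tag claimed by the filename, if any."""
    m = MANYLINUX_RE.search(filename)
    return m.group(0) if m else None

def check_upload(wheel_path: str, filename: str, runner=subprocess.run) -> bool:
    """Accept the upload unless it claims manylinux and fails the audit."""
    tag = claimed_manylinux_tag(filename)
    if tag is None:
        return True  # not a manylinux wheel; nothing to audit
    result = runner(
        ["auditwheel", "show", wheel_path],
        capture_output=True, text=True, timeout=60,
    )
    # auditwheel prints the platform tags the wheel actually earns;
    # reject if the claimed tag is not among them.
    return result.returncode == 0 and tag in result.stdout
```

The `runner` parameter is only there so the audit step can be stubbed out in tests or swapped for an isolated worker, per the "Additional context" above.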

@njsmith
Contributor

njsmith commented Feb 12, 2019

CC @ehashman

@ehashman

Also cc @di, given the tensorflow issue.

@njsmith
Contributor

njsmith commented Feb 12, 2019

Relevant information in case anyone's worried this is too harsh or something: In response to a twitter thread calling out PyTorch for breaking the manylinux spec, Soumith Chintala (PyTorch's creator) just tweeted:

I'd be 100% supportive of you rejecting non-compliant wheels as well. It applies a uniform standard to everyone, and we'd move the needle much faster with vendors than we are right now.

The motivation here isn't to punish particular companies for being naughty, but to (a) improve the experience for PyPI's users, (b) give the developers at those companies the leverage they need to get things fixed.

@di
Member

di commented Feb 12, 2019

Yeah, probably overdue for an update, but I'm currently working with the TensorFlow team and other interested parties to try to determine what can be done here to improve the situation.

I think it's probably worth saying here (for anyone following along) that while I do think PyPI should eventually start auditing uploaded wheels, this is not going to happen overnight or without fair warning.

I'm interested in ensuring that PyPI is a source of artifacts that are reliable for users, but I'm also interested in putting in the work necessary to ensure that we have specifications to support the increasingly common use cases that currently lead to this issue.

I'll post a discussion thread soon to add more detail and discuss in-depth -- let's keep this thread on-topic.

@njsmith
Contributor

njsmith commented Feb 12, 2019

We will likely need to figure out if auditwheel is safe to run or if we need to do something like farm this out to a worker process that does it in isolation (and if so, we might need to adjust the upload API to allow it to be async).

I believe the situation is: when auditwheel checks a wheel, it unpacks the zip, reads the wheel metadata, and, for any binaries inside, uses pyelftools to read metadata out of the ELF headers. It doesn't execute any code from the binaries. So in theory this isn't any more dangerous than, e.g., trying to decompress a potentially-malicious zip file. And pyelftools is pure Python (or at least its README says it is :-)), so at least in theory we should be safe from really egregious issues like buffer overflows.

That said, I doubt anyone ever thought much about using it on untrusted binaries, so I'd still be wary of less-dangerous issues like carefully crafted binaries causing it to allocate a bunch of memory or spin on CPU. It might make sense to run it in a subprocess with some rlimits set. And it'd be good for someone to double-check everything I just wrote :-). But at least in principle, there's nothing especially dangerous about auditing manylinux wheels.

auditwheel repair is a different matter – it uses patchelf, which is a complicated C++ program that I'm certain has never had any security audits. But PyPI won't be trying to repair wheels, just check them.
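The "subprocess with some rlimits set" idea above could be sketched as follows on Linux (the specific limits of 30 CPU-seconds and 2 GiB of address space are illustrative guesses, not vetted values):

```python
# Sketch of running an untrusted-input audit in a child process under
# resource limits, so a crafted wheel can't pin a web worker's CPU or
# exhaust its memory. Linux/POSIX only (uses preexec_fn and resource).
import resource
import subprocess
import sys

def _limit_resources():
    # Runs in the child between fork and exec: cap CPU seconds and
    # total virtual address space before the audit tool starts.
    resource.setrlimit(resource.RLIMIT_CPU, (30, 30))
    resource.setrlimit(resource.RLIMIT_AS, (2 << 30, 2 << 30))

def run_limited(cmd, timeout=60):
    """Run cmd in a child process under the rlimits above."""
    return subprocess.run(
        cmd,
        preexec_fn=_limit_resources,   # applied in the child, pre-exec
        capture_output=True, text=True,
        timeout=timeout,  # wall-clock backstop on top of the CPU limit
    )

# The audit itself would then be something like:
#   run_limited([sys.executable, "-m", "auditwheel", "show", wheel_path])
```

The `timeout` matters because `RLIMIT_CPU` only bounds CPU time; a process sleeping or blocked on I/O would never hit it.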

@dstufft
Member Author

dstufft commented Feb 12, 2019

I'm personally willing to take a fairly hard-line stance on this. The presumption when manylinux1 was added was that projects would be good citizens and either upload properly produced wheels or not upload manylinux1 wheels at all. Of course mistakes happen and people can ship broken wheels, but that should then be treated as a bug: either fix it, or stop producing the faulty wheels until they can be produced correctly.

Obviously this won't happen overnight; for one thing, the feature has to be implemented at all, and I wouldn't see us treating it any differently than we did the HIBP rollout: gather metrics to gauge impact, set a date, possibly send an email, then shut it off on that date. We can use metrics and outside information to inform that date, but to be clear, I'm perfectly OK with leaving projects who don't fix themselves out in the cold.

IOW, yes we should do this, no we shouldn't make it a surprise, yes we should use data to inform how aggressive we are, no we shouldn't let any one project dictate our strategy here.

@di
Member

di commented Mar 22, 2019

FYI, I've started a discussion on the next manylinux spec here: https://discuss.python.org/t/the-next-manylinux-specification/

@di
Member

di commented May 6, 2019

At the PyCon 2019 packaging summit, we determined that this is blocked on #726.

@brainwane
Contributor

Once we have the ability to do this, should we also consider running auditwheel retrospectively on existing release artifacts? Thinking about pypa/manylinux#305 (see pypa/manylinux#305 (comment)).

@takluyver
Contributor

I don't think it makes sense to retrospectively invalidate wheels that are incompatible with new distributions, as in that case. We presumably want to maintain the rule that you can't replace an already uploaded package, so the only option would be to remove them or somehow mark them as bad so installers don't consider them. But those wheels are still useful for users on other popular distros.

It may be worth identifying affected wheels and alerting the maintainers of those packages to think about uploading a new version. But that's probably quite a separate feature from gating uploads.

@jamadden
Contributor

jamadden commented Oct 2, 2019

We've also seen wheels being uploaded with generic platform tags like py3-none-any but containing and requiring (non-manylinuxX compatible) shared libraries (e.g., #6400, #6403). Would it be worth extending the check to such wheels? Or simply denying uploading generic wheels that in fact contain platform-specific binaries?

@takluyver
Contributor

It's possible to imagine packages containing platform-specific binaries which are nonetheless cross-platform - e.g. the binaries are only used on the relevant platform, and it works without them on other platforms, or it's a library for introspecting such binary files, and includes them as test cases.

@jamadden
Contributor

jamadden commented Oct 2, 2019

That's true. So I suppose the interesting case is packages (distributions) that ship binary Python modules with the intent of importing them, or binary shared libraries with the intent of using ctypes or similar on them. Without a pure-Python fallback, the package doesn't work, or only works partly.

I only bring it up because it's a potential annoyance for end users, and a small burden on PyPI infrastructure and admins (all the cases of this I've run into, I only noticed because the distributions requested upload limit increases, meaning they need extra storage, bandwidth, and admin time).

I don't know if it's worth trying to do anything about that or not (that's why I asked 😄). The false-positive rate might be too high?

Most of the time so far, the mislabeling has been accidental, and the packagers have worked to update their build systems to produce manylinux-compliant wheels, so I'm guessing they would have welcomed an error telling them that their distribution didn't work the way they wanted it to. I haven't yet heard from a packager who was deliberately trying to ship single-platform modules to all platforms and sidestep the manylinux requirements to "sneak" binaries onto PyPI.

@takluyver
Contributor

I'd be OK with putting in restrictions like that if there was a way for projects to get them lifted (like the size restriction). I don't think either of the cases I mentioned are very common.

But do we have evidence of how common it is that people are uploading platform-specific wheels with -none-any tags? The two issues you linked before are from the same author, so it could be a one-off.

@jamadden
Contributor

jamadden commented Oct 2, 2019

I've only seen a few issues, those that resulted in limit requests. I don't have evidence, just anecdotes.

One other notable one was #6197, which involved distributing an actual executable, the running of which was the main point of the Python code. It linked to a very specific set of system libraries and was uploaded as py3-none-any. The executable was also included in the source .tar.gz. We worked through those issues and ended up with a buildable source archive (theoretically cross-platform), but hadn't yet gotten it to produce a manylinux wheel; it was at least producing linux-x86_64 wheels, though.

I'm not pointing these concerns out to the packagers to be pedantic or malicious, but to try to help them do what it looks like they're trying to do, and to increase the overall quality of the ecosystem. That would be the reason behind the warning or error too: a clear message about the problem, with some info about where to go for help, instead of the packager being contacted by end users and having no idea what's wrong. I don't know whether automating that is truly helpful, or indeed whether I should stop checking for problems like that myself.

@brainwane brainwane removed the blocked Issues we can't or shouldn't get to yet label Apr 3, 2020
@brainwane
Contributor

(I think this ties in to pypa/packaging-problems#25 .)

@di said today in IRC that, in his opinion, this does not need to block on #726.

What actual hardware or other services would we need in order to implement this just on new uploads? Just the auditwheel check?

@dstufft
Member Author

dstufft commented Apr 3, 2020

I think this is just a SMOP (a simple matter of programming); AFAIK auditwheel just does static analysis, so we just need to actually wire things up.

@brainwane
Contributor

Implementing this would -- per @pradyunsg's opinion in IRC just now -- help mitigate the compatibility issues that he sees users face. And in my assessment, it would help reduce how often users get into bad situations, and thus reduce the general support load on the people who support Python packaging users. Since rolling out the new pip resolver will probably lead to a burst of support requests, I'd love more progress on this feature.

Per discussion in IRC today, we may need to implement #7730 first. More discussion is welcome.
