Remove "twine register" reference in distributing.rst #271

Merged
merged 2 commits into from
Apr 18, 2017


jni
Contributor

@jni jni commented Nov 8, 2016

Fixes #263.

@jni
Contributor Author

jni commented Dec 20, 2016

Is anybody maintaining this thing? This is not a complicated fix.

@ncoghlan
Member

At least in my case, it's the need to investigate the discrepancy between the proposed fix here and the "if necessary" caveat in the twine README that kept me from merging the change: https://github.com/pypa/twine/#usage

Looking at pypa/twine#200 it seems the fact it's outright failing for you (rather than being harmlessly redundant) is a bug in Warehouse rather than an error in the documentation, as anyone using the old implementation at pypi.python.org for uploads will still need to register explicitly.

@jni
Contributor Author

jni commented Dec 23, 2016

@ncoghlan I don't really understand the difference between warehouse vs pypi.python.org. How would a user select between the two? Is it not the case that most newcomers will encounter the same problem I did?

@ncoghlan
Member

The difference relates to the configuration settings mentioned here: https://packaging.python.org/distributing/#create-an-account

The upload.pypi.org/legacy setting is the one that chooses the Warehouse implementation of the legacy API (which is generally more stable for actual file uploads) over the legacy service itself. The problem you've hit is that there's an incompatibility in the Warehouse emulation of the legacy API, where the explicit registration endpoint is throwing an error when it shouldn't.

However, not everyone is going to have that new setting - many will have a config file that still uses pypi.python.org, in which case they still need to do the explicit registration first. That means just dropping the documentation of the explicit registration step isn't the right thing to do either - either the bug in Warehouse needs to be fixed (so the steps just work as written) or else the docs need to warn people about the bug, so they know they can just proceed to the next step regardless if they're using upload.pypi.org as their upload server.

@jni
Contributor Author

jni commented Dec 28, 2016

@ncoghlan I'm happy to make the requested change, but how does one check whether they need to do it?

@brettcannon
Member

@jni Basically you check by making sure you aren't using the new endpoint in your .pypirc file and then try uploading a new project using twine without first registering.
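
As an illustration of the first half of that check, here is a minimal Python sketch that reads the text of a .pypirc file and reports which upload endpoint its [pypi] section points at. The function name classify_upload_endpoint and the returned labels are hypothetical; only the two hostnames come from this thread, and the sample file contents are made up for the example.

```python
import configparser
from urllib.parse import urlparse

# Hostnames taken from the discussion above; the labels are illustrative only.
LEGACY_HOST = "pypi.python.org"
WAREHOUSE_HOST = "upload.pypi.org"


def classify_upload_endpoint(pypirc_text, section="pypi"):
    """Return 'legacy', 'warehouse', 'default', or 'other' for a .pypirc section."""
    parser = configparser.RawConfigParser()
    parser.read_string(pypirc_text)
    if not parser.has_option(section, "repository"):
        # No explicit URL: the client tool's built-in default applies.
        return "default"
    host = urlparse(parser.get(section, "repository")).hostname
    if host == LEGACY_HOST:
        return "legacy"
    if host == WAREHOUSE_HOST:
        return "warehouse"
    return "other"


# Hypothetical file contents, not copied from any real configuration.
old_style = """
[distutils]
index-servers =
    pypi

[pypi]
repository: https://pypi.python.org/pypi
username: example-user
"""

new_style = """
[pypi]
repository: https://upload.pypi.org/legacy/
username: example-user
"""

print(classify_upload_endpoint(old_style))  # legacy
print(classify_upload_endpoint(new_style))  # warehouse
```

A result of "legacy" would suggest the explicit registration step may still be needed; the second half of the check (actually trying an upload without registering) is not covered by this sketch.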

@jni
Contributor Author

jni commented Apr 10, 2017

@brettcannon I honestly don't know what to do with this PR (and the many others referencing it). I'm confused about the multiple endpoints, whether they will continue to exist, and which endpoint supports new or old APIs.

My two cents is that this documentation is intended for newcomers to Python packaging, who are unlikely to have a .pypirc file at all, and who need simplicity, not five branch points asking them to look inside files that they don't know exist. My suggestion is to merge this as is and then add the multiple endpoints discussion to a separate "troubleshooting" section.

I've allowed edits to my branch from maintainers, so do with it what you will.

@ncoghlan
Member

@jni If people don't have a .pypirc file, then the instructions should work as written, and if they don't, then that's a bug in twine where the default registration endpoint needs to be reverted until the regression in Warehouse is fixed.

@ncoghlan
Member

@dstufft It would really be nice to just have the API regression fixed in Warehouse, so this whole problem goes away :)

@jni
Contributor Author

jni commented Apr 10, 2017

@ncoghlan fix or no fix, if you look at the discussion in pypi/warehouse#1627, the preferred approach is to upload directly.

@jni
Contributor Author

jni commented Apr 10, 2017

At least that's my reading of it.

@ncoghlan
Member

@jni Thanks for the pointer - I've chimed in over there as well :)

@theacodes
Member

Just for my understanding - where are we at on this change? Does it still need to be made?

@ncoghlan
Member

@jonparrott The current status is that the instructions are still self-contradictory as @jni pointed out:

We just didn't originally notice the contradiction, since the folks encountering the upload problems with the legacy service necessarily already had their projects registered.

One option that would allow this to be resolved independently of pypi/warehouse#1627 would be to restructure this section to cover two different paths:

  • Implicit project registration
    • autoregister the project when you upload the artifacts
    • twine only, but already works with the more reliable Warehouse upload API
  • Explicit project registration
    • allows the project to be registered in advance of uploading any artifacts
    • allows the online description and other metadata to be updated without publishing a new version
    • currently requires the use of the legacy PyPI service

This could be a good improvement anyway, since the first section can be completely opinionated (use the Warehouse API, use twine upload), with only the second section presenting the user with any other decisions to make.

@theacodes
Member

I'd prefer to take a strong stance instead of being wishy-washy. I personally think we should go for implicit registration and just add a note that it's also possible to do it explicitly and perhaps link off somewhere else if someone wants to go that route. WDYT?

@dstufft
Member

dstufft commented Apr 11, 2017

The only problem with that is legacy PyPI needs an explicit registration once (but only once).

I'm not sure that matters though because Python, twine, setuptools all default to using Warehouse now, so only people on older Pythons/twine/setuptools will upload to legacy PyPI unless they have a ~/.pypirc that explicitly points them back at legacy.

In general though I'm +1 on focusing on implicit registration.

@pfmoore
Member

pfmoore commented Apr 11, 2017

> I'm not sure that matters though because Python, twine, setuptools all default to using Warehouse now, so only people on older Pythons/twine/setuptools will upload to legacy PyPI unless they have a ~/.pypirc that explicitly points them back at legacy.

I have a ~/.pypirc with the legacy URL in it. I have that because I set it up ages ago, from what I recall by cargo-culting some advice on the net, to allow me to configure my username and password so I didn't have to enter them every time I did an upload. I doubt that's uncommon. The problem is that I need to put the user details in my .pypirc under the [pypi] section, because that's the one twine uses by default, and it's not clear to me that I can just have a username in that section, without a URL. Can I? If so, I guess I can just remove the repository: line from that section.

I'm not aware that the .pypirc file format is documented anywhere. Because of this, I suspect more people than you might think will have an explicit URL in their .pypirc, because they copied the same advice I did (wherever I got it from).

More generally, if I were reading this section of the guide, things I'd want to see (as in, I'd be expecting to find) are:

  1. If I have a new project, how do I register it? Typically, I would expect to pre-register before I had my project packaged, if only so that I can claim the name before I write all the code that has the name included. If pre-registration isn't the recommended approach, then I'd like guidance on how to prepare a "dummy" release that had no working code but just the metadata present.
  2. How do I upload my release to my project? (Obvious one)
  3. How do I configure the tools so that I don't have to enter my username every time? Ideally I'd want to store my password too - assume I understand that storing my password in plain text in a config file is bad practice, but I still don't want to have to enter my password every time, so what are my options? For example, both git and Mercurial remember my password, so telling me I shouldn't do that goes counter to my experience.
  4. I'd like to see a mention of testpypi, for "practicing" the process. When I was first using PyPI, the biggest issue for me was not just dumping my new project straight into the live PyPI before I knew what I was doing. So "how to test you're doing it right" is important. That leads into the question of how I configure multiple repositories.

It's quite possible that (3) and (4) are not in scope for this guide, but I'd like at a minimum a "see more" pointer to the full documentation.

One other thought - is testpypi going to still be available after Warehouse goes live? Is there a warehouse-based testpypi at the moment (I'd have guessed it would be at https://testpypi.org/pypi, but that doesn't exist)? Would it actually be better to suggest to people that they use a local devpi instance to practice their release management, and then retire testpypi?

@dstufft
Member

dstufft commented Apr 11, 2017

I think that (3) can be solved without the plaintext using keyring support that exists in twine now, although I think it's not very user friendly yet until pypa/twine#216 is solved.

(4) There is currently a test Warehouse at test.pypi.org/pypi, although long term I want to shut it down and instead separate the idea of pushing a release to real PyPI from publishing it to be generally available. That separation is useful both for testing releases (you can just cancel the temporary upload, or have it auto delete after a week or so) and for people who want to build up a number of release artifacts across different systems and test the uploaded bits before publishing. That is tracked in pypi/warehouse#726.

@jwodder
Contributor

jwodder commented Apr 12, 2017

@pfmoore The ~/.pypirc file is documented at https://docs.python.org/3/distutils/packageindex.html#the-pypirc-file. The docs say that you can leave out the repository: line and have it default to https://upload.pypi.org/legacy/, which apparently became the default in Python 3.5.
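
To make that concrete, here is a rough Python sketch of a .pypirc that stores only a username and leaves out the repository: line. The helper names minimal_pypirc and effective_repository are hypothetical, the username is a placeholder, and the fallback simply mirrors the default URL quoted above rather than calling into distutils or twine.

```python
import configparser
import io

# Default URL as quoted in the comment above; the helper names are hypothetical.
DEFAULT_REPOSITORY = "https://upload.pypi.org/legacy/"


def minimal_pypirc(username):
    """Render a .pypirc that stores only a username for the [pypi] section."""
    parser = configparser.RawConfigParser()
    parser.add_section("distutils")
    parser.set("distutils", "index-servers", "pypi")
    parser.add_section("pypi")
    parser.set("pypi", "username", username)
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


def effective_repository(pypirc_text, section="pypi"):
    """Return the section's repository URL, falling back to the default."""
    parser = configparser.RawConfigParser()
    parser.read_string(pypirc_text)
    return parser.get(section, "repository", fallback=DEFAULT_REPOSITORY)


text = minimal_pypirc("example-user")
print(text)
print(effective_repository(text))
```

The sketch only models the lookup; whether a given Python or twine version applies the same default should be checked against that version's documentation.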

@pfmoore
Member

pfmoore commented Apr 12, 2017

@jwodder My apologies - thank you for pointing this out. That must either have changed since I looked (which was a long time ago, admittedly) or I looked in the wrong place.

@ncoghlan
Member

As @pfmoore notes, the idea of only recommending an implicit registration based workflow seems strange to me, as it's inherently prone to race conditions - there may be a period of days or weeks where you've committed to a particular name, and are actively developing the code using that name (whether in private or in the open), but don't have anything worth publishing to PyPI yet.

Implicit registration mainly seems to be useful in cases where people already have a project that has been around for a while, and decide "Oh, I should probably publish this to PyPI, let me see if the name is still available". And even then, it seems weird to only be able to register the name after writing your setup.py, building an sdist, and having it ready to upload, rather than going:

  • register to lock in the name
  • build your releasable sdist (and optionally, wheel files)
  • upload your first release

It feels akin to only being able to register a domain name at the time you first publish the associated site, rather than having "register the domain name" and "publish a site update" being clearly distinct activities.

Perhaps that's the underlying problem here? "Register your project" and "Upload your first release" should really be covered as distinct steps, but that's currently obscured by the fact that even with the legacy PyPI API you still need at least a minimal sdist for the registration step due to the way the client tooling works (it just doesn't need to be a releasable one).

@dstufft
Member

dstufft commented Apr 13, 2017

Honestly, I'm not too fond of the idea of "claiming" a name anyways. AFAIK the other popular languages don't really allow it (RubyGems doesn't allow it implicitly by only allowing you to upload things, npm explicitly has a policy against it saying the name belongs to the first person who publishes an actual project).

As a user of pip and PyPI, it is frustrating to me when I find a project whose name appears to do what I want, but whose page is a placeholder waiting for some code. More often than not this has been a placeholder for some period of time because the person came up with a great name, claimed it, then never ended up actually publishing anything under it. There is obviously no technical means we can employ that prevents people from squatting names: if we require a tarball, then people will just upload a tarball; if we require it to be updated frequently, then people will just regularly update it. Since there is no "cost" associated with taking ownership of a name on PyPI, people are incentivized to do so greedily in case they might ever use such a name, rather than when they actually need it. However, just because we can't prevent it doesn't mean we really need to continue to keep around workflows that primarily exist to benefit it. Longer term I would love to provide a policy similar to what npm has here, which is "produce working code for your name, or get out of the way for someone who does".

On the surface the comparison with DNS names seems like a fair one, but it is different in a few key places. For one, DNS names cost money, so there is an incentive to keep only names you are still planning on using. In addition to that, DNS names need to be renewed, so if you bought a name that you were planning on using, but then decided not to do so, typically you'll stop renewing that name, it will expire and be released back into the pool of available names. Finally, the way people discover a domain name and the way people decide if a domain name is taken are distinct mechanisms, whereas for PyPI it is the same mechanism, so simply hiding claimed but unused packages from search (or deprioritizing them) doesn't work, because either you make it appear like the name is available when it's not (by hiding it from search results, making a better UX for people looking for a thing to use) or you make it obvious something by that name exists (by showing it in the search results, making a better UX for people looking to name their project).

@pfmoore
Member

pfmoore commented Apr 13, 2017

If squatting is your issue, we could presumably allow pre-registering, then expire the registration if no upload were made within (say) 3 months. That allows people to decide on a name, then have sufficient time to develop something before being forced to upload. Honestly, I don't see how "doesn't do anything, just a placeholder" uploads are better than registered projects with no files available. And at least the latter are easier to locate (and delete, if we decide to).

The baseline for me is that PyPI has this feature, so Warehouse should too. Whether we do it via twine or via a web form is a matter of UI, but pre-registration of names is an existing feature of PyPI, so if the proposal is to remove it, then that needs to be agreed. Maybe the question should be raised on distutils-sig? If the majority there has the view that name squatting is a sufficiently significant issue to warrant enforcing a (IMO) more clumsy project creation workflow, then I'm OK going with the consensus. But I can't say I like it.

@dstufft
Member

dstufft commented Apr 13, 2017

@pfmoore I don't think the workflow is more clumsy? Whenever you're ready to publish you just upload. It removes a required step, thus streamlining the workflow. If you're not ready to publish, then you probably shouldn't be grabbing the name to begin with.

Like I said though, trying to layer on technical solutions to this tends to just be an arms race, if we require some activity within 3 months, people will just make sure they do that when squatting. I'm not super interested in trying to layer in a bunch of technical solutions here because it just makes things a bit harder for everyone else. What I'm somewhat opposed (but not entirely opposed) to is throwing in API end points whose main purpose is squatting names.

@pfmoore
Member

pfmoore commented Apr 13, 2017

What I meant was that I have to choose a name, then build my code with no guarantee that by the time I'm ready to publish, I won't have to change that name. Or alternatively, I create a dummy project with a setup.py but no real code, and upload that as a placeholder. The dummy upload is clumsy IMO.

Maybe I'm being unnecessarily paranoid that someone might grab my cool name, that's certainly possible. But OTOH, not allowing pre-registration is just as much a technical solution to the squatting problem here, as squatters willing to re-register would be just as willing to upload a dummy project. My main argument is that we should have a better justification for removing an existing feature.

Anyhow, I've made my point, I'll leave it to others to make the decision.

@dstufft
Member

dstufft commented Apr 13, 2017

Yea, you want to squat the name with the promise that at some point you're going to upload something to that name :) Maybe you're actually going to upload something relatively soon and thus the impact is small, maybe you're going to get bored of the project and never upload something in which case that name is now being held and others with their own equally cool ideas can't use that name.

The difference with not allowing pre-registration vs trying to require an upload in X amount of time or something, is that pre-registration doesn't have much in the way of actual use cases other than squatting, but more importantly I don't think not implementing it is going to solve anything, I just don't think it's worth implementing it and maintaining it since its primary purpose is something I'm not really wanting to incentivize anyways. Since it's a brand new code base, implementing an existing feature takes more work than "removing" (or really, not implementing) that existing feature, so up front I looked at features we had and I just didn't implement ones that I wasn't super interested in supporting any longer.

Beyond just not wanting to take the time to implement a feature I wasn't really thrilled about continuing to support, there is some evidence that the register API leads to confusion sometimes. I've seen more than one case where projects would register a release with PyPI, but forget to upload it. When they went to PyPI they saw the release there and didn't notice it had no files (since there's no "HEY THERES NO FILES" warning, just the absence of files which is hard to notice) and then got confused when pip install foo didn't pull in their latest release. This isn't a constant every day problem I see people running into, but there is a small trickle every couple of months of a project getting bit by it, so that also led me to feel like this API was more trouble than it's worth.

Finally there is just the cognitive burden: having two things to understand the difference between is inherently harder to understand than one thing to understand (and from a UX perspective, having a user presented with the option to upload and register is a lot worse than the option to just upload). We can look at this issue itself and how different folks are trying to propose different ways to cope with the additional complexity of mentioning the explicit registration at all without making it much more complicated for a new user to understand (and ultimately all of them fall short of the simplicity of "upload when you're ready, that's it").

@ncoghlan
Member

ncoghlan commented Apr 13, 2017

Note that I'm fine with having "register-on-first-release" being the default approach recommended for small individual projects. It just doesn't always work so well in the corporate open source context, where naming things often gets a lot more complicated, there may be trademark lawyers involved, and the minimum requirements for getting to an initial release may be higher.

The main pre-registration use cases that I'm talking about are the ones like https://pypi.python.org/pypi/leappto where we could upload the current sdist as 0.0.1.dev0 if you insisted, but it wouldn't be particularly useful to anyone else, as it still has all the Vagrant and libvirt specific hacks for the proof-of-concept we're currently working on.

In our kind of situation, "Registered, but no releases yet" is a more accurate reflection of the project's current state than "Registered, with only a dummy release" (hence why the metadata also includes the "Pre-alpha" classifier).

Rather than DNS, a better parallel for PyPI is probably GitHub, where creating a repo is a distinct step from pushing useful content to it. I think the GitHub name squatting policy gets this balance right, by making it clear that names cannot be held indefinitely for future use, even though the platform allows you to register arbitrary names if you want to: https://help.github.com/articles/name-squatting-policy/

Making PyPI's explicit policy be "Names registered without making any releases are deemed provisional, and may be automatically relinquished after a period of time" would be relatively straightforward to eventually automate server-side: without the need to judge whether or not a single solitary release on a project is a "real release", it becomes feasible to add a check that means that if a project doesn't make a release within a certain number of days, the provisional registration will lapse, and the name will go back into the generally available pool.

Given such an automated garbage collector, truly malicious actors would take the additional step of either uploading a dummy release or setting up a counter-bot to automate re-registration, but it would be sufficient to handle the benign cases of folks that forgot to deregister a name they reconsidered and decided not to use, as well as those that simply didn't realise it wasn't OK to reserve names indefinitely for possible future use.
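
A minimal sketch of what such a server-side check could look like, assuming a hypothetical Registration record and a 90-day grace period (neither is part of PyPI or Warehouse; the project names and dates are invented for the example):

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Registration:
    """Hypothetical record of a registered project name."""
    name: str
    registered_on: date
    release_count: int


def lapsed_registrations(registrations, today, grace_days=90):
    """Return names registered more than grace_days ago that have no releases."""
    cutoff = today - timedelta(days=grace_days)
    return [
        r.name
        for r in registrations
        if r.release_count == 0 and r.registered_on < cutoff
    ]


# Invented sample data for illustration only.
registrations = [
    Registration("fresh-idea", date(2017, 3, 1), 0),      # still within the grace period
    Registration("abandoned-idea", date(2016, 6, 1), 0),  # no release, long past the cutoff
    Registration("shipped-project", date(2015, 1, 1), 3), # has releases, never lapses
]

print(lapsed_registrations(registrations, today=date(2017, 4, 13)))
# ['abandoned-idea']
```

Because the rule only looks at whether any release exists, it sidesteps the question of whether a given release is a "real" one, which matches the point made above.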

@dstufft
Member

dstufft commented Apr 13, 2017

@ncoghlan I think doing that sanely would require changes to the PyPI data model. You currently cannot get pages like https://pypi.python.org/pypi/leappto without giving a version number, so you have to pick a version number of some sort and you're essentially (as far as PyPI is concerned) making a release.

Thus if that is a use case we genuinely want to support (and I'm not sure that it is, but I'm not opposed to it), we probably want to bake that in as part of the data model itself, rather than hijacking the concept of making a release. Possibly with a name like twine reserve or something. Getting into the weeds about that is probably not something for this issue though, and in general I am pro removing the concept of twine register from the beginner focused guides.

@ncoghlan
Member

@dstufft I think the current registration API is deeply flawed (specifically due to the ability to make the last registered metadata differ from the last actual release), and am a fan of it ultimately going away.

The specific aspects of the current approach to reaching that goal that I don't like are:

  1. the hard compatibility break. An endpoint that silently did nothing would be better than an error in this particular case, since it would preserve API compatibility for any documentation and client automation that immediately proceeded from registration to uploading a release.
  2. the complete loss of pre-registration support, which will only serve to make well-intentioned short term name squatting (i.e. "I'm going to publish something soon, I'm just locking the name to prevent last minute disasters while I'm getting ready for the first release") even harder to distinguish from indefinite name squatting (i.e. "I'm holding on to this name to keep anyone else from using it, even though I have no immediate plans for it")

Of those two, it's really the first one that's at issue in this particular thread - it's a clear barrier to the Warehouse migration, because it breaks things, starting with the User Guide. Making it a silently successful operation, rather than the current noisy failure, would be sufficient to resolve that.

For the latter point, pre-registration already has two relatively easy workarounds in either uploading a 0.0.1.dev0 sdist or else using the legacy PyPI service (at least prior to the migration), so adding that capability back in a new form can be tackled as a separate Warehouse RFE for an updated pre-registration endpoint that doesn't have the problems of the current approach, which can in turn be made conditional on other related enhancements (like the automated garbage collector for provisional registrations).

@dstufft
Member

dstufft commented Apr 13, 2017

I am massively -1 on an API that silently does nothing. If someone is relying on that behavior for something beyond the typical twine register && twine upload then they are going to have Warehouse lie to them and claim it did the thing they asked for, when it really didn't. This should either be an error or it should work, silently doing nothing is not a reasonable option in my opinion.

Probably the most useful thing to do for (1) is to update legacy so it doesn't ever require a twine register step either, so all instructions and guides can be updated to never mention it.

@ncoghlan
Member

In this particular case, the only known use case for register-without-upload is name squatting (hence why it's OK to drop the API in the first place), and I'm entirely OK with silently breaking automated name squatting scripts rather than noisily breaking them :)

Non-automated name squatting will have a human checking PyPI and going "Hey, why didn't my name get registered?" and people will hopefully eventually find this PR and the Warehouse issue at pypi/warehouse#1627 and figure out what is going on.

Adding support for implicit registration to the legacy PyPI service would certainly be an acceptable alternative approach, but it seems like a lot of work for the sake of providing an easier to debug error for a use case that we don't really care all that much about breaking (at least in its current form).

@ncoghlan
Member

In pypi/warehouse#1627 (comment) I noted that even if the Warehouse server were to change to claim that everything is fine when accessing the legacy registration API, the twine register command could gain a user visible warning that it doesn't actually do anything when used with the pypi.org service's emulation of the legacy API.

@theacodes
Member

theacodes commented Apr 18, 2017

This conversation is really useful, but I want to bring something actionable back to this particular PR.

Based on my testing in a fresh Debian install with the latest twine (1.8.1), twine upload dist/* is enough to register the project without any other setup (other than having a PyPI account):

(env)root@bdc012554861:/sampleproject# twine upload dist/*
Uploading distributions to https://upload.pypi.org/legacy/
Enter your username: jonparrott
Enter your password:
Uploading jonparrott-twine-test-project-1.2.0.tar.gz
[================================] 8956/8956 - 00:00:01

Twine automatically uses the new https://upload.pypi.org/legacy/ by default. So I'm going to say let's remove the register step and go with just twine upload. Anyone opposed? If not, I'm gonna update this PR myself to do just that. We can file feature request bugs for other stuff like mentioning keyring support (when it's available) and testpypi.
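
For anyone scripting the single-step workflow, here is a small Python sketch that assembles the same twine upload command shown in the session above. The helper name build_upload_command is hypothetical; --repository-url is twine's option for overriding the upload URL and is only added when a URL is passed in. The sketch builds the argument list and leaves running it to the caller.

```python
import glob
import os


def build_upload_command(dist_dir="dist", repository_url=None):
    """Build the argv for a single-step `twine upload` of everything in dist_dir."""
    files = sorted(glob.glob(os.path.join(dist_dir, "*")))
    if not files:
        raise FileNotFoundError("no distributions found in " + dist_dir)
    command = ["twine", "upload"]
    if repository_url is not None:
        # Only override the endpoint when asked; otherwise twine's default applies.
        command += ["--repository-url", repository_url]
    return command + files
```

The resulting list can be handed to subprocess.run(..., check=True) from a directory that contains built distributions; there is no separate register call anywhere in the flow.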

@theacodes
Member

@ncoghlan @dstufft @brettcannon @pfmoore this PR has been updated. Please take a look. I'll file bugs for remaining issues that occurred in the conversation above once this is merged.

@theacodes theacodes requested a review from ncoghlan April 18, 2017 21:00
@theacodes
Member

theacodes commented Apr 18, 2017

(FYI: I took the "minimally invasive" approach here. I tried not to "remove" information such as the gpg stuff which I think deserves a separate tutorial. I really want to revisit and streamline this doc, but I want to get all of the outstanding PRs merged/closed and bugs triaged before I start making big changes)

@pfmoore
Member

pfmoore commented Apr 18, 2017

LGTM. And I agree with your approach of getting the basics sorted out first, then working on any refinements in subsequent PRs.

@theacodes
Member

Merged with @pfmoore's approval. If anyone else has concerns, I'm happy to address them in follow-up PRs. :)

(Also, if anyone disagrees with my methodologies in terms of PRs and bugs, holler. I'm adjustable, just trying to keep things moving.)
