Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[baseline practices] .npmignore or package.json files #164

Closed
jchip opened this issue Feb 14, 2019 · 54 comments
Closed

[baseline practices] .npmignore or package.json files #164

jchip opened this issue Feb 14, 2019 · 54 comments
Labels
article Need to spread this information package-maintenance-agenda Agenda items for package-maintenance team

Comments

@jchip
Copy link
Contributor

jchip commented Feb 14, 2019

Do we have something for specifying the included files for publishing yet?

I just happen to look through some of the files in node_modules for one of my apps and this is what I found:

  • coverage, .nyc_output
  • .idea
  • .eslintrc, .travis.yml, etc
  • quite a few packages include their full test files.
  • also for packages written in ts, some are including the TS source, is that needed?

Kind of related to #77

We should have some thing that suggest ensuring files exist in package.json and include essential files for publishing.

@ljharb
Copy link
Member

ljharb commented Feb 14, 2019

First of all, "files" is super dangerous - forgetting to add a new file causes breakage. Nobody should ever use it.

.npmignore is fine, but also, imo npm explore foo && npm install && npm test should always work - i consider it a best practice to always ship full tests with every package.

@jchip
Copy link
Contributor Author

jchip commented Feb 15, 2019

I have no opinion about .npmignore or files except .npmignore should be in .npmignore itself. We should have some mention about being mindful to publish without extra files and npm offers two ways.

I approach this from practical stand point and prefer my node_modules smaller. I think publishing packages without extra files is one of the most overlooked thing and I've installed packages that publish demo images or temp files in the 10s of MB size. I think if there are a lot of tests it's better to leave them, but it's up to individual package owner. The practice I follow is prepublishOnly runs npm test.

@ljharb
Copy link
Member

ljharb commented Feb 15, 2019

The cost of publishing an extra file is slower install times; the cost of omitting a needed file is breakage. There’s no semver contract around package sizes to break.

@jchip
Copy link
Contributor Author

jchip commented Feb 15, 2019

as a user, I prefer both: faster install and no breakage.

@ljharb
Copy link
Member

ljharb commented Feb 15, 2019

In the better case, sure! But in the worse case, which would you prefer, a fast broken install, or a slow working one?

@jchip
Copy link
Contributor Author

jchip commented Feb 15, 2019

I'd like to clarify what you actually have issue with.

It's clear you are against files and that test should be published. I am sure we can word it to your satisfaction.

Do you actually have issue with the fact that we should have guideline about the broader topic I raised?

@ljharb
Copy link
Member

ljharb commented Feb 15, 2019

In general, yes - there are very few files I think should ever be excluded from the published tarball, and I'd be concerned that we'd be unable to come up with guidance that wouldn't over-weight "package size" as a consideration.

In other words, I don't think that package authors should be solving this problem - I think npm should (and, with their "tink" project, they are).

@jchip
Copy link
Contributor Author

jchip commented Feb 15, 2019

OK, let me clarify. The primary motivation for bringing this up is because I don't think files like these should be published:

  • .DS_Store, .nyc_output, coverage, .idea, .vscode

I don't think tink is solving that.

npm could default to ignore those files, but I imagine that's not something they want to do either.

I think having a guideline regarding be mindful of what files you published make sense.

@ljharb
Copy link
Member

ljharb commented Feb 15, 2019

oh sure. But those should all be gitignored - which makes them ignored by npm by default.

@dominykas
Copy link
Member

Depending on how you structure your project, you can have files be very safe. Basically, put stuff in lib/ and bin/ and you're grand. And docs/ as well.

Re tests - that's very debatable. I wish there was tooling to easily "switch to use this deep dep to be installed from their git" so that you can get tests, but shipping tests together with the package is an antipattern in my eyes, partly because it complicates security efforts as there is more code to review/analyse - and that is code that's never going to be used in production (yes, it's a tradeoff between dev needs and prod needs). Keep in mind, that the payload for event-stream was hidden inside tests... Either way, because it's debatable, I agree that we can't say that it's a "best practice".

Re other files - as a random one-off contributor I find it very uncomfortable when I need to add .idea to someone's .gitignore... and I know there's people who don't like it... so having that as a baseline would be very nice, but I'm not sure that's going to fly, because the list of common files is very large and it bloats the .gitignores. Instead, I'd really like it if npm actually respected a global .gitignore as well as the local one... There's an issue open for years.

@mcollina
Copy link
Member

I think @styfle should join this discussion. Mainly because we sit on the opposite side of this fence: he wrote https://packagephobia.now.sh/ and I'm a strict believer that sources, docs, tests and examples are key part of a package. Specifically, I think I should be able to build/test that package without having to open up git. I'm a huge fan of "show source feature" of the 90s web.

Reducing the size of packages is important to reduce the size of the artifacts that we deploy. I think we should build a tool that traverse node_modules and remove not-needed files for production (tests, docs). This would be very nice to build as part of this group.

@styfle
Copy link
Member

styfle commented Feb 15, 2019

packages written in ts, some are including the TS source, is that needed?

No, it is not needed. You can publish a just the type declaration file (think header file if you use C++) which is perfect for anyone installing your package.

i consider it a best practice to always ship full tests with every package.

I disagree. I think npm publish should be the minimum build artifacts only. When I run npm install yourpackage, I can't even run your tests because npm only installs the production dependencies, not the devDependencies from yourpackage. And I can tell you from experience, that most people who publish tests to npm do so by accident, not on purpose. That's why I built Package Phobia.

I'm a huge fan of "show source feature" of the 90s web.

Met too! But that's not realistic because people who are publishing to npm are not necessarily writing JavaScript. They might be writing Rust and compiling to WASM and then deploying the WASM to npm. They don't need to deploy the Rust source code, nor do I want them too. Their tests are probably written in Rust too.

Probably a better approach to this "show the source feature" is to somehow enforce a build step when publishing to npm so that the source is guaranteed to match the build artifacts. Something like ZEIT Now Builders. But that's problem/topic for a different thread.

Depending on how you structure your project, you can have files be very safe.

I like files because I can just add files: ["dist"] and only the compiled output is going to be published. However, there is a known bug that is only affecting files but not .npmignore filtering. https://npm.community/t/ds-store-files-show-up-after-npm-publish/831

@ljharb
Copy link
Member

ljharb commented Feb 15, 2019

npm’s tink project will allow you to publish gigabytes if you like, but consumers will only install what they need - so this “phobia” will go away soon anyways, without forcing everyone to hamstring offline usage.

Again, even if you want to restrict what shows up on npm, the “files” field is very dangerous, and i do not want this project to endorse its use in any way.

@jchip
Copy link
Contributor Author

jchip commented Feb 16, 2019

While a few things are debatable with opinions on either sides of the fence, I hope we can agree on the basic things that a guideline about having the proper setup to ensure some files are not published is a good practice.

The guideline should only refer to the methods npm provides to do this, and avoid having any preferences on files or .npmignore.

Maybe note some gotchas and tips:

  • the existence of .npmignore makes npm ignore .gitignore, which is why some package is publishing .nyc_output and coverage, like this https://github.com/nodejitsu/node-http-proxy
  • using .npmignore would publish any unintended files left over, for example temp files.
  • files has a bug https://npm.community/t/ds-store-files-show-up-after-npm-publish/831
  • files ensure no unintended files get publish accidentally, but it could be dangerous, but following a well defined directory structure makes that less of an issue
  • publishing test is potentially a security attack vector, ie: incident in event-stream.

Other than that, I think we would leave the choice of whether to publish things like docs and test, or other files, to the individual package owner and their preferences.

@jchip
Copy link
Contributor Author

jchip commented Feb 16, 2019

also suggest to run a npm pack to inspect the tgz files content before doing a real publishing. I think there are tools out there that makes the publish process better. Maybe we can suggest one.

@ljharb
Copy link
Member

ljharb commented Feb 16, 2019

If you have 2FA enabled, npm itself now shows you a preview of the file list before actually publishing.

@mhdawson mhdawson added the package-maintenance-agenda Agenda items for package-maintenance team label Feb 25, 2019
@mhdawson
Copy link
Member

Some existing guidance: https://docs.npmjs.com/misc/developers

@bnb
Copy link
Contributor

bnb commented Mar 25, 2019

Nobody was in today's meeting that had context on it. Pushed to the next meeting, hopefully someone with context can join the next meeting 👍

@sam-github
Copy link

Personally, I am a fan of not including extraneous source in npm published packages, and having well structured git tags on the source repo, so the corresponding source is easy to fetch. I understand the opposite point of view, there are some pros and cons both ways, but I'm not sure there is package size and install time is an issue, and some packages have incredibly large sets of test data, I've been pretty astonished and the kinds of stuff that shows up in packages when exploring node_modules size on disk. Install time can be pretty brutal, too. These are pains that I feel often, and I can honestly say I've never wanted to run one of my dep's tests unless I was intending to modify them -- in which case I checkout the repo. I suspect many packages do publish their tests... but I also suspect that isn't a conscious choice, its because it all gets published by default.

Perhaps the guidance should be clear that if tests are published, they should be runable from a package install after an install of their dev deps, and that if tests include 100s of MB of test data, consider not publishing them.

@ljharb
Copy link
Member

ljharb commented May 1, 2019

@sam-github although i disagree, that’s fine - let’s just advocate for npmignore in that case and not “files”.

I agree with your last paragraph regardless.

@sam-github
Copy link

Yes, I agree on .npmignore, its more robust than files.

Its particularly important to have because with transpiling, .npmignore defaults to .gitignore, which is the right choice for npm, but the wrong choice for anyone working with transpilers, because transpiled output should be git ignored, but MUST be packaged, and won't be by default.

I think there are areas where there is consensus (but still a few will disagree with the consensus viewpoint), and areas where there is not (yet?) consensus, and including tests is the latter category.

Even with disagreements on what to do, I think its possible for the guidelines to lay out the pros and cons. Its more satisfying to tell people "this is the right way, do it" than "there is disagreement on this", but even in areas of disagreement we can give some information. Perhaps particularly in areas of disagreement we should give information, because it might be harder for people to figure out why there isn't agreement yet, and waste time surfing blogs to understand the different points of view.

@ghinks
Copy link
Contributor

ghinks commented Jul 19, 2019

I am going to go back to the original intent for this issue as raised by @jchip which is advice on which files should be ignored. I will open separate issues for the discussion points

  • github repos can be deleted
  • final bundle size impact != package size
  • tooling
  • minimalist vs fully inclusive

these items I summarized above and we shall come back to them but Joel’s initial intent was just to get a list of regular files such as .vscode and log files etc that should be ignored. My intent is to look at the existing github git ignore repo and see what additional files should be ignored. It may even be a good idea to raise a PR against the node gitignore that is being used.

@craftninja
Copy link
Contributor

From meeting: Maybe we give people pros and cons of choices, multiple best practices (plural) - offer different perspectives of different solutions. A blog post and updates to NPM docs? The blog post asks the question, doesn't necessarily provide The Solution. We could outline several use cases and examples for different paths.

@ghinks
Copy link
Contributor

ghinks commented Jul 31, 2019

agreed @craftninja . I am looking at using the gitignore.git from http://github.com/github/gitignore as a base ignore with modifications and adding, as you said a number of options, I am going off grid for 10 days though.

@wesleytodd
Copy link
Member

Since you mentioned http://github.com/github/gitignore maybe I can plug this module I maintain.

https://www.npmjs.com/package/create-git

The module uses those gitignore templates and lets you select multiple. If we want to make a "best practice" for .gitignore in node packages I agree we should PR there, and then I can add it as the default for this package. This might be the start of the "tooling" we want for this as well.

@lholmquist
Copy link
Contributor

agree with the blog post showing the different solutions. Showing the pros and cons of each solution is a good compromise on the different schools of thought.

while i don't like to compare node modules to maven packages, there is something nice about being able to "go download the source" of a particular package, at least in development

@mhdawson
Copy link
Member

From discussion in the meeting this week, sound like the next steps might be a PR for the guidance and then a blog post to publicize but the guidance might be more around discussing the options versus making a specific recommendation.

@ahmadnassri
Copy link
Contributor

I like the blog post idea, and would support it further in hosting (or cross-posting) it on the official npm blog as a guest post by whoever wants to author it.

@mhdawson
Copy link
Member

@jchip will you have time to put together a PR, authoring a blog post?

@jchip
Copy link
Contributor Author

jchip commented Aug 15, 2019

@mhdawson yes, I can work on it next week (end of summer, which was usually busy for me with kids).

@mhdawson
Copy link
Member

mhdawson commented Nov 5, 2019

Close as PR has landed

@mhdawson mhdawson closed this as completed Nov 5, 2019
@styfle
Copy link
Member

styfle commented Nov 5, 2019

@mhdawson Can you link to the PR and the final web page?

@Eomm
Copy link
Member

Eomm commented Nov 5, 2019

there are 2 PRs about this issue: #178 #243

and the final files are:
https://github.com/nodejs/package-maintenance/blob/master/docs/drafts/PUBLISH-GUIDELINES.md
https://github.com/nodejs/package-maintenance/blob/master/docs/drafts/PUBLISH-BLOG.md

but I'm not sure the blog post has been posted jet

@GrosSacASac
Copy link

When I use an external dependencies I like it to be as small as possible. I want to see more I can always use npm info ws repository.url. But there are problems when git and npm are out of sync, so at the end including everything in the npm package is also ok.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
article Need to spread this information package-maintenance-agenda Agenda items for package-maintenance team
Projects
None yet
Development

No branches or pull requests