Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Inline unicode regex and remove unicode-10.0.0 package (saves ~146MB from the project size) #323

Closed
wants to merge 2 commits into from

Conversation

aral
Copy link

@aral aral commented May 23, 2018

I was testing Dexie.js in Node this week (background/use case) when I noticed the size of my working directory shoot up by ~180MB following the installation of IndexedDBShim. It looks like ~146MB of that size is the unicode-10.0.0 module.

As far as I can see, the only bit of that module that is being used is a single regular expression in the file https://github.com/mathiasbynens/unicode-10.0.0/edit/master/Binary_Property/Expands_On_NFD/regex.js, which is 2KB.

In this pull request, I remove the unicode-10.0.0 dependency and inline the regular expression in util.js to shave ~146MB from the dependencies (not to mention lots of bandwidth saved for developers everywhere) ;)

All Node tests are passing but I cannot verify that this does not have side effects in browsers as I get ~23 errors on master with the mocha tests in browser (testing in latest Safari).

As far as I can see, there should be no side effects.

@aral aral changed the title Inline Unicode package (saves ~146MB from the project size) Inline unicode regex and remove unicode-10.0.0 package (saves ~146MB from the project size) May 23, 2018
@brettz9
Copy link
Collaborator

brettz9 commented May 25, 2018

Thanks--that'd be a helpful change, and you have some good questions which I hope to get to shortly after finishing up on some other work.

I think I'd prefer here though, that we keep the dependency, but change it to a dev dependency, and have a copy routine which grabs the file we need (and then import that file copy). I feel that process more clearly outlines where we are getting the data, and allows us to update if there are fixes, and to swap the version when Unicode updates.

@aral
Copy link
Author

aral commented May 25, 2018

@brettz9 Changing it to a dev dependency would remove the bandwidth burden from those using the library (for the vast majority of the use case), so I’d be happy with that for the reasons you mentioned.

Contributors would still have ~146MB more to download than before, however.

I hear your point about having the full package helping to (a) better reflect intent (b) make updates easier.

I wonder if those two goals might not also be met by:

  1. A good comment in the source:
// This regular expression is from the Unicode-10.0.0 package
// and used for XXX purpose.
  1. Manual updates of the line based on Unicode version updates (since 2012, this has been one update a year).

There’s just something about including a 146MB module for 2KB of code that I can’t seem to stomach :)

Your choice, of course. And, for my specific use case, I’m happy with the Unicode package being made a dev dependency (although I’d be happier if we shaved that 146MB off for everyone, including contributors) :)

Hope you’ve had a great week and here’s wishing you a lovely weekend :)

@brettz9 brettz9 closed this in af82f8f May 28, 2018
@brettz9
Copy link
Collaborator

brettz9 commented May 28, 2018

Hi Aral,

Apologies for the delayed reply, and I should say it is particularly encouraging when someone is looking into the testing side.

While "We're already burdening users with XYZ" is not usually a good excuse, if someone is recursively cloning web-platform-tests--as all testers really should because those are the only tests that are still failing, there is indeed already a lot for testers to download.

Besides this, I went ahead with the devDep inclusion for the additional reason that it minimizes the verification/trust needed. If someone trusts npm and has verified or has trust in the Unicode package (and runs the new copy script), there is no need for them to trust whether our regular expression is a faithful copy.

Btw, off topic... With the changes I made by having an interim include file, we are now no longer hard-coding the version of Unicode that we are including, so in the future, we should only need to update package.json and the copy script rather than messing with the source.

Thanks for the suggestions, the helpful reports, and look forward to any further contributions!

@aral
Copy link
Author

aral commented May 30, 2018

@brettz9 Hey Brett, thanks for the update. I understand your reasons – as I mentioned earlier, as long as end-users do not have to download the extra 146MB, I’m happy. Your solution and reason are logical.

Is this change in a local branch at the moment? I can’t seem to find it here.

@brettz9
Copy link
Collaborator

brettz9 commented May 31, 2018

I had been missing a 3.7.0 tag now added, but it had already been published to master and to npm.

The regex file is copied into src (see https://github.com/axemclion/IndexedDBShim/blob/master/src/unicode-regex.js ) via the Grunt "copy" task at https://github.com/axemclion/IndexedDBShim/blob/master/Gruntfile.js#L414-L418 . That copy task can be invoked from npm via the "copy" script: https://github.com/axemclion/IndexedDBShim/blob/master/package.json#L27 .

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants