Compression #4718

Open: wants to merge 10 commits into base: master

xmaysonnave
Contributor

Motivation

IPFS with TiddlyWiki uploads content to IPFS nodes.
The IPFS gateways from which content is retrieved are set up differently:
1 - The ipfs.infura.io proxy server does not send compressed content to the client.
2 - The gateway.ipfs.io proxy server sends compressed content to the client.
3 - Uploading compressed content simply does not work, as browsers do not receive the proper response header.
4 - I opened a feature request with the IPFS server team so that the gateway returns the correct response header:
ipfs/kubo#7268
However, the consensus in the IPFS community is that compressed content should be handled at the application level:
ipfs/notes#324

Artifacts

  1. bin/build-site.sh
  2. boot/boot.js
  3. boot/pako.min.js
  4. boot/pako.min.js.meta
  5. boot/sjcl.js.meta
  6. core/modules/commands/compress.js
  7. core/modules/commands/savewikifolder.js
  8. core/modules/startup/compress.js
  9. core/modules/widgets/compress.js
  10. core/pako-license.tid
  11. core/templates/store.area.template.html.tid
  12. editions/empty/tiddlywiki.info
  13. editions/full/tiddlywiki.info
  14. editions/tw5.com/tiddlywiki.info
  15. plugins/tiddlywiki/innerwiki/template.tid
  16. plugins/tiddlywiki/share/exclusions.tid

Implementation

The $:/core/templates/store.area.template.html template handles a new compressedStoreArea.
The compress widget converts the content to JSON and calls $tw.compress.deflate, located in boot.js.
The deflate function compresses the content and converts the resulting Uint8Array to base64.
The compress widget then encrypts the result, if applicable.
Finally, the content is wrapped in JSON.
The JSON signature is useful for determining whether JSON content is compressed or encrypted.
IPFS with TiddlyWiki uses this feature to check whether an attachment or an import coming from IPFS is encrypted (compression is not implemented yet).
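
The pipeline described above (JSON, deflate, base64, JSON envelope) can be sketched roughly as follows. This is only an illustration under stated assumptions: Node's built-in zlib stands in for pako, the function names packStore and unpackStore and the envelope field names are hypothetical and are not taken from the PR, and the optional encryption step is left out.

```javascript
// Sketch only: Node's zlib stands in for pako; the envelope field names
// ("compressed", "version") are hypothetical, not the PR's actual format.
const zlib = require('zlib');

// Tiddlers -> JSON -> deflate -> base64 -> JSON envelope
function packStore(tiddlers) {
  const json = JSON.stringify(tiddlers);
  const deflated = zlib.deflateSync(Buffer.from(json, 'utf8'));
  return JSON.stringify({ compressed: deflated.toString('base64'), version: 1 });
}

// JSON envelope -> base64 -> inflate -> JSON -> tiddlers
function unpackStore(envelopeText) {
  const envelope = JSON.parse(envelopeText);
  const inflated = zlib.inflateSync(Buffer.from(envelope.compressed, 'base64'));
  return JSON.parse(inflated.toString('utf8'));
}

// Repetitive sample content, so that deflate has something to gain
const sample = [{ title: 'HelloThere', text: 'Hello '.repeat(200) }];
const packed = packStore(sample);
console.log('plain JSON length:', JSON.stringify(sample).length);
console.log('packed length:', packed.length);
```

Note that base64 adds roughly a third to the size of the deflated bytes, which is part of why an in-file compressed store is larger than the same content gzipped by a web server.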

Open Questions

  1. Compression and encryption happen before any tiddlers have been loaded.
    The hook "th-boot-tiddlers-loaded" is called during $tw.boot.startup.
    A mechanism to extend the $tw.boot.boot function could be interesting if plugin developers wish to use another encryption/compression strategy without modifying the core sequence.
    It probably means having a special HTML container that loads the minimal JavaScript logic, which could then override the compression/encryption mechanism with other strategies.
  2. While developing IPFS with TiddlyWiki I faced an issue with encrypted attachments (base64).

Results

This PR is able to generate a few samples:

  1. editions/full
  • output/editions/full/index.html -> 10.8MB
  • output/editions/full/index-compressed.html -> 3.0MB
  • output/editions/full/index-compressed-and-encrypted.html -> 4.0MB
  • output/editions/full/index-encrypted.html -> 13MB
  2. editions/empty
  • output/empty.html -> 2.3MB
  • output/empty-compressed.html -> 611.3kB
  • output/empty-compressed-and-encrypted.html -> 758.8kB
  • output/empty-encrypted.html -> 2.8MB
  3. editions/tw5.com
  • output/index.html -> 5.7MB
  • output/index-compressed.html -> 3.7MB
  • output/index-compressed-and-encrypted -> 4.2MB
  • output/index-encrypted.html -> 8.1MB

Some benchmarking:

  1. On my laptop (Core i5-7300U)
  • Compressed wiki (2.8MB) inflated in the browser: 432ms
  • Wiki deflated in Node: 431ms
  2. On my mobile (Redmi Note 7 Pro)
  • Inflating can take up to 700ms, while deflating can take up to 1400ms
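
Timings of this kind can be reproduced with a small harness such as the sketch below. It is an assumption-laden illustration: it uses Node's built-in zlib rather than pako, the timeMs helper is hypothetical, and the synthetic payload is not a real wiki, so the numbers it prints are not comparable to the figures above.

```javascript
// Sketch only: measures deflate/inflate wall-clock time with Node's zlib.
const zlib = require('zlib');

// Run fn once and report its result together with the elapsed milliseconds
function timeMs(fn) {
  const start = process.hrtime.bigint();
  const result = fn();
  const elapsedMs = Number(process.hrtime.bigint() - start) / 1e6;
  return { result, elapsedMs };
}

// Synthetic payload: 5000 small tiddler-like records serialised as JSON
const records = Array.from({ length: 5000 }, (_, i) => ({
  title: 'Tiddler ' + i,
  text: 'Some wiki text '.repeat(20)
}));
const payload = Buffer.from(JSON.stringify(records), 'utf8');

const deflated = timeMs(() => zlib.deflateSync(payload));
const inflated = timeMs(() => zlib.inflateSync(deflated.result));

console.log(`input ${payload.length} B, deflated ${deflated.result.length} B`);
console.log(`deflate ${deflated.elapsedMs.toFixed(1)} ms, inflate ${inflated.elapsedMs.toFixed(1)} ms`);
```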

Remarks

  1. Compression is not activated by default, which ensures backward compatibility.
  2. The pako library is small (46K), popular and maintained.
  3. The /boot/pako.min.js.meta has three fields:
    3.1 _license_uri, the URL of the license
    3.2 _project_uri, the project URL
    3.3 _source_uri, where the library was downloaded from, with its version
  4. The /boot/sjcl.js.meta has been modified to expose the same fields.

@Jermolene
Member

Hi @xmaysonnave apologies for the delayed response. I have some concerns about including support for compression in the core:

  • For many use cases, it is simpler and less brittle to use external compression mechanisms (e.g. hosting TW HTML files on a web server that supports GZip, or an enhanced TiddlySpot saver that applies GZip compression)
  • Adding a new boot-time component makes the core and the build process significantly more complicated for all users
  • The support in this PR is still fairly incomplete (for example, there is no support for importing from a compressed wiki)

So, I'd like to explore how we could add the necessary hooks to make it possible to implement this compression mechanism as a plugin. The plugin could include a raw-markup tiddler that reads the compressed store area and loads the tiddlers into the $tw.preloadTiddlers array.
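
A rough sketch of what such a plugin-side loader might do, under several assumptions: the store text is taken to be a base64-encoded, deflate-compressed JSON array of tiddler fields; the function name preloadCompressedStore is hypothetical; and the inflate function is passed in so that pako (browser) or zlib (Node) can be plugged in. The demo runs in Node with a stand-in $tw object; a browser version would need atob and a Uint8Array in place of Buffer.

```javascript
// Sketch only: store-area format and function names are assumptions.
const zlib = require('zlib');

// Decode a compressed store and queue its tiddlers for the boot process
// by appending their fields to the $tw.preloadTiddlers array.
function preloadCompressedStore($tw, base64Text, inflate) {
  const bytes = Buffer.from(base64Text, 'base64');
  const tiddlers = JSON.parse(Buffer.from(inflate(bytes)).toString('utf8'));
  $tw.preloadTiddlers = $tw.preloadTiddlers || [];
  for (const fields of tiddlers) {
    $tw.preloadTiddlers.push(fields);
  }
  return tiddlers.length;
}

// Demo with a stand-in $tw object and Node's zlib in place of pako
const stubTw = Object.create(null);
const storeText = zlib.deflateSync(JSON.stringify([
  { title: 'A', text: 'first' },
  { title: 'B', text: 'second' }
])).toString('base64');

const count = preloadCompressedStore(stubTw, storeText, zlib.inflateSync);
console.log(count, stubTw.preloadTiddlers.map((t) => t.title));
```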

@xmaysonnave
Contributor Author

1 - Your first remark is interesting, as I asked the go-ipfs maintainers to enhance the HTTP header when compressed content has been uploaded. However, their consensus is that compression should be handled at the application level rather than the server level. That was one of my motivations for handling compression this way.
ipfs/kubo#7268

2 - The support in this PR is complete: $tw.boot.boot instantiates the $tw.util.Compress provided in the boot.js patch.
Node is able to compress; browsers are able to inflate and deflate.

3 - I am not sure a plugin could handle the situation. It would imply that the plugin itself is not compressed. The approach here is the same as the current encryption mechanism: compression occurs before any TiddlyWiki logic is loaded. AFAIK only libraryModules, bootKernelPrefix and bootKernel are loaded to handle the situation. I should say that at the moment I compress and encrypt a complete wiki with Ethereum keys, which means that extensibility at the boot level could be a real plus. My current approach is the following:

3.1 Node
I have an ipfs-tiddlywiki.js that is able to compress and encrypt.

var $tw = require('tiddlywiki').TiddlyWiki()
// Extend the standard instance with the IPFS boot logic (no need to redeclare $tw)
$tw = require('./src/boot/ipfs-boot.js').TiddlyWiki($tw)

// Pass the command line arguments to the boot kernel
$tw.boot.argv = Array.prototype.slice.call(process.argv, 2)

// Boot the TW5 app
$tw.boot.boot()

ipfs-boot contains the extra logic that overrides crypto and boot.

3.2 Browser
Here the situation is trickier.

3.2.1 The first, dirty solution to get it working was to copy a patched boot.js into node_modules/tiddlywiki/boot and let TiddlyWiki do the work. Node and the browser were both tricked into using the patched boot.js.

3.2.2 The second solution was to find a way to extend the boot sequence without touching boot.js. Currently I load a library module that suppresses the boot, before bootKernelPrefix and bootKernel.

var ipfsBootPrefix = function ($tw) {
  /*jslint node: true, browser: true */
  'use strict'

  $tw = $tw || Object.create(null)
  $tw.boot = $tw.boot || Object.create(null)
  $tw.boot.suppressBoot = true

  return $tw
}

if (typeof exports === 'undefined') {
  // Set up $tw global for the browser
  window.$tw = ipfsBootPrefix(window.$tw)
} else {
  // Export functionality as a module
  exports.ipfsBootPrefix = ipfsBootPrefix
}

Then I modified tiddlywiki5.html.tid to load the real boot after the bootKernel.

<!--~~ Boot kernel ~~-->
<div id="bootKernel" style="display:none;">
`{{ $:/boot/boot.js ||$:/core/templates/javascript-tiddler}}`
</div>
<!--~~ Ipfs Boot kernel ~~-->
<div id="ipfsBootKernel" style="display:none;">
`{{ $:/boot/ipfs-boot.js ||$:/core/templates/javascript-tiddler}}`
</div>

I guess you get the overall idea. However, I'm not completely satisfied with this approach. The $:/core/templates/tiddlywiki5.html template is not extensible and needs to be patched to load the ipfsBootKernel. The $:/core/templates/tiddlywiki5-external-js.html template seems to be a better approach, as my plugin can update $:/core/templates/tiddlywiki5.js.

Thanks

@pmario
Member

pmario commented Aug 29, 2020

@xmaysonnave

2 - The support in this PR is complete the $tw.boot.boot instantiates the $tw.util.Compress provided in the boot.js patch.
Node is able to compress, browser are able to inflate and deflate.

Did you test the following workflow?

  • Compress test.html
  • Open it in the browser
  • Create abc.html that contains tiddler A and tiddler B
  • Save it compressed
  • Now open the file explorer
  • Drag & drop the compressed abc.html into test.html in the browser to import it

The expected result is that there should be an Import tiddler with A and B, which should be importable.

The same needs to be true if abc.html is compressed and encrypted. The import mechanism still has to work.
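
For the import path to work, the importer first has to recognise what kind of store it has been handed. A minimal sketch of such a check is shown below. The function name classifyStoreText and the "encrypted" and "compressed" field names are hypothetical placeholders, not the PR's actual JSON signature.

```javascript
// Sketch only: the "encrypted" and "compressed" field names are hypothetical.
// Returns 'encrypted', 'compressed' or 'plain' for a store-area text.
function classifyStoreText(text) {
  let parsed;
  try {
    parsed = JSON.parse(text);
  } catch (err) {
    return 'plain'; // not JSON at all
  }
  if (parsed === null || typeof parsed !== 'object' || Array.isArray(parsed)) {
    return 'plain'; // e.g. a bare array of tiddlers
  }
  if (typeof parsed.encrypted === 'string') return 'encrypted';
  if (typeof parsed.compressed === 'string') return 'compressed';
  return 'plain';
}

console.log(classifyStoreText('{"compressed":"eJyrVkrLz1eyUkpKLFKqBQA="}'));
console.log(classifyStoreText('{"encrypted":"..."}'));
console.log(classifyStoreText('[{"title":"A"}]'));
```

Since the PR compresses first and encrypts afterwards, an importer following this idea would decrypt an 'encrypted' store and then classify the decrypted text again before inflating it.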

@xmaysonnave
Contributor Author

Thanks for highlighting this workflow. I didn't know that it was possible. I'll take a look.

@pmario
Member

pmario commented Aug 29, 2020

I did read the whole thread at: ipfs/kubo#7268 and I think similar reasoning applies to the TW core.

TW core already contains a jszip library. ... But I think it's not usable for your use case. It wasn't implemented that way. You need "boot level" decompression and decryption.

It seems "pako" is well maintained and reasonably small. ... but it will add a new dependency to the TW core that can't be changed anymore. ... Something that is part of the core is really hard to remove again.

Jeremy is "super picky" (in a good way) about backwards compatibility and it's almost impossible to deprecate core functions.


Jeremy wrote:

So, I'd like to explore how we could add the necessary hooks to make it possible to implement this compression mechanism as a plugin. The plugin could include a raw-markup tiddler that reads the compressed store area and loads the tiddlers into the $tw.preloadTiddlers array.

I'd definitely want to go that route. ... It will open up the possibility of using different compression and decompression methods, depending on the user's use case and taste.

E.g. your tests showed that empty-compressed is about 612kByte. That's much better than 2.3MByte, but I'd expect something like 370kByte, as shown in the dev tools if you load from https://tiddlywiki.com/empty.html, where the server does the compression.

That's just an example and no offence intended.


It seems there is a possibility that you didn't consider yet: the "external-js" configuration to load and save TWs. It may be a suitable option for your use case.

see: https://github.com/Jermolene/TiddlyWiki5/blob/master/editions/tw5.com/tiddlywiki.info#L54

It saves an "index.html" and a "tiddlywiki5.js" file. ... The problem is that there are some bugs in the current implementation. ... But the concept should be worth a look.

An empty index.html is about 80kByte
tiddlywiki5.js is about 2MByte.

So with IPFS it should be possible to get a unique and "stable" address for tiddlywiki5.js, so it would be stored only once. Right?

index.html should be able to dynamically load this "core". It would be a "storage container", that only contains the content. This content can be encrypted before written to the ipfs store and decrypted after the core has been activated.

So IPFS disk consumption would be much smaller. Since the content is encrypted client side, IPFS deduplication can't be used for content anyway.
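
The deduplication argument can be illustrated with a simplified sketch. The assumptions are significant: a plain SHA-256 hex digest stands in for an IPFS content identifier (real CIDs involve chunking and multihash encoding), AES-256-GCM with a random IV stands in for whatever client-side encryption is used, and the helper names contentAddress and encrypt are hypothetical.

```javascript
// Sketch only: SHA-256 as a stand-in for an IPFS content identifier.
const crypto = require('crypto');

// Same bytes always give the same address
function contentAddress(bytes) {
  return crypto.createHash('sha256').update(bytes).digest('hex');
}

// Encrypt with AES-256-GCM using a fresh random IV for every call
function encrypt(plainText, key) {
  const iv = crypto.randomBytes(12);
  const cipher = crypto.createCipheriv('aes-256-gcm', key, iv);
  const body = Buffer.concat([cipher.update(plainText, 'utf8'), cipher.final()]);
  return Buffer.concat([iv, body, cipher.getAuthTag()]);
}

const core = Buffer.from('/* shared tiddlywiki5.js core */');
const key = crypto.randomBytes(32);

// The unencrypted core has one stable address, so it is stored once
const coreIsStable = contentAddress(core) === contentAddress(Buffer.from(core));

// The same content encrypted twice yields different bytes and addresses
const encryptedDiffers =
  contentAddress(encrypt('same content', key)) !== contentAddress(encrypt('same content', key));

console.log({ coreIsStable, encryptedDiffers });
```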

What do you think?

@pmario
Member

pmario commented Aug 29, 2020

@xmaysonnave Here's the discussion about external-js with a single-file wiki: #3501 ... It may not 100% solve your problem / use case, but IMO it could be a different approach that may reduce the data sent to and stored in the IPFS store.

I did consider using it with the DAT (now Hyper) protocol. Since Hyper is a copy-on-write store that is natively versioned, it makes a lot of sense to use TW that way.

@xmaysonnave
Contributor Author

Thanks for your comments.

I started analysing external-js one or two weeks ago. I experimented quickly without much success. I'll revisit it. I need to find the proper balance between Node generation, drag-and-drop plugins, libraries and manual modifications, especially at the HTML header level, where users could tune their wiki (metadata, etc.). external-js is interesting, as it externalizes exactly what I need to tune.

The interesting point is that I now use my own ipfs-tiddlywiki.js with Node. It opens the door to TiddlyWiki Node generation among chained packages. It will help to decouple the IPFS stuff from my public wiki. Let's give the ideas and experiments time to mature...

By the way, my compressed empty edition contains my plugin. That explains the size you noticed, but 700k is not bad. It takes a few hundred ms to inflate.

@pmario
Member

pmario commented Aug 29, 2020

I can see that you have already dug deep into the TW build process. So I'll update my PR with some more changes.

If you check out my empty-external-js branch, you'll see that the tiddlywiki.info from the empty edition has an external-js build command now. AND it has 1 plugin (for testing) that will be exported to the external core.

It can be built using the following command. (I'm using node version 12.18.3)

cd ~/to/your/tw5/repo
node tiddlywiki.js editions/empty/ --build external-js

The result can be found in the editions/empty/output directory.

tiddlywiki5-external-js.html template contains a new name for the core now: external-core.js

tiddlywiki.js.tiddlers.tid contains an extended filter, that will add every plugin imported in tiddlywiki.info. Those plugins will be exported into the external core.

save-all-external-js.tid now contains a filter that will remove every plugin that is imported with drag & drop. ... So if you need those plugins in your HTML file, you'll need to modify this filter again....

I prefer to add plugins to the external core with tiddlywiki.info.

Hope this helps. The TW code is from 5.1.18-prerelease. ... But that shouldn't matter very much. The concept is the same.

@pmario
Member

pmario commented Aug 29, 2020

Here's the old PR #3501
