Load site via http first, fetch via IPFS in background #710

Open
olizilla opened this issue Apr 10, 2019 · 7 comments

@olizilla
Member

I'd like to be able to run Companion in an "upgrade my experience where possible" mode. I'd like to know when a site has an IPFS version available, and that I could pin it or switch to it if I wanted. Redirecting all sites that have a DNSLink generally makes my browsing experience slower. In the case of http://tableflip.io, we added our site to a local IPFS node some months ago and have since cleared out our local IPFS stores a few times due to dev work, so now I can't load the site unless I kill the redirect.

I feel like a nicer experience would involve trying to load the site over both http and ipfs, and where both respond but http wins, signalling to the user that an IPFS version is available.

Something where you can "make this site available offline and re-host it" as an opt-in rather than an eager redirect.

@Mikaela
Contributor

Mikaela commented Apr 10, 2019

👍 I mostly experience this when I want to refer to the IPFS website or documentation, or Filecoin. My HighWater is around 20 to avoid Go-IPFS killing my IPv6 connectivity (ipfs/kubo#3065 (comment)), so it often takes minutes for me to get to the site. By then I may have given up or opened an incognito window (without IPFS Companion) to reach the site and find the answer that made me open it in the first place.

@lidel
Member

lidel commented Apr 11, 2019

I agree, if using IPFS degrades browsing experience, people will just turn it off.
Let me know your thoughts on the musings below:

Short Term: "Co-host visited pages of this website"

Addressing this short-term would include:

  • Disabling initial redirect of DNSLink websites
  • Replacing the "Pin IPFS Resource" toggle with "Co-host visited pages of this website" on DNSLink websites and /ipns/<fqdn>
    • Enabling it will activate ambient preload of website assets to the local IPFS node in the background
    • Opt-in (users won't download website data twice unless they want to co-host a specific site)
    • Notes on performance
      • It needs to be done per-resource (preload only assets that were actually requested, as ipfs refs -r <root cid> will be too much if the website is huge, think Wikipedia)
      • Async, with a limit on the number of assets fetched in the background (e.g. with p-queue)
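
To illustrate the performance notes above, here is a minimal sketch of a per-resource preload queue with a concurrency cap. It hand-rolls a tiny queue rather than using p-queue so that it stays self-contained. `PreloadQueue`, `fetchViaIpfs`, and the default limit of 2 are hypothetical names and values chosen for illustration, not Companion's actual code.

```typescript
// Hypothetical sketch: per-resource ambient preload with a concurrency cap.
// `fetchViaIpfs` stands in for whatever call asks the local node for a path.
type Fetcher = (ipfsPath: string) => Promise<void>;

class PreloadQueue {
  private seen = new Set<string>();
  private pending: string[] = [];
  private active = 0;
  private fetchViaIpfs: Fetcher;
  private concurrency: number;

  constructor(fetchViaIpfs: Fetcher, concurrency: number = 2) {
    this.fetchViaIpfs = fetchViaIpfs;
    this.concurrency = concurrency;
  }

  // Queue a path the page actually requested; returns false for duplicates.
  enqueue(ipfsPath: string): boolean {
    if (this.seen.has(ipfsPath)) return false;
    this.seen.add(ipfsPath);
    this.pending.push(ipfsPath);
    this.drain();
    return true;
  }

  get activeCount(): number {
    return this.active;
  }

  get pendingCount(): number {
    return this.pending.length;
  }

  // Start queued fetches until the concurrency cap is reached.
  private drain(): void {
    while (this.active < this.concurrency && this.pending.length > 0) {
      const next = this.pending.shift() as string;
      this.active++;
      const done = (): void => {
        this.active--;
        this.drain();
      };
      // Failures are ignored: preload is best-effort and must not affect the page.
      this.fetchViaIpfs(next).then(done, done);
    }
  }
}
```

In an extension, `enqueue` would presumably be called from a request listener with the /ipns/<fqdn> path of each asset the page loaded, and only for sites where the user enabled co-hosting.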

Mid Term: "Load from IPFS if faster"

Once the HTTP proxy support that enables <cid>.ipfs.localhost lands (ipfs/kubo#5982, ipfs/js-ipfs#1877), we can refine this and redirect to the local gateway if it proves to be faster.

Note to self:

It will be tricky to implement this using existing WebExtension APIs without sending 1-2 additional XHR requests:

  1. Block main request in webRequest.onBeforeRequest
  2. Send a separate HTTP HEAD to original server and IPFS gateway
  3. Decide on the next step based on who responds first:
    • If the IPFS gateway responded first, redirect the main request to it
    • If the original server responded first, just resume the original request
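
A rough sketch of steps 2 and 3, with the actual HEAD requests injected as functions so the decision logic can be read on its own. `firstResponder`, `toGatewayUrl`, the `Probe` type, and the timeout are hypothetical; the path-style /ipns/<hostname> gateway URL and the gateway base address are assumptions for illustration.

```typescript
// Hypothetical sketch of steps 2-3: probe both sources, go with the first success.
type Source = "origin" | "ipfs";
// A probe resolves to true when its server answered the HEAD request successfully.
type Probe = () => Promise<boolean>;

function firstResponder(
  probeOrigin: Probe,
  probeGateway: Probe,
  timeoutMs: number,
): Promise<Source> {
  return new Promise<Source>((resolve) => {
    let settled = false;
    let failures = 0;
    const finish = (winner: Source): void => {
      if (settled) return;
      settled = true;
      clearTimeout(timer);
      resolve(winner);
    };
    // If both probes fail, fall back to resuming the original request.
    const fail = (): void => {
      failures++;
      if (failures === 2) finish("origin");
    };
    // If nobody answers in time, also resume the original request unchanged.
    const timer = setTimeout(() => finish("origin"), timeoutMs);
    probeGateway().then((ok) => (ok ? finish("ipfs") : fail()), fail);
    probeOrigin().then((ok) => (ok ? finish("origin") : fail()), fail);
  });
}

// Assumed path-style URL on the local gateway for a DNSLink site.
function toGatewayUrl(gatewayBase: string, originalUrl: string): string {
  const u = new URL(originalUrl);
  return `${gatewayBase}/ipns/${u.hostname}${u.pathname}${u.search}`;
}
```

Step 1 could then be a blocking webRequest.onBeforeRequest listener that awaits `firstResponder` and returns either a redirect to `toGatewayUrl(...)` or nothing. Firefox lets a blocking listener return a Promise; whether other browsers allow this would need checking.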

@Mikaela
Contributor

Mikaela commented Apr 12, 2019

Would it be possible to transparently move from HTTP to IPFS when the content is finally found via IPFS?

@lidel
Member

lidel commented Apr 15, 2019

"transparent" and "content is finally found via IPFS" are a bit vague :)

We could enable permanent "redirect this website to IPFS" after the root CID got fetched in the background, but that does not guarantee us good performance without some additional work.

Long Term: Smart DAG Prefetching (?)

In practice, "content is found" could mean various things:

  • (A1) [fast but naive] content providers are found for the requested CID (via DHT or another discovery method), but we don't know if we can reach any of them yet
  • (B1) [slow but robust] same as A1, but we also successfully connected to at least one content provider AND received some bytes for the CID
  • (C1) [faster but bandwidth expensive] same as B1, but we actively prefetch website assets in the background (e.g. all links on an HTML page or, even more exciting, everything 1-2 DAG levels from the current root CID), so when the user clicks on a link the data is already in the local node and loads instantly.
    • DAG-based content addressing gives us much more info and could make prefetching a lot smarter. In the old web the client does not know what the entire website structure looks like, so it can preload only things that are referenced by already fetched HTML. IPFS enables clients with spare bandwidth to make conscious choices to preload much deeper and/or make decisions based on content type (e.g. focus on HTML pages but exclude big videos).
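
As a sketch of what C1 could look like, the function below walks a DAG breadth-first from the root CID, stops after a fixed number of levels, and skips nodes that are too large or have an unwanted file extension. Everything here is hypothetical: `planPrefetch`, the `DagLink` shape, and the synchronous `getLinks` lookup stand in for whatever the local node exposes (which would be asynchronous in practice), and filtering by file extension is only one assumed way to approximate content type.

```typescript
// Hypothetical sketch of C1: breadth-first walk of a DAG, bounded by depth and size.
interface DagLink {
  name: string;
  cid: string;
  size: number;
}
type GetLinks = (cid: string) => DagLink[];

interface PrefetchOptions {
  maxDepth: number; // how many DAG levels below the root to follow
  maxBytes: number; // skip any single node larger than this
  skipExtensions: string[]; // lower-case extensions to leave out, e.g. ".mp4"
}

function planPrefetch(rootCid: string, getLinks: GetLinks, opts: PrefetchOptions): string[] {
  const planned: string[] = [];
  const visited = new Set<string>([rootCid]);
  let frontier: string[] = [rootCid];
  for (let depth = 0; depth < opts.maxDepth; depth++) {
    const next: string[] = [];
    for (const cid of frontier) {
      for (const link of getLinks(cid)) {
        if (visited.has(link.cid)) continue;
        visited.add(link.cid);
        const dot = link.name.lastIndexOf(".");
        const ext = dot >= 0 ? link.name.slice(dot).toLowerCase() : "";
        // Leave out big nodes and unwanted content types, and do not descend into them.
        if (link.size > opts.maxBytes || opts.skipExtensions.indexOf(ext) !== -1) continue;
        planned.push(link.cid);
        next.push(link.cid);
      }
    }
    frontier = next;
  }
  return planned;
}
```

The returned list is only a plan; actually fetching those CIDs would still want a concurrency limit and should stay opt-in, given the bandwidth and user-agency concerns raised in this thread.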

"transparent" could mean we switch to IPFS without the user:

  • (A2) doing anything
  • (B2) noticing any experience degradation

A2 is easy, but there is a potential problem with B2: even if we have confirmation of receiving data (B1), that is only for that single root CID.

Websites can be very big, and different subsets of IPFS peers can hold different parts of a website, so we are unable to guarantee that all the other assets and subpages will load as fast as the initial one without introducing some content prefetching heuristics like C1.

Website prefetching is certainly something we could implement as an experiment in userland (this browser extension). We need to be mindful that it introduces known problems such as erosion of user agency (you fetch content that you did not request yet) and additional bandwidth costs. It needs to be researched and designed very carefully, but an opt-in experiment with metrics gathering could be a way to kick this work off.

@olizilla
Member Author

A use case to aim for could be "make pages I bookmark available offline". I'm imagining:

  1. I visit a page with a DNSLink. The page is fetched over HTTP.
  2. I bookmark the page
  3. The page and all its dependencies are fetched over IPFS to my local repository
  4. I kill the router
  5. I visit the bookmarked page and, hooray, it is loaded over IPFS and the page still works.

...I am aware that finding "all dependencies" will mean we have to parse the HTML, which is not ideal. An IPLD importer that was HTML-aware would be rad! `ipfs add` and have it store the files and track the links.
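
To make the "parse the HTML" step concrete, here is a deliberately naive sketch that collects same-origin paths referenced by src and href attributes. A regex is not a real HTML parser, so this misses anything loaded from CSS or scripts and also picks up ordinary links to other pages; `sameOriginDeps` is a hypothetical helper, not an existing API.

```typescript
// Naive sketch: collect same-origin paths referenced by src/href attributes.
// A regex is NOT a real HTML parser; this only illustrates the idea.
function sameOriginDeps(html: string, pageUrl: string): string[] {
  const page = new URL(pageUrl);
  const attr = /\b(?:src|href)\s*=\s*["']([^"']+)["']/gi;
  const found = new Set<string>();
  let m: RegExpExecArray | null;
  while ((m = attr.exec(html)) !== null) {
    let dep: URL;
    try {
      // Resolve relative references against the page URL.
      dep = new URL(m[1], page.href);
    } catch {
      continue; // skip values that are not valid URLs
    }
    // Only same-origin resources live under the site's own DNSLink.
    if (dep.origin === page.origin) found.add(dep.pathname);
  }
  return Array.from(found).sort();
}
```

Each returned path could then be requested under /ipns/<fqdn> on the local node, which is roughly what an HTML-aware importer would do at add time instead.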

@lidel
Member

lidel commented Dec 12, 2019

When #830 and #827 are merged, users will be able to disable the redirect of DNSLink websites while still having the website data preloaded to the local node:

[Screenshot: 3-2019-12-11--16-46-48]

I believe this is the best we can do for now, given the decisions from ipfs-inactive/docs#405 (comment)

@cwaring
Member

cwaring commented Dec 12, 2019

Thanks for doing the groundwork here! Good to know that we are set up and ready should we wish to flip the switch in the future. Excited to see this ship.
