links should be exactly the same as wikipedia #2

Closed
jbenet opened this issue May 1, 2017 · 8 comments · Fixed by #77

Comments

@jbenet
Member

jbenet commented May 1, 2017

links should be exactly the same as wikipedia

  • pages have “.html” at the end, can we get no “.html”? (with directories + index.html or whatever)
  • i want the links to be the same so people can just change a prefix somewhere and have everything else just work
  • are media links the same as they are on wikipedia.org?
@Kubuxu
Member

Kubuxu commented May 1, 2017

pages have “.html” at the end, can we get no “.html”? (with directories + index.html or whatever)

This is possible as a step after adding into IPFS but will result in a trailing slash after the article name (wikipedia won't accept it back: https://en.wikipedia.org/wiki/DNA/)
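
A minimal sketch of the "directories + index.html" layout, assuming the pages sit in a local directory tree before they are added to IPFS (the comment above describes doing this after adding, so this is an illustration of the resulting layout, not the actual tooling). The function name `html_to_index_dirs` is hypothetical.

```python
from pathlib import Path


def html_to_index_dirs(root: Path) -> list[Path]:
    """Move every `<name>.html` file under `root` to `<name>/index.html`.

    Returns the list of new index.html paths.
    """
    moved = []
    # sorted() materialises the listing up front, so files created during
    # the loop are not revisited.
    for page in sorted(root.rglob("*.html")):
        if not page.is_file() or page.name == "index.html":
            continue  # skip directories and pages already in directory form
        target_dir = page.with_suffix("")  # e.g. wiki/DNA.html -> wiki/DNA
        target_dir.mkdir(parents=True, exist_ok=True)
        target = target_dir / "index.html"
        page.rename(target)
        moved.append(target)
    return moved
```

With this layout a gateway that serves index.html for directories would answer at `/wiki/DNA/`, which is where the trailing slash mentioned above comes from.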

are media links the same as they are on wikipedia.org?

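Quite similar, but Wikimedia shards uploads in the filesystem (Wikimedia uploads are backed by a filesystem, and kiwix/zim files skip this step). So links are:

wikimedia: https://upload.wikimedia.org/wikipedia/commons/f/f7/Tower_185_during_construction_phase_on_a_sunny_afternoon-Edit.jpg

Where f/f7 is randomised/hashed.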

@jbenet
Member Author

jbenet commented May 5, 2017

  • going to distributed version is easy and obvious. (right now it's not)
  • going back to wikipedia is way easier than now
  • so let's do it

@jbenet
Member Author

jbenet commented May 5, 2017

Quite similar

can we get them to be exactly the same?

@ghost

ghost commented May 5, 2017

can we get them to be exactly the same?

Yes but this will take significant work on how the dumps are created

@jbenet
Member Author

jbenet commented May 5, 2017

what's "significant work"? should we leave it for later?

i want to make sure people can take existing wikipedia links and apply a trivial transform manually (eg changing the domain) and have it all just work.
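
For illustration, the "trivial transform" could be as small as a prefix swap. This is only a sketch: `MIRROR_PREFIX` is a made-up placeholder, not a real gateway address, and the function names are hypothetical. It only works if article paths are identical on both sides, which is exactly what this issue asks for.

```python
WIKIPEDIA_PREFIX = "https://en.wikipedia.org/wiki/"
# Hypothetical mirror prefix, used only for illustration.
MIRROR_PREFIX = "https://mirror.example/wiki/"


def to_mirror(url: str, mirror_prefix: str = MIRROR_PREFIX) -> str:
    """Rewrite a wikipedia.org article link to the mirror by swapping the prefix."""
    if not url.startswith(WIKIPEDIA_PREFIX):
        raise ValueError(f"not a Wikipedia article link: {url}")
    return mirror_prefix + url[len(WIKIPEDIA_PREFIX):]


def to_wikipedia(url: str, mirror_prefix: str = MIRROR_PREFIX) -> str:
    """Inverse transform: go from a mirror link back to wikipedia.org."""
    if not url.startswith(mirror_prefix):
        raise ValueError(f"not a mirror link: {url}")
    return WIKIPEDIA_PREFIX + url[len(mirror_prefix):]
```

If the mirror kept the “.html” suffix or added a trailing slash, the round trip back to wikipedia.org would need extra rules, so the swap would no longer be trivial.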

@ghost

ghost commented May 5, 2017

For a quick fix at least on ipfs.io, we can apply a redirect that appends the .html suffix.

what's "significant work"? should we leave it for later?

@Kubuxu will know more, but I think it's either a) making the dumps closer to the original, or b) adding links within ipfs.

Or maybe it's as simple as writing the files out without the .html suffix in the first place? I think it's okay to just not have that suffix?

@Kubuxu
Member

Kubuxu commented May 5, 2017

can we get them to be exactly the same?

Not right now, it would require:

  1. knowing how sharding is done
  2. rendering pages ourselves (or rewriting all current pages to those).
  3. moving the files to shards

It is a lot of work to do.

we can apply a redirect that appends the .html suffix.

This would mean that it wouldn't work on local gateways. Non-uniform UX is bad.

@jbenet is trailing / ok?

@JanZerebecki

wikimedia: https://upload.wikimedia.org/wikipedia/commons/f/f7/Tower_185_during_construction_phase_on_a_sunny_afternoon-Edit.jpg
Where f/f7 is randomised/hashed.

I think it is the first character, a slash, and then the first two characters of the md5sum hex string of the file name. I didn't check the configuration or the code, and this may vary by installation.
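
A rough sketch of that guess in Python, assuming the hash is a plain MD5 over the UTF-8 file name with no other normalisation (unverified, as noted above). The helper names `shard_prefix` and `upload_path` are hypothetical.

```python
import hashlib


def shard_prefix(file_name: str) -> str:
    """Return '<h1>/<h1h2>', built from the first hex digits of the name's MD5."""
    digest = hashlib.md5(file_name.encode("utf-8")).hexdigest()
    return f"{digest[0]}/{digest[:2]}"


def upload_path(file_name: str) -> str:
    """Relative path of a file under the sharded upload directory."""
    return f"{shard_prefix(file_name)}/{file_name}"
```

If the guess is right, running `upload_path` on the file name from the link above should give a path starting with `f/f7/`; that would be a quick way to check it.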
