Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use separate asset path for different build #18348

Closed
wants to merge 2 commits into from

Conversation

wxiaoguang
Copy link
Contributor

@wxiaoguang wxiaoguang commented Jan 21, 2022

Background

Gitea used to use /assets/js/index.js?v={{MD5 AppVer}} to access static files. This mechanism has some problems:

  1. not CDN friendly, some CDN would ignore the ?v=... query string.
  2. some files are not rendered by templates and can not get the versioned URL, ex: /assets/vendor/plugins/codemirror/addon/mode/loadmode.js and others.

Plan

This PR introduces a dynamic path for web URL:

  • with dynamic asset files (in development), the path prefix is: /public~dynamic: /public~dynamic/js/index.js
  • with static asset files, the path prefix is: /public~17e7b119ec1 (versioned with timestamp): /public~17e7b119ec1/js/index.js

It's possible to help #18347 and other problems.

There are still more work to do (there are many hard-coded /assets strings in code). If this PR is fine, I can continue to work on it.

More details:

Why not using index-{ver}.js or index.{ver}.js?

There are a lot of legacy files (for example, /plugins/codemirror/addon/mode/loadmode.js, they are not controlled by webpack). Using hash in filenames could be done in future, but not now.

Why not using the /assets path?

The assets is not a good name indeed. Why the GITEA_CUSTOM/public was still called public but used as /assets? We should either call all of them assets (rename public) OR public (as comment above with ~) OR /assets/public-{ver}. I do not like keeping getting new names for one concept.

What about reserved path name?

We can use ~ which is not a valid user/repo name, then we do not need restrict any names anymore.

What about existing custom templates?

We provide template variable and JS variable, AssetUrlPrefix. It's totally transparent to all users.

If the writer of the custom templates used hard-coded /assets/, then they should update. I believe such usage is not recommended, they should use {{AssetUrlPrefix}}

Possible solutions

Please vote:

  • Use /public~{ver}/ for GITEA_CUSTOM/public
  • Use /assets~{ver}/ and rename GITEA_CUSTOM/public to GITEA_CUSTOM/assets
  • Use /assets/public-{ver}/ for GITEA_CUSTOM/public

@wxiaoguang wxiaoguang added the pr/wip This PR is not ready for review label Jan 21, 2022
@wxiaoguang wxiaoguang added this to the 1.17.0 milestone Jan 21, 2022
@silverwind
Copy link
Member

Removal of v query is good, it was rather pointless anyways because it never applied to all assets. Not sure on the public- prefix, I think we can just use hash only.

@GiteaBot GiteaBot added the lgtm/need 2 This PR needs two approvals by maintainers to be considered for merging. label Jan 21, 2022
@wxiaoguang
Copy link
Contributor Author

Not sure on the public- prefix, I think we can just use hash only.

We had better have a special prefix, otherwise it's difficult to grep or proxy requests.

ps: I updated the original comment to describe it clearly: /public-17e7b119ec1/js/index.js

@silverwind
Copy link
Member

silverwind commented Jan 21, 2022

I think in any case, we should retain the /assets/ directory as the first path segment. I would still prefer /assets/<hash>/index.js.

@wxiaoguang
Copy link
Contributor Author

I think in any case, we should retain the /assets/ directory as the first path segment. I would still prefer /assets/<hash>/index.js.

I do not want to use assets in the new PR. Because there are too many asset concepts (and directories) in code, some are just for templates.

And /public-{ver} or /public.{ver} is the best choice, because if users put their customized files in GITEA_CUSTOM/public/foo.js, they could access it from http://xxx/public-{ver}/foo.js, it's more consistent than http://xxx/assets/foo.js or http://xxx/public/{ver}/foo.js. I saw some related issues that users couldn't put files in the correct directory because the names and layouts were confusing.

@lafriks
Copy link
Member

lafriks commented Jan 21, 2022

why not use /public/foo-{ver}.js? Adding {ver} to directory seems counter intuitive for me at least

@wxiaoguang
Copy link
Contributor Author

why not use /public/foo-{ver}.js? Adding {ver} to directory seems counter intuitive for me at least

It's impossible now. Using /public/foo-{ver}.js has two problems which are very hard to handle:

  1. We have to do rewrite (parse) the URLs for all files, to remove these suffixes
  2. Many third-party files can not use this pattern directly (ex: /vendor/plugins/codemirror/addon/mode/loadmode.js and more)

@silverwind
Copy link
Member

silverwind commented Jan 21, 2022

I too prefer hash in the filename. I think it can be made to work at least for the dynamically loaded webpack assets which allow to define [contenthash] as part of the filename (which IIRC uses xxhash). To achieve the same for non-webpack loaded assets like index.js, backend changes will be necessary.

I do not want to use assets in the new PR. Because there are too many asset concepts (and directories) in code, some are just for templates.

I don't follow. We've worked so hard to get everything under a common assets directory which is a good name and it allows users to easily move those files to another domain or virtual host based on this path prefix. I think we shouldn't change this prefix it now.

@zeripath
Copy link
Contributor

Any top level path you add means we have to restrict another username and we break every installation that is already using that username.

I think you should use /assets

Copy link
Contributor

@zeripath zeripath left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any top level path means that path must become a restricted username.

Either use /assets or a subdirectory of it - or add to

My very strong preference is that we don't add more restricted usernames therefore I strongly recommend that you use /assets or change our URL structure so we can avoid this in future.

@wxiaoguang
Copy link
Contributor Author

Any top level path means that path must become a restricted username.
My very strong preference is that we don't add more restricted usernames therefore I strongly recommend that you use /assets or change our URL structure so we can avoid this in future.

Maybe we do not need to restrict any top-level names any more. For internal usage, the char ~ is a valid (non-escaping) URL character. But ~ is not a valid char for username:

	// AlphaDashDotPattern characters prohibited in a user name (anything except A-Za-z0-9_.-)
	AlphaDashDotPattern = regexp.MustCompile(`[^\w-\.]`)
	// used by IsUsableRepoName and IsUsableUserName

So, what about using: /public~{ver}? @zeripath @silverwind

@wxiaoguang
Copy link
Contributor Author

wxiaoguang commented Jan 22, 2022

I too prefer hash in the filename. I think it can be made to work at least for the dynamically loaded webpack assets which allow to define [contenthash] as part of the filename (which IIRC uses xxhash). To achieve the same for non-webpack loaded assets like index.js, backend changes will be necessary.

There are a lot of legacy files (for example, /plugins/codemirror/addon/mode/loadmode.js, they are not controlled by webpack). Using hash in filenames could be done in future, but not now.

I do not want to use assets in the new PR. Because there are too many asset concepts (and directories) in code, some are just for templates.

I don't follow. We've worked so hard to get everything under a common assets directory which is a good name and it allows users to easily move those files to another domain or virtual host based on this path prefix. I think we shouldn't change this prefix it now.

assets is not a good name indeed. Why the GITEA_CUSTOM/public was still called public but used as /assets? We should either call all of them assets (rename public) OR public (as comment above with ~) OR /assets/public-{ver}. I do not like keeping getting new names for one concept.

Please vote:

  • Use /public~{ver}/ for GITEA_CUSTOM/public
  • Use /assets~{ver}/ and rename GITEA_CUSTOM/public to GITEA_CUSTOM/assets
  • Use /assets/public-{ver}/ for GITEA_CUSTOM/public

modules/public/public.go Outdated Show resolved Hide resolved
@silverwind
Copy link
Member

I think we should just copy GH in this regard, /assets/<file>-<hash>.js. This requires changes in both webpack (dynamic chunks) and backend (static files).

@wxiaoguang
Copy link
Contributor Author

I think we should just copy GH in this regard, /assets/<file>-<hash>.js. This requires changes in both webpack (dynamic chunks) and backend (static files).

  1. It's not ideal: it's really strange to map /assets/ to GITEA_CUSTOM/public/, why can not we make the names consistent?
  2. It's impossible now: there are many legacy files, this PR doesn't make all files managed by webpack.

So, could we vote in these 3 options to move on? And we can continue to improve in future.

@Gusted
Copy link
Contributor

Gusted commented Jan 22, 2022

Ah well, seems like everyone and their cat have a say on this, so I will just give my vote before this explodes. I think the "Use /assets~{ver}/ and rename GITEA_CUSTOM/public to GITEA_CUSTOM/assets" is the best option.

As from my understanding the /public folder is being served as /assets, so the renaming shouldn't cause conflicts.
As I don't think doing assets/public is a good thing, as there aren't any other files or folders that whole be like: assets/xxx.xxx, doing public/ is also meh, as that will more or less have a vague meaning like we currently have.

Btw, development files should most likely also have a time-stamp instead development, because otherwise it could be cached between rebuilding (the JS/CSS files) and not everyone have caching disabled while developing(or don't even know that it was a setting in the first place).

PS: Just vote, or bring your own PR to do the optimal index-{ver}.js or index.{ver}.js solution ;)

@wxiaoguang
Copy link
Contributor Author

Thank you Gusted!

Since Gitea PR requires 2 approvals to get merged, I hope I can find one more voter(reviewer) to continue the work.

@silverwind
Copy link
Member

I agree to renaming public to assets, but not with the suffix in the directory name, it should be in the file name instead.

@lafriks
Copy link
Member

lafriks commented Jan 24, 2022

Sorry but I'm also against hash in directory name

@wxiaoguang
Copy link
Contributor Author

wxiaoguang commented Jan 24, 2022

OK, then we need some new solutions.

Since most do not like this PR, so we do not need to work on it. I still can not understand why it should be no version in directory name . If anyone can continue, it's free to take it or work on it.

@silverwind
Copy link
Member

Alternate solution that alters only the filenames: #18632, still WIP.

@go-gitea go-gitea locked and limited conversation to collaborators Apr 28, 2022
@wxiaoguang wxiaoguang deleted the fix-public-asset branch August 9, 2022 15:15
@wxiaoguang
Copy link
Contributor Author

There are a lot of people complaining the cache problem again. I have received at least 10 reports.

If this PR can be merged early, there will be no such problem any more and can be more friendly to end users.

This PR is very clear and simple, it's very easy to be refactored to other solutions (eg: dynamic asset names) later.

I still can not understand why the bug should be kept and make people complaining.

@lunny
Copy link
Member

lunny commented Aug 12, 2022

Development & Production version should enable cache when the asset is the same and disable cache when the asset is not the same. Since cache could be disabled from web browser. It seems it's unnecessary to diff Development & Production versions.

A directory prefix will make all the cache disabled even if the image is not changed between two updates. i.e. when a public site upgrade from 1.16.1 to 1.16.2, all the assets cache will be disabled especially monoca editors, it will load a bit time. But that could be avoid in an ideal situation.

@wxiaoguang
Copy link
Contributor Author

How much bandwidth will it waste to reload all assets? Are these assets larger than the Gitea rendered pages during daily use? Does the "a bit time" make users unhappy?

@lunny
Copy link
Member

lunny commented Aug 12, 2022

I don't think users cannot bear that, but if we can have an ideal method, that's better.

@wxiaoguang
Copy link
Contributor Author

Then, do you mean that people prefer to bear bugs (UI not working) instead of the only once "a bit time" delay?

Especially the delay only occurs for the first time after upgrade, which is pretty low frequency.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
lgtm/need 2 This PR needs two approvals by maintainers to be considered for merging. pr/wip This PR is not ready for review
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants