Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add MAX_ROWS option for CSV rendering #30268

Merged
merged 12 commits into from
Jun 6, 2024
Merged

Conversation

HenriquerPimentel
Copy link
Contributor

@HenriquerPimentel HenriquerPimentel commented Apr 3, 2024

This solution implements a new config variable MAX_ROWS, which corresponds to the “Maximum allowed rows to render CSV files. (0 for no limit)” and rewrites the Render function for CSV files in markup module. Now the render function only reads the file once, having MAX_FILE_SIZE+1 as a reader limit and MAX_ROWS as a row limit. When the file is larger than MAX_FILE_SIZE or has more rows than MAX_ROWS, it only renders until the limit, and displays a user-friendly warning informing that the rendered data is not complete, in the user's language.


Previously, when a CSV file was larger than the limit, the render function lost its function to render the code. There were also multiple reads to the file, in order to determine its size and render or pre-render.

The warning: image

@GiteaBot GiteaBot added the lgtm/need 2 This PR needs two approvals by maintainers to be considered for merging. label Apr 3, 2024
@pull-request-size pull-request-size bot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Apr 3, 2024
@github-actions github-actions bot added the modifies/go Pull requests that update Go code label Apr 3, 2024
@lunny
Copy link
Member

lunny commented Apr 3, 2024

I think fall back render is still useful? Or we can view it as non-render mode?

@HenriquerPimentel
Copy link
Contributor Author

I think fall back render is still useful? Or we can view it as non-render mode?

The file will always render as a table until one of the limits is reached. If the entire table cannot be displayed, users will receive a warning with a link to the non-rendered file.

@silverwind
Copy link
Member

Maybe a bit higher default will be good, I'm thinking maybe 5k or 10k lines. Is the UI still reasonable fast with 10k lines and let's say 10 columns?

@silverwind silverwind changed the title Fix CSV rendering (#29663) Add MAX_ROWS option for CSV rendering Apr 4, 2024
@silverwind
Copy link
Member

Updated PR title. The linked issue #29663 seems incorrect. Is there a actually relevant issue for this?

@lunny
Copy link
Member

lunny commented Apr 7, 2024

It seems it's only 199 lines if you set MAX_ROWS = 200. One line was missed.
And the warning background color was also missed.
image

@silverwind
Copy link
Member

It seems it's only 199 lines if you set MAX_ROWS = 200. One line was missed. And the warning background color was also missed. image

Background removal is intentional.

@lunny lunny added this to the 1.23.0 milestone Apr 8, 2024
@lunny lunny added the type/enhancement An improvement of existing functionality label Apr 8, 2024
@HenriquerPimentel
Copy link
Contributor Author

Updated PR title. The linked issue #29663 seems incorrect. Is there a actually relevant issue for this?

Initially, I started looking for the code intending to fix that specific problem. However, since it was resolved with another solution, I suppose we could define it as an enhancement. Should I open a new issue?

modules/markup/csv/csv.go Outdated Show resolved Hide resolved
@pull-request-size pull-request-size bot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Apr 10, 2024
@GiteaBot GiteaBot added lgtm/need 1 This PR needs approval from one additional maintainer to be merged. and removed lgtm/need 2 This PR needs two approvals by maintainers to be considered for merging. labels Apr 10, 2024
@HenriquerPimentel
Copy link
Contributor Author

Maybe a bit higher default will be good, I'm thinking maybe 5k or 10k lines. Is the UI still reasonable fast with 10k lines and let's say 10 columns?

When the limit is 5k, seems to still have good performance!

modules/markup/csv/csv.go Outdated Show resolved Hide resolved
@silverwind
Copy link
Member

Is it intentional that it renders both the message and the CSV?

image

If so, I would put the message on bottom where the user would expect more items.

Also I think we can increase to 10k or 20k easily. My browser renders 5k items with three columns in 600ms and hardware tends to become faster over time.

@silverwind
Copy link
Member

silverwind commented Apr 13, 2024

I will do some more fine tunes on the UI, but I think we should closely follow what GitHub does and not render partial rows when the limit is exceeded, just render the message only in that case.

I think it is a matter of preference. For example, GitLab always render and shows the message on the bottom of the code.

I find such behaviour inconsistent with code view where we also do no partial render when file is too large. And I think a partial render might potentially be misleading to the user thinking they see the whole file. So I say no to partial render :)

Copy link
Contributor

@wxiaoguang wxiaoguang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Idea is good, but there are some problems.

modules/markup/csv/csv.go Outdated Show resolved Hide resolved
modules/markup/csv/csv.go Outdated Show resolved Hide resolved
modules/markup/csv/csv.go Outdated Show resolved Hide resolved
custom/conf/app.example.ini Outdated Show resolved Hide resolved
Copy link
Contributor

@wxiaoguang wxiaoguang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the new changes, it looks much better.

In my mind, there are some nits, for example:

  • variable naming (camelCase, but not raw_link, the lint also fails due to it)
  • path escaping (maybe it should use PathEscapeSegments)
  • it's better to add a test to ensure MaxRows really works (mock it to MaxRows=2)

(if you don't mind, I could also help to edit the PR to address these nits)

And since in the RC period, 1.22 only receives necessary backports, so I think this PR could be merged when 1.22 gets a stable release (just wait for some time)

@HenriquerPimentel
Copy link
Contributor Author

HenriquerPimentel commented May 14, 2024

Thank you for the new changes, it looks much better.

In my mind, there are some nits, for example:

  • variable naming (camelCase, but not raw_link, the lint also fails due to it)
  • path escaping (maybe it should use PathEscapeSegments)
  • it's better to add a test to ensure MaxRows really works (mock it to MaxRows=2)

(if you don't mind, I could also help to edit the PR to address these nits)

And since in the RC period, 1.22 only receives necessary backports, so I think this PR could be merged when 1.22 gets a stable release (just wait for some time)

(if you don't mind, I could also help to edit the PR to address these nits)

Yes please :)

HenriquerPimentel and others added 8 commits May 26, 2024 13:54
Fixes go-gitea#29663

Previously, when a CSV file was larger than the limit, the render function lost its function to render the code. There were also multiple reads to the file, in order to determine its size and render or pre-render.

This solution implements a new config variable MAX_ROWS, which corresponds to the “Maximum allowed rows to render CSV files. (0 for no limit)” and rewrites the Render function for CSV files in markup module. Now the render function only reads the file once, having MAX_FILE_SIZE+1 as a reader limit and MAX_ROWS as a row limit. When the file is larger than MAX_FILE_SIZE or has more rows than MAX_ROWS, it only renders until the limit, and displays a user-friendly warning informing that the rendered data is not complete, in the user's language.

The warning: ![image](https://s3.amazonaws.com/i.snag.gy/ieROGx.jpg)
Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
- Removed tags from sanitizer rules;
- The warning message now is a table (to reuse UI elements);
- ctx.RelativePath escaped;
- Implemented a panic catch when getting a translation error;
@GiteaBot GiteaBot added lgtm/done This PR has enough approvals to get merged. There are no important open reservations anymore. and removed lgtm/need 1 This PR needs approval from one additional maintainer to be merged. labels Jun 6, 2024
@lunny lunny added the reviewed/wait-merge This pull request is part of the merge queue. It will be merged soon. label Jun 6, 2024
@lunny lunny merged commit f7125ab into go-gitea:main Jun 6, 2024
26 checks passed
@GiteaBot GiteaBot removed the reviewed/wait-merge This pull request is part of the merge queue. It will be merged soon. label Jun 6, 2024
silverwind added a commit to silverwind/gitea that referenced this pull request Jun 6, 2024
* origin/main: (231 commits)
  Allow including `Reviewed-on`/`Reviewed-by` lines for custom merge messages (go-gitea#31211)
  Add `MAX_ROWS` option for CSV rendering (go-gitea#30268)
  Update `golang.org/x/net` (go-gitea#31260)
  Add replacement module for `mholt/archiver` (go-gitea#31267)
  Fix Activity Page Contributors dropdown (go-gitea#31264)
  Optimize runner-tags layout to enhance visual experience (go-gitea#31258)
  fix: allow actions artifacts storage migration to complete succesfully (go-gitea#31251)
  Add `lint-go-gopls` (go-gitea#30729)
  Make blockquote attention recognize more syntaxes (go-gitea#31240)
  Fix admin oauth2 custom URL settings (go-gitea#31246)
  Replace `gt-word-break` with `tw-break-anywhere` (go-gitea#31183)
  Make pasted "img" tag has the same behavior as markdown image (go-gitea#31235)
  Remove .segment from .project-column (go-gitea#31204)
  Fix overflow on push notification (go-gitea#31179)
  Fix NuGet Package API for $filter with Id equality  (go-gitea#31188)
  Fix overflow on notifications (go-gitea#31178)
  Update chroma to v2.14.0 (go-gitea#31177)
  Update air package path (go-gitea#31233)
  Bump `@github/relative-time-element` to v4.4.1 (go-gitea#31232)
  Add option for mailer to override mail headers (go-gitea#27860)
  ...
zjjhot added a commit to zjjhot/gitea that referenced this pull request Jun 7, 2024
* giteaofficial/main:
  Fix and clean up `ConfirmModal` (go-gitea#31283)
  Enable poetry non-package mode (go-gitea#31282)
  fixed the dropdown menu for the top New button to expand to the left (go-gitea#31273)
  Optimize repo-list layout to enhance visual experience (go-gitea#31272)
  Allow including `Reviewed-on`/`Reviewed-by` lines for custom merge messages (go-gitea#31211)
  Add `MAX_ROWS` option for CSV rendering (go-gitea#30268)
  Update `golang.org/x/net` (go-gitea#31260)
  Add replacement module for `mholt/archiver` (go-gitea#31267)
  Fix Activity Page Contributors dropdown (go-gitea#31264)
@go-gitea go-gitea locked as resolved and limited conversation to collaborators Sep 5, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
lgtm/done This PR has enough approvals to get merged. There are no important open reservations anymore. modifies/go Pull requests that update Go code size/L Denotes a PR that changes 100-499 lines, ignoring generated files. type/enhancement An improvement of existing functionality
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants