Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CLI: fast summary section rewrites #677

Closed
wkalt opened this issue Oct 25, 2022 · 4 comments
Closed

CLI: fast summary section rewrites #677

wkalt opened this issue Oct 25, 2022 · 4 comments
Labels
cli feature New feature or request

Comments

@wkalt
Copy link
Contributor

wkalt commented Oct 25, 2022

Currently the add attachment and add metadata subcommands rewrite the entire file, which takes time proportional to the size of the file. It would be way faster if we only rewrote the summary section. For this we need some tricky code to bump all the appropriate summary section indexes, depending on where the summary offsets for the metadata/attachment sections reside, but it should totally be possible.

One thing to be aware of - we will lose crash safety, vs the current tmpfile + move approach. Options to mitigate seem like a) copy the original file to a "original file" location, which has the downside of still being proportional to file size, or b) pairing this with a robust "reindex" subcommand that will regenerate a valid summary section from the presumed-valid original data section (we should not need to touch the data section). Option b) seems best IMO, to make the average case very fast.

@wkalt wkalt added the feature New feature or request label Oct 25, 2022
@jtbandes
Copy link
Member

a robust "reindex" subcommand that will regenerate a valid summary section from the presumed-valid original data section

Isn't this what you get when you run recover on a file with a valid data section? Is your point that reindex should also be in-place?

@wkalt
Copy link
Contributor Author

wkalt commented Oct 28, 2022

@jtbandes I don't think there's any reason that "reindex" needs to be in place, since the expectation would be that it would only get run in uncommon failures of this function. If we already have this in the recover command I think that's sufficient.

@wkalt
Copy link
Contributor Author

wkalt commented Mar 17, 2023

@jtbandes
Copy link
Member

Looks like this was done in #915

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cli feature New feature or request
Development

No branches or pull requests

2 participants