Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Non-rayon version #30

Closed
dvtomas opened this issue Jun 15, 2021 · 7 comments · Fixed by #38
Closed

Non-rayon version #30

dvtomas opened this issue Jun 15, 2021 · 7 comments · Fixed by #38

Comments

@dvtomas
Copy link

dvtomas commented Jun 15, 2021

Would it be a lot of work to implement a version that does not depend on rayon? I understand it makes the implementation much simpler and more elegant, but in the end it is quite a dependency for something that basically deletes a directory.

@rbtcollins
Copy link
Collaborator

So the point of the Rayon dependency is to be able to delete directories in a reasonable timeframe on Windows. The Windows model requires multiple syscalls to remove readonly files (see fs.rs:delete_readonly), and may even require more work added in future. Those syscalls do not complete when the file metadata is updated in memory like in Unix OSes, but when the buffers with the data describing the changes are no longer needed : ownership does not get handed off to the OS. This leads to higher individual IO latency even though aggregate IO latency is roughly equivalent - unless threads (or fibres) are used.

Put another way, on Windows, the model from the OS vendor is "USE THREADS", and to work properly with the OS they are required, so we need to figure out the best way to do that.

There is a request I haven't captured as a bug for a thread-free version for one of tempdir/tempfile (I forget temporarily :)) to have a thread-free version as a feature, which I think we should probably support for other platforms. Probably not a great deal of work, just not having the rayon iterator:

let iter = path.read_dir()?.par_bridge();

would degrade us from parallel to serial behaviour. Of course then we need to convince higher crates to use threads on windows etc etc.

Note that this does need to be a feature - the parallel implementation was added as part of the performance done to increase rustup's performance: deleting the > 20k files in an upgrade of a docs package is otherwise a dominating part of the upgrade process.

If you or someone else want to work on this I think the first thing is patch with a default feature.

@dvtomas
Copy link
Author

dvtomas commented Jun 15, 2021

Thank you for your detailed explanation. I was not thinking about removing parallelism altogether, just perhaps implement it manually using thread primitives. Or, maybe, yes, a serial version as a feature might be useful as well.

Anyway, I have a very simple use-case which in the end I've solved just by attempting fs::remove_dir_all several times with a sleep between. It's a very blunt approach, but for my particular app it's OK and without any dependencies. So, I guess I'll just stick with that.

@XAMPPRocky
Copy link
Owner

Thank you for your detailed explanation. I was not thinking about removing parallelism altogether, just perhaps implement it manually using thread primitives.

Thank you for your issue! I'm not sure what the benefit of removing rayon would be other than obscuring what's happening. We'd need to implement much of the same behaviour and/or depend all the same crates to able to properly support parallelism, and we'd know have to maintain that implementation ourselves. There's no real benefit to implementing it ourselves, removing the dependency might give the false impression that it would somehow be quicker or simpler because it prints less things to the screen, but the truth is that it wouldn't save actually save us much if any time, especially in terms of implementation or support.

Having single-threaded or multi-threaded enabled by compile time features, and enabling single-threaded by default feels like the right approach to me.

@rbtcollins
Copy link
Collaborator

@dvtomas if I may ask a few follow up questions:

  • does the Rayon version handle the problem you're dealing with well, or does it also display the locking/erroring case you're facing?
  • What is it about the Rayon version that is a problem? Size of the binary? Compilation time? Security concerns due to dependency size? Licencing?

@dvtomas
Copy link
Author

dvtomas commented Jun 16, 2021

The main concern for me is binary size and compilation time. I mean it would not be a huge problem, but I try to keep my dependencies slim if possible and clean them up regularly. For similar reasons I've already rejected e. g. regex (replaced with simple manual string parsing for my use-case) or clap (in favor of gumdrop). You may also be interested in Stebalien/tempfile#129

As for the locking behavior, I'll give the latest version a try. Last time I've tried with remove_dir_all v0.5, and it actually did NOT fix the locking problems I've had, while simply repeating fs::remove_dir_all with sleeps in between did.

@dvtomas
Copy link
Author

dvtomas commented Jun 16, 2021

Ok, for the locking problems - I've thoroughly examined my code again, and there has been a file I forgot to close, so that probably was the actual cause of my locking problems. I've tried various combinations of fs::remove_dir all vs remove_dir_all::remove_dir_all on local or network drive, but the turnaround time for the deployment and test is very slow, so I'm sorry but I can't give you precise info on the problem I was facing before the file closing bug was fixed.

@rbtcollins
Copy link
Collaborator

Thank you very much for following up.

I have a patch for making parallelism optional, which should help with Stebalian/tempfile#129 but I think there's no evidence that having a non-rayon parallel version is needed at this point, so will close this when the parallel feature merges.

rbtcollins added a commit to rbtcollins/remove_dir_all that referenced this issue Mar 30, 2022
rbtcollins added a commit to rbtcollins/remove_dir_all that referenced this issue Mar 30, 2022
XAMPPRocky pushed a commit that referenced this issue Mar 30, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants