fix(iroh-blobs): use async_channel instead of flume for local_pool #2533
Description
During soft shutdown of the local pool, a Finish message is sent to every worker thread. On main, this occasionally hangs. Further investigation showed that the message is sent and sits in the channel, but is never received. Adding a simple timeout to the select!, so that the flume recv call is executed again, fixes the hang.
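To illustrate the timeout workaround described above, here is a minimal sketch of the "wake up and receive again" idea. It is not the code from this PR: it uses the blocking `std::sync::mpsc` channel and `recv_timeout` as a stand-in for flume inside an async `select!`, and the helper name `recv_with_repoll` is hypothetical.

```rust
use std::sync::mpsc::{Receiver, RecvTimeoutError};
use std::time::Duration;

/// Hypothetical helper: waits for a message, waking up every `tick` to call
/// receive again. Returns `None` once all senders are gone.
fn recv_with_repoll<T>(rx: &Receiver<T>, tick: Duration) -> Option<T> {
    loop {
        match rx.recv_timeout(tick) {
            Ok(msg) => return Some(msg),
            // On timeout, loop around and receive again, so a message that is
            // already queued is picked up even if its wakeup was missed.
            Err(RecvTimeoutError::Timeout) => continue,
            // Every sender has been dropped and the queue is empty.
            Err(RecvTimeoutError::Disconnected) => return None,
        }
    }
}
```

A periodic re-poll like this only masks a dropped notification; it does not explain why the notification was dropped, which is presumably why the PR replaces the channel instead.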
See discussion in https://discord.com/channels/949724860232392765/950683937661935667/1265205285618847744
So it seems that there is a bug in flume that occasionally leads to notifications being dropped.
This PR just does a 1:1 replacement of flume with async_channel.
Before this change, I can get the test_shutdown test to fail easily by running it 1000 times:
Result:
After this change, I cannot get test_finish (renamed because it tests finish, not shutdown) to fail at all, even after several thousand runs.
Breaking Changes
None
Notes & open questions
Can somebody talk me out of this? I would prefer to keep flume, but the evidence above seems conclusive...
Note: why not tokio::sync::mpsc::channel? I need an mpmc channel: the handle can be cloned, and it can send to any of the n worker threads.
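The mpmc requirement can be sketched roughly as follows. This is a std-only illustration, not the local_pool implementation: it emulates the multi-consumer side by sharing a `std::sync::mpsc` receiver behind a mutex, whereas async_channel provides cloneable receivers directly. The `Message` enum (apart from the `Finish` variant name) and the `run_pool` function are hypothetical.

```rust
use std::sync::mpsc::channel;
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical message type; only `Finish` is named in the PR description.
enum Message {
    Work(u64),
    Finish,
}

/// Spawns `n` workers that pull from one shared queue, sends `jobs` through a
/// cloned sender, then sends one `Finish` per worker (soft shutdown).
/// Returns the sum of all processed jobs.
fn run_pool(n: usize, jobs: &[u64]) -> u64 {
    let (tx, rx) = channel::<Message>();
    // std's channel is mpsc, so the receiver is shared behind a mutex to
    // emulate multiple consumers.
    let rx = Arc::new(Mutex::new(rx));

    let workers: Vec<_> = (0..n)
        .map(|_| {
            let rx = Arc::clone(&rx);
            thread::spawn(move || {
                let mut sum = 0u64;
                loop {
                    // The lock guard is a temporary, released after this statement.
                    let msg = rx.lock().unwrap().recv();
                    match msg {
                        Ok(Message::Work(x)) => sum += x,
                        // Each worker consumes exactly one Finish, then exits.
                        Ok(Message::Finish) | Err(_) => break,
                    }
                }
                sum
            })
        })
        .collect();

    // Any clone of the handle can send; whichever worker is free takes the job.
    let handle = tx.clone();
    for &job in jobs {
        handle.send(Message::Work(job)).unwrap();
    }
    for _ in 0..n {
        tx.send(Message::Finish).unwrap();
    }
    workers.into_iter().map(|w| w.join().unwrap()).sum()
}
```

Because the queue is first-in, first-out and all `Finish` messages are sent after the work items, every job is processed before any worker shuts down. For example, `run_pool(4, &[1, 2, 3, 4, 5])` distributes five jobs across four workers.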
Change checklist
Documentation updates following the style guide, if relevant.
Tests if relevant.
All breaking changes documented.