Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

libflux: support setting prep watcher priority #6367

Closed

Conversation

chu11
Copy link
Member

@chu11 chu11 commented Oct 11, 2024

Problem: Watcher priorities are only configurable in check watchers, but would be useful in prepare watchers as well.

Add support for prepare watcher priorities. Add unit tests.


split out from #6353. the "initial credits" from a subprocess should be sent immediately after the subprocess is returned to the caller. So we want to make the "initial credits" prepare watcher the first thing to be called.

Problem: Watcher priorities are only configurable in check
watchers, but would be useful in prepare watchers as well.

Add support for prepare watcher priorities.  Add unit tests.
@garlick
Copy link
Member

garlick commented Oct 12, 2024

The prep watchers are always called last, just before event loop might block in poll(2). I think this would only increase the priority of one prep watcher over another? Is that what you want?

@chu11
Copy link
Member Author

chu11 commented Oct 12, 2024

For my work in #6353 I want the "initial credits" to be sent to the client first before anything else happens. I suppose what you're saying is that because all prepare watchers are called before blocking, setting a priority for the prepare watcher doesn't matter?

I suppose that makes sense. I guess I figured "something" could happen in other prepare watchers, so we'd want the "initial credits" prepare watcher to send initial credits before anything else might happen.

@garlick
Copy link
Member

garlick commented Oct 12, 2024

Doesn't seem likely to have much effect? Any regular watcher that is ready would run before the prep watchers, and for the most part, work is not done in prep watchers. For example in the future implementation, continuations are called from check watchers.

What about just having the subprocess server ignore the first credit callback from each stream and instead use info it already has to immediately generate one initial add-credit response for all active input streams? It has access to SUBPROCESS_DEFAULT_BUFSIZE and the flux_cmd_t object sent by the client.

@garlick
Copy link
Member

garlick commented Oct 12, 2024

Oh maybe getopt works to get the buffer sizes already?

@chu11
Copy link
Member Author

chu11 commented Oct 12, 2024

What about just having the subprocess server ignore the first credit callback from each stream and instead use info it already has to immediately generate one initial add-credit response for all active input streams? It has access to SUBPROCESS_DEFAULT_BUFSIZE and the flux_cmd_t object sent by the client.

I'm loosely already doing this.

I think there may be a subtlety lost in this discussion. Lemme explain PR #6353 at a high level regarding the use of the prepare watcher.

For local subprocesses, we basically fork/exec and return a flux_subprocess_t * to the client. Because we need to return the flux_subprocess_t *, I cannot issue a callback to on_credit before returning to the user. So I need to create a watcher to issue the "initial credits" to the user AFTER returning flux_subprocess_t *. In order to guarantee that the initial credits are sent first thing, i create a prepare watcher and set the priority high. So the user should get those initial credits as early as possible. If the client is already in the "check" part of the ev-loop, then it'll be first thing during the next prepare cycle of the ev-loop. If the user hasn't called flux_reactor_run() yet, it'll be the first thing called when ev_run() happens.

For remote subprocesses, I basically do the same thing except I aggregate all of the channel credits into a single RPC back. By setting the prepare watcher priority high, I should be able to return the initial credits, even before the client gets notified that the remote subprocess has started (state == RUNNING, server sends "started" message).

Thinking about it more (maybe this was your point in this discussion) I could simply just respond with an add-credit message right away instead of doing the prepare watcher. I guess that would work. But on the local subprocess side, I still need the watcher.

So the question is ... does setting the prepare watcher priority matter? Maybe for most cases, it doesn't. But it would be good to do it to be on the safe side?

@garlick
Copy link
Member

garlick commented Oct 12, 2024

Got it. I guess my two points were:

  • raising the priority of the prep watcher still has it running in essentially the same place, after all other watchers except prep watchers, which generally are not doing "real work" anyway (so being first won't buy much).
  • a local subprocess user could, right after flux_local_exec() returns, determine the initial size of all input buffers without waiting for the credit callbacks. The subprocess server could do this and send back the add-credit response before control is returned from the exec message handler to the reactor.

@chu11
Copy link
Member Author

chu11 commented Oct 12, 2024

raising the priority of the prep watcher still has it running in essentially the same place, after all other watchers except prep watchers, which generally are not doing "real work" anyway (so being first won't buy much).

Ok, if we feel the odds is it doesn't buy us much, then we could drop this PR.

a local subprocess user could, right after flux_local_exec() returns, determine the initial size of all input buffers without waiting for the credit callbacks.

Certainly is doable, which is related to the "NO_INITIAL_CREDITS" flag I mentioned in another thread. But I think having initial credits sent is important for the "api". We can discuss this in another PR thread though.

@chu11 chu11 closed this Oct 12, 2024
Copy link

codecov bot commented Oct 12, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 83.28%. Comparing base (50f8dc2) to head (d41f7cc).

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #6367      +/-   ##
==========================================
- Coverage   83.28%   83.28%   -0.01%     
==========================================
  Files         524      524              
  Lines       86230    86232       +2     
==========================================
- Hits        71820    71816       -4     
- Misses      14410    14416       +6     
Files with missing lines Coverage Δ
src/common/libflux/reactor.c 96.21% <100.00%> (+0.02%) ⬆️

... and 10 files with indirect coverage changes

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants