Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Overload: Reset H2 server stream only use codec level reset mechanism #18895
Overload: Reset H2 server stream only use codec level reset mechanism #18895
Changes from 1 commit
3533da7
22c6521
8008823
c8f7bdc
ff20da9
b6211fc
3d320f0
2dc1eff
0b98a9c
4bb2619
c3fa8ab
f4c4199
189d9b2
84c517d
b75d8ac
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So the underlying problem here, is that the filter manager believes the stream to be complete, and the codec does not? I think we don't want the buffer accounting to be the entity handling this disconnect because there could be other entities with this lifetime confusion.
either entities which can down-call a reset should be explicitly made aware of a stream being complete, or the filter manager should handle reset being called after it thinks that the stream is complete. Make sense?
cc @snowp for second opinion
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Envoy level stream object (
ConnectionManagerImpl::ActiveStream
is complete) and so the filter manager also thinks the stream is complete, but the codec doesn't as you said as it has yet to callonStreamClose
.AFAIK, the only entities that'll try to reset the stream at this point is either the
reset_high_memory_stream
action or theOnPendingFlushTimer
though in the latter case the codec will know the stream has ended.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there's also cases in envoy mobile where we've had a reset-after-fin (Snow has reviewed) so again I'd rather make this a bit more resistant
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After some more offline discussion I'd suggested using local_end_stream_ rather than adding yet another way to communicate a stream is done. @KBaichoo thinks that's sketchy because of interplay between local_end_stream_ and the deferred reset code.
Pulling in @yanavlasov and @antoniovicente to comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
interplay between local_end_stream and
runResetCallbacks(reason)
I think can be sketchy (sorry if I didn't convey that well)There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change feels brittle.
I think we need to dig deeper and make object lifetimes more clear. We need to eliminate all cases where an object remains registered after deletion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah that's my thought too. @KBaichoo if we're in a hurry to get this landed I'd be OK merging with a TODO to fix if you can help resolve the real lifetime issues right after, but if we're not in a rush I think we want to sort out the lifetime issues now in a less fragile way.
/wait
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the pushback. I've dug deeper and have done the following:
doDeferredStreamDestroy
access the response encoder (codec stream) and remove the envoy level stream from callbacks, and notify the codec level stream it's done.