-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Envoy is spinning CPU on Windows #13189
Comments
Given that the fact that newer versions of Windows will support edge-based events and that it requires the least amount of changes, I want to start experimenting with approach (1). I got some input from @antoniovicente on the approach and I will start working on it for the beta release. |
It would be good to see if approach (1) works out. It certainly seems the least disruptive. The level trigger mode of operation could also be an option under linux, so testing sockets that operate in this paradigm should be possible on multiple platforms. I hope that the changes can be relatively localized: either Network::Connection and/or IoHandle. Keep me posted on progress. |
Don't we already subscribe to write ready events when we have data to write? |
The high level concept is to optimize the time that we are subscribed to write-ready events. The idea is to only listen for write ready events on specific conditions Like when the previous write call to the socket got blocked or when we have data to the buffer. |
Don't you have the same problem with read events when we don't consume the entire read buffer? |
In the current implementation, the call to readDisable(true) when above high watermark (or after we have a full H1 request message) removes read from the fd registration mask, so we already avoid having spurious read upcalls. Of course there's this change I want to make once we have benchmarks that measure impact. The removal of spurious changes to the fd registraion mask should provide performance gains. master...antoniovicente:optimize_read_disable |
Ah OK, cool. I had assumed we already did that. :) |
In lieu of windows/wepoll support for EPOLLET, I'd suggest to use EPOLLONESHOT instead. EPOLLONESHOT is similar to EPOLLET in the sense that when an event (like EPOLLIN) is reported, it is also automatically disabled. The difference is that in edge triggered mode the event is also automatically turned back on after calling read()/write(), whereas in ONESHOT mode this needs to be done "manually" by envoy. So every time you're "done" reading from/writing to a socket (typically that is when you get a EWOULDBLOCK error, although there can be other reasons to "move on" from interacting with a particular socket), you have to re-enable the EPOLLIN/EPOLLOUT event with It might require a bit of work but it should be straightforward, almost mechanical. W.r.t wepoll performance - this change would not introduce additional overhead. In default (level-triggered) mode wepoll is doing exactly the same thing internally. I'd suggest to to avoid hacks that switch level-triggered events on and off all the time. It's a lot more complicated, and it can make wepoll a bit slower (that depends on the exact epoll_ctl()/epoll_wait() call pattern). |
@piscisaureus thanks a lot for your input! Highly appreciated.
Conceptually this is pretty similar to what we are trying to do. This is good validation for what we are trying to achieve.
Does libevent give you access to modify the flags of the backend? I have to investigate that further. @nigriMSFT do you happen to know?
I am not sure I fully understand what you mean here.
I have implemented this as part of #13787. But I do not see much extra complication. Am I missing something? |
Title: Envoy is spinning CPU on Windows
Description:
In 2019 version of Windows Server there is no support for
level-based
events which causes Envoy to spin CPU to process "empty" write ready events.To resolve this issue we consider the following three approaches:
cc: @envoyproxy/windows-dev
The text was updated successfully, but these errors were encountered: