Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix a deadlock in TransportSet. #2258

Merged
merged 1 commit into from
Sep 12, 2016
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
7 changes: 6 additions & 1 deletion core/src/main/java/io/grpc/internal/TransportSet.java
Original file line number Diff line number Diff line change
Expand Up @@ -249,9 +249,14 @@ public void run() {
delayedTransport.endBackoff();
boolean shutdownDelayedTransport = false;
Runnable runnable = null;
// TransportSet as a channel layer class should not call into transport methods while
// holding the lock, thus we call hasPendingStreams() outside of the lock. It will cause
// a _benign_ race where the TransportSet may transition to CONNECTING when there is not
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe the race is also pre-existing, since at any time obtainActiveTransport() can return the delayed transport and a new stream be created.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That sounds a real issue. If EndOfCurrentBackoff wins in the race with newStream(), the application will see a spurious shutdown error.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see where the shutdown error would come from because the delayed transport becomes pass-through to the supplier for newStream(), instead of failing new streams after shutdown.

Instead of calling startNewTransport() here, we would instead enter IDLE. But since we call setTransportSupplier with a supplier that calls obtainActiveTransport, then if the race occurs there will just be a call to obtainActiveTransport that will trigger entering CONNECTING.

The only issue is that the stream will wrapped twice with a DelayedStream. But since the race should only happen at most once per stream, we decided that it wasn't a big deal.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right... I lost my touch with delayed transport after not touching it for a while

// pending stream.
boolean hasPendingStreams = delayedTransport.hasPendingStreams();
synchronized (lock) {
reconnectTask = null;
if (delayedTransport.hasPendingStreams()) {
if (hasPendingStreams) {
// Transition directly to CONNECTING
runnable = startNewTransport(delayedTransport);
} else {
Expand Down