-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
upstream adding QUIC-to-TCP failover #15894
Conversation
Signed-off-by: Alyssa Wilk <alyssar@chromium.org>
CC @envoyproxy/api-shepherds: Your approval is needed for changes made to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Exciting stuff. A left some ideas for discussion. Thank you!
/wait-any
Signed-off-by: Alyssa Wilk <alyssar@chromium.org>
Signed-off-by: Alyssa Wilk <alyssar@chromium.org>
Tsan failure was CI failure, not test fail. This should be ready for a pass! |
@snowp would you have time today to do a first pass? Matt's on call for the release and I was hoping to land this before I take off for a week. |
ugggh, accidentally pushed to the wrong branch. reverted but CI is of course running again (was green, and barring flakes will be green again) |
CI is unrelated flake: 2021-04-15T17:51:08.2680556Z Pulling zipkin (openzipkin/zipkin:)... |
Signed-off-by: Alyssa Wilk <alyssar@chromium.org>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally looks good. There may be a threading issue in the call to maybeCreateFallbackFactory if the factories are not per-worker.
Network::TransportSocketPtr | ||
createTransportSocket(Network::TransportSocketOptionsSharedPtr options) const override { | ||
// TODO(#15649) make createTransportSocket non-const to avoid the const cast. | ||
const_cast<QuicClientTransportSocketFactory*>(this)->maybeCreateFallbackFactory(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are factories per worker or can createTransportSocket be called concurrently by multiple theads? If the later, there's a thread race in this code.
I would consider creating this fallback factory in QuicClientTransportSocketFactory's constructor in order to guarantee that it always exists if createTransportSocket is ever called.
Envoy::Http::ConnectivityGrid::ConnectivityOptions coptions{protocols}; | ||
return std::make_unique<Http::ConnectivityGrid>( | ||
dispatcher, api_.randomGenerator(), host, priority, options, transport_socket_options, | ||
state, source, std::chrono::milliseconds(300), coptions); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Curious, are there plans to make next_attempt_duration configurable in the future?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, I figured I'd make @RyanTheOptimist take that on both as a simple API change, and to reduce bikeshedding on this PR
ASSERT_TRUE(cluster1->info()->idleTimeout().has_value()); | ||
EXPECT_EQ(std::chrono::hours(1), cluster1->info()->idleTimeout().value()); | ||
|
||
const std::string explicit_http3 = R"EOF( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: changing "explicit" to "alpn" or "auto" would be slightly clearer
Signed-off-by: Alyssa Wilk <alyssar@chromium.org>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the helpful review!
I think default-creating the fallback isn't enough for secrets update, but I'm going to leave that as a TODO and fix when I get back. Assuming I manage to get away =P
Envoy::Http::ConnectivityGrid::ConnectivityOptions coptions{protocols}; | ||
return std::make_unique<Http::ConnectivityGrid>( | ||
dispatcher, api_.randomGenerator(), host, priority, options, transport_socket_options, | ||
state, source, std::chrono::milliseconds(300), coptions); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, I figured I'd make @RyanTheOptimist take that on both as a simple API change, and to reduce bikeshedding on this PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the change and enjoy your vacation next week!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm api
Risk Level: n/a (hidden by default) Testing: e2e tests Docs Changes: n/a Release Notes: n/a Part of envoyproxy#14829 Signed-off-by: Xin Zhuang <stevenzzz@google.com>
Risk Level: n/a (hidden by default) Testing: e2e tests Docs Changes: n/a Release Notes: n/a Part of envoyproxy#14829 Signed-off-by: Douglas Reid <douglas-reid@users.noreply.github.com>
Risk Level: n/a (hidden by default) Testing: e2e tests Docs Changes: n/a Release Notes: n/a Part of envoyproxy#14829 Signed-off-by: Gokul Nair <gnair@twitter.com>
Hey Matt, can I get your thoughts on 2 things?
if we're Ok with having a hard-coded something (right now "use_quic") matcher to handle TCP-TLS and QUIC-TLS in the same cluster (or maybe we want to do something to combine the two configs and auto-select based on transport?)
how we want to handle the connection pool being able to select a non-default transportSocket. I have "use_quic" as a placeholder argument but I think if we extend the API the boolean should probably be replaced with an optional metadata. I'd like your thoughts if you have a better solution before I go change the API and all the mocks though.
Because I haven't cleaned up all the mocks so this will not build, but the integration test both provides a proof of concept for failover working, and the annoyance in configuring both TCP-TLS and QUIC-TLS transport sockets.
cc @RyanTheOptimist
Commit Message:
Additional Description:
Risk Level:
Testing:
Docs Changes:
Release Notes:
Part of #14829