Skip to content

Commit

Permalink
quic: Use GRO for reading packets from the UDP socket in upstream QUIC (
Browse files Browse the repository at this point in the history
#32628)

This behavior is guarded by the runtime flag `envoy.restart_features.prefer_udp_gro`.

We previously turned off GRO in `EnvoyQuicClientConnection` because we weren't sure about the performance (see
#19088). However, kernel experts have recommended using GRO over recvmmsg as a much more CPU efficient solution for reading multiple UDP packets into a single buffer.

This change also enables setting the option of whether `recvmmsg` should be used or not, so we can
disable it for client connections, while keeping it for listener sockets.

Risk Level: low (can be turned off with the runtime guard)
Testing: unit tests
Release Notes: included
Runtime guard: envoy.reloadable_features.prefer_udp_gro

Signed-off-by: Ali Beyad <abeyad@google.com>
  • Loading branch information
abeyad authored Mar 12, 2024
1 parent 55ea014 commit 10a1003
Show file tree
Hide file tree
Showing 23 changed files with 361 additions and 106 deletions.
11 changes: 11 additions & 0 deletions changelogs/current.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -68,6 +68,17 @@ minor_behavior_changes:
change: |
Simplifies integration with the codec by removing translation between nghttp2 callbacks and Http2VisitorInterface events.
Guarded by ``envoy.reloadable_features.http2_skip_callback_visitor``.
- area: http3
change: |
Use GRO (Generic Receive Offload) for reading packets from a client QUIC UDP socket. See
https://www.kernel.org/doc/html/next/networking/segmentation-offloads.html for a description of
GRO. This behavior change can be reverted by setting
``envoy.reloadable_features.prefer_quic_client_udp_gro`` to ``false``.
- area: http3
change: |
Disables recvmmsg (multi-message) for reading packets from a client QUIC UDP socket, if GRO
is not set or not supported. recvmsg will be used instead. This behavior change can be
reverted by setting ``envoy.reloadable_features.disallow_quic_client_udp_mmsg`` to ``false``.
bug_fixes:
# *Changes expected to improve the state of the world and are unlikely to have negative effects*
Expand Down
2 changes: 1 addition & 1 deletion source/common/network/udp_listener_impl.cc
Original file line number Diff line number Diff line change
Expand Up @@ -101,7 +101,7 @@ void UdpListenerImpl::handleReadCallback() {
cb_.onReadReady();
const Api::IoErrorPtr result = Utility::readPacketsFromSocket(
socket_->ioHandle(), *socket_->connectionInfoProvider().localAddress(), *this, time_source_,
config_.prefer_gro_, packets_dropped_);
config_.prefer_gro_, /*allow_mmsg=*/true, packets_dropped_);
if (result == nullptr) {
// No error. The number of reads was limited by read rate. There are more packets to read.
// Register to read more in the next event loop.
Expand Down
52 changes: 35 additions & 17 deletions source/common/network/utility.cc
Original file line number Diff line number Diff line change
Expand Up @@ -598,13 +598,13 @@ void passPayloadToProcessor(uint64_t bytes_read, Buffer::InstancePtr buffer,
std::move(buffer), receive_time);
}

Api::IoCallUint64Result Utility::readFromSocket(IoHandle& handle,
const Address::Instance& local_address,
UdpPacketProcessor& udp_packet_processor,
MonotonicTime receive_time, bool use_gro,
uint32_t* packets_dropped) {

if (use_gro) {
Api::IoCallUint64Result
Utility::readFromSocket(IoHandle& handle, const Address::Instance& local_address,
UdpPacketProcessor& udp_packet_processor, MonotonicTime receive_time,
UdpRecvMsgMethod recv_msg_method, uint32_t* packets_dropped) {
if (recv_msg_method == UdpRecvMsgMethod::RecvMsgWithGro) {
ASSERT(Api::OsSysCallsSingleton::get().supportsUdpGro(),
"cannot use GRO when the platform doesn't support it.");
Buffer::InstancePtr buffer = std::make_unique<Buffer::OwnedImpl>();
IoHandle::RecvMsgOutput output(1, packets_dropped);

Expand Down Expand Up @@ -646,7 +646,9 @@ Api::IoCallUint64Result Utility::readFromSocket(IoHandle& handle,
return result;
}

if (handle.supportsMmsg()) {
if (recv_msg_method == UdpRecvMsgMethod::RecvMmsg) {
ASSERT(Api::OsSysCallsSingleton::get().supportsMmsg(),
"cannot use recvmmsg when the platform doesn't support it.");
const auto max_rx_datagram_size = udp_packet_processor.maxDatagramSize();

// Buffer::ReservationSingleSlice is always passed by value, and can only be constructed
Expand Down Expand Up @@ -720,24 +722,40 @@ Api::IoCallUint64Result Utility::readFromSocket(IoHandle& handle,
Api::IoErrorPtr Utility::readPacketsFromSocket(IoHandle& handle,
const Address::Instance& local_address,
UdpPacketProcessor& udp_packet_processor,
TimeSource& time_source, bool prefer_gro,
uint32_t& packets_dropped) {
TimeSource& time_source, bool allow_gro,
bool allow_mmsg, uint32_t& packets_dropped) {
UdpRecvMsgMethod recv_msg_method = UdpRecvMsgMethod::RecvMsg;
if (allow_gro && handle.supportsUdpGro()) {
recv_msg_method = UdpRecvMsgMethod::RecvMsgWithGro;
} else if (allow_mmsg && handle.supportsMmsg()) {
recv_msg_method = UdpRecvMsgMethod::RecvMmsg;
}

// Read at least one time, and attempt to read numPacketsExpectedPerEventLoop() packets unless
// this goes over MAX_NUM_PACKETS_PER_EVENT_LOOP.
size_t num_packets_to_read = std::min<size_t>(
MAX_NUM_PACKETS_PER_EVENT_LOOP, udp_packet_processor.numPacketsExpectedPerEventLoop());
const bool use_gro = prefer_gro && handle.supportsUdpGro();
size_t num_reads =
use_gro ? (num_packets_to_read / NUM_DATAGRAMS_PER_RECEIVE)
: (handle.supportsMmsg() ? (num_packets_to_read / NUM_DATAGRAMS_PER_RECEIVE)
: num_packets_to_read);
size_t num_reads;
switch (recv_msg_method) {
case UdpRecvMsgMethod::RecvMsgWithGro:
num_reads = (num_packets_to_read / NUM_DATAGRAMS_PER_RECEIVE);
break;
case UdpRecvMsgMethod::RecvMmsg:
num_reads = (num_packets_to_read / NUM_DATAGRAMS_PER_RECEIVE);
break;
case UdpRecvMsgMethod::RecvMsg:
num_reads = num_packets_to_read;
break;
}
// Make sure to read at least once.
num_reads = std::max<size_t>(1, num_reads);

do {
const uint32_t old_packets_dropped = packets_dropped;
const MonotonicTime receive_time = time_source.monotonicTime();
Api::IoCallUint64Result result = Utility::readFromSocket(
handle, local_address, udp_packet_processor, receive_time, use_gro, &packets_dropped);
Api::IoCallUint64Result result =
Utility::readFromSocket(handle, local_address, udp_packet_processor, receive_time,
recv_msg_method, &packets_dropped);

if (!result.ok()) {
// No more to read or encountered a system error.
Expand Down
33 changes: 24 additions & 9 deletions source/common/network/utility.h
Original file line number Diff line number Diff line change
Expand Up @@ -85,6 +85,17 @@ struct ResolvedUdpSocketConfig {
bool prefer_gro_;
};

// The different options for receiving UDP packet(s) from system calls.
enum class UdpRecvMsgMethod {
// The `recvmsg` system call.
RecvMsg,
// The `recvmsg` system call using GRO (generic receive offload). This is the preferred method,
// if the platform supports it.
RecvMsgWithGro,
// The `recvmmsg` system call.
RecvMmsg,
};

/**
* Common network utility routines.
*/
Expand Down Expand Up @@ -352,15 +363,15 @@ class Utility {
* @param udp_packet_processor is the callback to receive the packet.
* @param receive_time is the timestamp passed to udp_packet_processor for the
* receive time of the packet.
* @param prefer_gro supplies whether to use GRO if the OS supports it.
* @param recv_msg_method the type of system call and socket options combination to use when
* receiving packets from the kernel.
* @param packets_dropped is the output parameter for number of packets dropped in kernel. If the
* caller is not interested in it, nullptr can be passed in.
*/
static Api::IoCallUint64Result readFromSocket(IoHandle& handle,
const Address::Instance& local_address,
UdpPacketProcessor& udp_packet_processor,
MonotonicTime receive_time, bool use_gro,
uint32_t* packets_dropped);
static Api::IoCallUint64Result
readFromSocket(IoHandle& handle, const Address::Instance& local_address,
UdpPacketProcessor& udp_packet_processor, MonotonicTime receive_time,
UdpRecvMsgMethod recv_msg_method, uint32_t* packets_dropped);

/**
* Read some packets from a given UDP socket and pass the packet to a given
Expand All @@ -369,7 +380,11 @@ class Utility {
* @param local_address is the socket's local address used to populate port.
* @param udp_packet_processor is the callback to receive the packets.
* @param time_source is the time source used to generate the time stamp of the received packets.
* @param prefer_gro supplies whether to use GRO if the OS supports it.
* @param allow_gro whether to use GRO, iff the platform supports it. This function will check
* the IoHandle to ensure the platform supports GRO before using it.
* @param allow_mmsg whether to use recvmmsg, iff the platform supports it. This function will
* check the IoHandle to ensure the platform supports recvmmsg before using it. If `allow_gro` is
* true and the platform supports GRO, then it will take precedence over using recvmmsg.
* @param packets_dropped is the output parameter for number of packets dropped in kernel.
* Return the io error encountered or nullptr if no io error but read stopped
* because of MAX_NUM_PACKETS_PER_EVENT_LOOP.
Expand All @@ -384,8 +399,8 @@ class Utility {
static Api::IoErrorPtr readPacketsFromSocket(IoHandle& handle,
const Address::Instance& local_address,
UdpPacketProcessor& udp_packet_processor,
TimeSource& time_source, bool prefer_gro,
uint32_t& packets_dropped);
TimeSource& time_source, bool allow_gro,
bool allow_mmsg, uint32_t& packets_dropped);

private:
static void throwWithMalformedIp(absl::string_view ip_address);
Expand Down
2 changes: 2 additions & 0 deletions source/common/quic/BUILD
Original file line number Diff line number Diff line change
Expand Up @@ -180,6 +180,7 @@ envoy_cc_library(
"//envoy/http:codec_interface",
"//envoy/http:persistent_quic_info_interface",
"//envoy/registry",
"//source/common/runtime:runtime_lib",
"//source/common/tls:ssl_socket_lib",
"//source/extensions/quic/crypto_stream:envoy_quic_crypto_client_stream_lib",
"@com_github_google_quiche//:quic_core_http_spdy_session_lib",
Expand Down Expand Up @@ -362,6 +363,7 @@ envoy_cc_library(
"//envoy/event:dispatcher_interface",
"//source/common/network:socket_option_factory_lib",
"//source/common/network:udp_packet_writer_handler_lib",
"//source/common/runtime:runtime_lib",
"@com_github_google_quiche//:quic_core_connection_lib",
"@envoy_api//envoy/config/core/v3:pkg_cc_proto",
],
Expand Down
5 changes: 4 additions & 1 deletion source/common/quic/client_connection_factory_impl.cc
Original file line number Diff line number Diff line change
@@ -1,5 +1,7 @@
#include "source/common/quic/client_connection_factory_impl.h"

#include "source/common/runtime/runtime_features.h"

namespace Envoy {
namespace Quic {

Expand Down Expand Up @@ -45,7 +47,8 @@ std::unique_ptr<Network::ClientConnection> createQuicNetworkConnection(
ASSERT(!quic_versions.empty());
auto connection = std::make_unique<EnvoyQuicClientConnection>(
quic::QuicUtils::CreateRandomConnectionId(), server_addr, info_impl->conn_helper_,
info_impl->alarm_factory_, quic_versions, local_addr, dispatcher, options, generator);
info_impl->alarm_factory_, quic_versions, local_addr, dispatcher, options, generator,
Runtime::runtimeFeatureEnabled("envoy.reloadable_features.prefer_quic_client_udp_gro"));

// TODO (danzh) move this temporary config and initial RTT configuration to h3 pool.
quic::QuicConfig config = info_impl->quic_config_;
Expand Down
20 changes: 12 additions & 8 deletions source/common/quic/envoy_quic_client_connection.cc
Original file line number Diff line number Diff line change
Expand Up @@ -7,6 +7,7 @@
#include "source/common/network/socket_option_factory.h"
#include "source/common/network/udp_packet_writer_handler_impl.h"
#include "source/common/quic/envoy_quic_utils.h"
#include "source/common/runtime/runtime_features.h"

namespace Envoy {
namespace Quic {
Expand All @@ -30,35 +31,38 @@ EnvoyQuicClientConnection::EnvoyQuicClientConnection(
const quic::ParsedQuicVersionVector& supported_versions,
Network::Address::InstanceConstSharedPtr local_addr, Event::Dispatcher& dispatcher,
const Network::ConnectionSocket::OptionsSharedPtr& options,
quic::ConnectionIdGeneratorInterface& generator)
quic::ConnectionIdGeneratorInterface& generator, const bool prefer_gro)
: EnvoyQuicClientConnection(
server_connection_id, helper, alarm_factory, supported_versions, dispatcher,
createConnectionSocket(initial_peer_address, local_addr, options), generator) {}
createConnectionSocket(initial_peer_address, local_addr, options, prefer_gro), generator,
prefer_gro) {}

EnvoyQuicClientConnection::EnvoyQuicClientConnection(
const quic::QuicConnectionId& server_connection_id, quic::QuicConnectionHelperInterface& helper,
quic::QuicAlarmFactory& alarm_factory, const quic::ParsedQuicVersionVector& supported_versions,
Event::Dispatcher& dispatcher, Network::ConnectionSocketPtr&& connection_socket,
quic::ConnectionIdGeneratorInterface& generator)
quic::ConnectionIdGeneratorInterface& generator, const bool prefer_gro)
: EnvoyQuicClientConnection(
server_connection_id, helper, alarm_factory,
new EnvoyQuicPacketWriter(
std::make_unique<Network::UdpDefaultWriter>(connection_socket->ioHandle())),
/*owns_writer=*/true, supported_versions, dispatcher, std::move(connection_socket),
generator) {}
generator, prefer_gro) {}

EnvoyQuicClientConnection::EnvoyQuicClientConnection(
const quic::QuicConnectionId& server_connection_id, quic::QuicConnectionHelperInterface& helper,
quic::QuicAlarmFactory& alarm_factory, quic::QuicPacketWriter* writer, bool owns_writer,
const quic::ParsedQuicVersionVector& supported_versions, Event::Dispatcher& dispatcher,
Network::ConnectionSocketPtr&& connection_socket,
quic::ConnectionIdGeneratorInterface& generator)
quic::ConnectionIdGeneratorInterface& generator, const bool prefer_gro)
: quic::QuicConnection(server_connection_id, quic::QuicSocketAddress(),
envoyIpAddressToQuicSocketAddress(
connection_socket->connectionInfoProvider().remoteAddress()->ip()),
&helper, &alarm_factory, writer, owns_writer,
quic::Perspective::IS_CLIENT, supported_versions, generator),
QuicNetworkConnection(std::move(connection_socket)), dispatcher_(dispatcher) {}
QuicNetworkConnection(std::move(connection_socket)), dispatcher_(dispatcher),
prefer_gro_(prefer_gro), disallow_mmsg_(Runtime::runtimeFeatureEnabled(
"envoy.reloadable_features.disallow_quic_client_udp_mmsg")) {}

void EnvoyQuicClientConnection::processPacket(
Network::Address::InstanceConstSharedPtr local_address,
Expand Down Expand Up @@ -175,7 +179,7 @@ void EnvoyQuicClientConnection::probeWithNewPort(const quic::QuicSocketAddress&
// The probing socket will have the same host but a different port.
auto probing_socket =
createConnectionSocket(connectionSocket()->connectionInfoProvider().remoteAddress(),
new_local_address, connectionSocket()->options());
new_local_address, connectionSocket()->options(), prefer_gro_);
setUpConnectionSocket(*probing_socket, delegate_);
auto writer = std::make_unique<EnvoyQuicPacketWriter>(
std::make_unique<Network::UdpDefaultWriter>(probing_socket->ioHandle()));
Expand Down Expand Up @@ -249,7 +253,7 @@ void EnvoyQuicClientConnection::onFileEvent(uint32_t events,
if (connected() && (events & Event::FileReadyType::Read)) {
Api::IoErrorPtr err = Network::Utility::readPacketsFromSocket(
connection_socket.ioHandle(), *connection_socket.connectionInfoProvider().localAddress(),
*this, dispatcher_.timeSource(), /*prefer_gro=*/false, packets_dropped_);
*this, dispatcher_.timeSource(), prefer_gro_, !disallow_mmsg_, packets_dropped_);
if (err == nullptr) {
// In the case where the path validation fails, the probing socket will be closed and its IO
// events are no longer interesting.
Expand Down
8 changes: 5 additions & 3 deletions source/common/quic/envoy_quic_client_connection.h
Original file line number Diff line number Diff line change
Expand Up @@ -60,7 +60,7 @@ class EnvoyQuicClientConnection : public quic::QuicConnection,
Network::Address::InstanceConstSharedPtr local_addr,
Event::Dispatcher& dispatcher,
const Network::ConnectionSocket::OptionsSharedPtr& options,
quic::ConnectionIdGeneratorInterface& generator);
quic::ConnectionIdGeneratorInterface& generator, bool prefer_gro);

EnvoyQuicClientConnection(const quic::QuicConnectionId& server_connection_id,
quic::QuicConnectionHelperInterface& helper,
Expand All @@ -69,7 +69,7 @@ class EnvoyQuicClientConnection : public quic::QuicConnection,
const quic::ParsedQuicVersionVector& supported_versions,
Event::Dispatcher& dispatcher,
Network::ConnectionSocketPtr&& connection_socket,
quic::ConnectionIdGeneratorInterface& generator);
quic::ConnectionIdGeneratorInterface& generator, bool prefer_gro);

// Network::UdpPacketProcessor
void processPacket(Network::Address::InstanceConstSharedPtr local_address,
Expand Down Expand Up @@ -135,7 +135,7 @@ class EnvoyQuicClientConnection : public quic::QuicConnection,
const quic::ParsedQuicVersionVector& supported_versions,
Event::Dispatcher& dispatcher,
Network::ConnectionSocketPtr&& connection_socket,
quic::ConnectionIdGeneratorInterface& generator);
quic::ConnectionIdGeneratorInterface& generator, bool prefer_gro);

void onFileEvent(uint32_t events, Network::ConnectionSocket& connection_socket);

Expand All @@ -150,6 +150,8 @@ class EnvoyQuicClientConnection : public quic::QuicConnection,
bool migrate_port_on_path_degrading_{false};
uint8_t num_socket_switches_{0};
size_t num_packets_with_unknown_dst_address_{0};
const bool prefer_gro_;
const bool disallow_mmsg_;
};

} // namespace Quic
Expand Down
7 changes: 6 additions & 1 deletion source/common/quic/envoy_quic_utils.cc
Original file line number Diff line number Diff line change
Expand Up @@ -5,6 +5,7 @@
#include "envoy/common/platform.h"
#include "envoy/config/core/v3/base.pb.h"

#include "source/common/api/os_sys_calls_impl.h"
#include "source/common/http/utility.h"
#include "source/common/network/socket_option_factory.h"
#include "source/common/network/utility.h"
Expand Down Expand Up @@ -137,7 +138,8 @@ Http::StreamResetReason quicErrorCodeToEnvoyRemoteResetReason(quic::QuicErrorCod
Network::ConnectionSocketPtr
createConnectionSocket(const Network::Address::InstanceConstSharedPtr& peer_addr,
Network::Address::InstanceConstSharedPtr& local_addr,
const Network::ConnectionSocket::OptionsSharedPtr& options) {
const Network::ConnectionSocket::OptionsSharedPtr& options,
const bool prefer_gro) {
if (local_addr == nullptr) {
local_addr = Network::Utility::getLocalAddress(peer_addr->ip()->version());
}
Expand All @@ -149,6 +151,9 @@ createConnectionSocket(const Network::Address::InstanceConstSharedPtr& peer_addr
}
connection_socket->addOptions(Network::SocketOptionFactory::buildIpPacketInfoOptions());
connection_socket->addOptions(Network::SocketOptionFactory::buildRxQueueOverFlowOptions());
if (prefer_gro && Api::OsSysCallsSingleton::get().supportsUdpGro()) {
connection_socket->addOptions(Network::SocketOptionFactory::buildUdpGroOptions());
}
if (options != nullptr) {
connection_socket->addOptions(options);
}
Expand Down
3 changes: 2 additions & 1 deletion source/common/quic/envoy_quic_utils.h
Original file line number Diff line number Diff line change
Expand Up @@ -161,7 +161,8 @@ Http::StreamResetReason quicErrorCodeToEnvoyRemoteResetReason(quic::QuicErrorCod
Network::ConnectionSocketPtr
createConnectionSocket(const Network::Address::InstanceConstSharedPtr& peer_addr,
Network::Address::InstanceConstSharedPtr& local_addr,
const Network::ConnectionSocket::OptionsSharedPtr& options);
const Network::ConnectionSocket::OptionsSharedPtr& options,
bool prefer_gro = false);

// Convert a cert in string form to X509 object.
// Return nullptr if the bytes passed cannot be passed.
Expand Down
Loading

0 comments on commit 10a1003

Please sign in to comment.