Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Set specific keepalive options by default on supported platforms #59278

Merged
merged 18 commits into from
Jul 27, 2020

Conversation

ywelsch
Copy link
Contributor

@ywelsch ywelsch commented Jul 9, 2020

keepalives tell any intermediate devices that the connection remains alive, which helps with overzealous firewalls that are killing idle connections. keepalives are enabled by default in Elasticsearch, but use system defaults for their configuration, which often times do not have reasonable defaults (e.g. 7200s for TCP_KEEP_IDLE) in the context of distributed systems such as Elasticsearch.

This PR sets the socket-level keep_alive options for transport.tcp.{keep_idle,keep_interval} to 5 minutes on configurations that support it (>= Java 11 & (MacOS || Linux)) and where the system defaults are set to something higher than 5 minutes. This helps keep the connections alive while not interfering with system defaults or user-specified settings unless they are deemed to be set too high by providing better out-of-the-box defaults.

@ywelsch ywelsch added >enhancement :Distributed/Network Http and internode communication implementations v8.0.0 v7.9.0 labels Jul 9, 2020
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-distributed (:Distributed/Network)

@elasticmachine elasticmachine added the Team:Distributed Meta label for distributed team label Jul 9, 2020
Copy link
Contributor

@DaveCTurner DaveCTurner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 from me, I left some small comments.

docs/reference/modules/transport.asciidoc Outdated Show resolved Hide resolved
docs/reference/modules/transport.asciidoc Outdated Show resolved Hide resolved
docs/reference/modules/transport.asciidoc Outdated Show resolved Hide resolved
@ywelsch
Copy link
Contributor Author

ywelsch commented Jul 20, 2020

I've reworked this with a new approach (see 29b18d5), which makes it clearer that the main goal here is to keep the connections alive, while not interfering with system defaults or user-specified settings unless they are deemed to be set too high. I currently have a slight preference for this, but we can certainly do either of them. Let me know what you think @DaveCTurner

@ywelsch ywelsch requested a review from DaveCTurner July 21, 2020 11:59
@ywelsch ywelsch added v7.10.0 and removed v7.9.0 labels Jul 22, 2020
@ywelsch ywelsch requested a review from Tim-Brooks July 22, 2020 16:07
@Tim-Brooks
Copy link
Contributor

I will take a look at this tomorrow morning.

Copy link
Contributor

@DaveCTurner DaveCTurner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The docs are pretty convoluted, I think we'll get questions about them. Suggested tidy-up:

diff --git a/docs/reference/modules/transport.asciidoc b/docs/reference/modules/transport.asciidoc
index e2cd92a9bdc..c33a2b7d9e7 100644
--- a/docs/reference/modules/transport.asciidoc
+++ b/docs/reference/modules/transport.asciidoc
@@ -79,36 +79,30 @@ this node connects to other nodes in the cluster.
 The following parameters can be configured on each transport profile, as in the
 example above:
 
-* `port`: The port to bind to
-* `bind_host`: The host to bind
-* `publish_host`: The host which is published in informational APIs
-* `tcp.no_delay`: Configures the `TCP_NO_DELAY` option for this socket
-* `tcp.keep_alive`: Configures the `SO_KEEPALIVE` option for this socket
+* `port`: The port to which to bind.
+* `bind_host`: The host to which to bind.
+* `publish_host`: The host which is published in informational APIs.
+* `tcp.no_delay`: Configures the `TCP_NO_DELAY` option for this socket.
+* `tcp.keep_alive`: Configures the `SO_KEEPALIVE` option for this socket, which
+   determines whether it sends TCP keepalive probes.
 * `tcp.keep_idle`: Configures the `TCP_KEEPIDLE` option for this socket, which
    determines the time in seconds that a connection must be idle before
-   starting to send TCP keepalive probes. A value of `-1` means not to set
-   this option at the socket level but to use the system default instead.
-   Only applicable on Linux and Mac, and requires Java 11 or newer.
-   Defaults to -1. On applicable configurations, this value is set to
-   300 seconds (5 minutes) if the system default or user-configured setting
-   is higher than that. Values above 300 seconds are rejected for this setting.
+   starting to send TCP keepalive probes. Defaults to `-1` which means to use
+   the smaller of 300 or the system default. May not be greater than 300. Only
+   applicable on Linux and macOS, and requires Java 11 or newer.
 * `tcp.keep_interval`: Configures the `TCP_KEEPINTVL` option for this socket,
    which determines the time in seconds between sending TCP keepalive probes.
-   A value of `-1` means not to set this option at the socket level but to
-   use the system default instead.
-   Only applicable on Linux and Mac, and requires Java 11 or newer.
-   Defaults to -1. On applicable configurations, this value is set to
-   300 seconds (5 minutes) if the system default or user-configured setting
-   is higher than that. Values above 300 seconds are rejected for this setting.
+   Defaults to `-1` which means to use the smaller of 300 or the system
+   default. May not be greater than 300. Only applicable on Linux and macOS,
+   and requires Java 11 or newer.
 * `tcp.keep_count`: Configures the `TCP_KEEPCNT` option for this socket, which
    determines the number of unacknowledged TCP keepalive probes that may be
-   sent on a connection before it is dropped. A value of `-1` means not to set
-   this option at the socket level but to use the system default instead.
-   Only applicable on Linux and Mac, and requires Java 11 or newer.
-   Defaults to -1.
-* `tcp.reuse_address`: Configures the `SO_REUSEADDR` option for this socket
-* `tcp.send_buffer_size`: Configures the send buffer size of the socket
-* `tcp.receive_buffer_size`: Configures the receive buffer size of the socket
+   sent on a connection before it is dropped. Defaults to `-1` which means to
+   use the system default. Only applicable on Linux and macOS, and requires
+   Java 11 or newer.
+* `tcp.reuse_address`: Configures the `SO_REUSEADDR` option for this socket.
+* `tcp.send_buffer_size`: Configures the send buffer size of the socket.
+* `tcp.receive_buffer_size`: Configures the receive buffer size of the socket.
 
 [[long-lived-connections]]
 ===== Long-lived idle connections

@@ -51,9 +51,9 @@
public static final Setting<Boolean> TCP_KEEP_ALIVE =
Setting.boolSetting("network.tcp.keep_alive", true, Property.NodeScope);
public static final Setting<Integer> TCP_KEEP_IDLE =
Setting.intSetting("network.tcp.keep_idle", -1, -1, Property.NodeScope);
Setting.intSetting("network.tcp.keep_idle", -1, -1, 300, Property.NodeScope);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This bit needs a breaking change tag (and docs in the backport)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, done so

Copy link
Contributor

@DaveCTurner DaveCTurner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The implementation itself looks ok to me but I'll let Tim give a better-informed opinion there.

Copy link
Contributor

@Tim-Brooks Tim-Brooks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Implementation looks good to me.

@ywelsch ywelsch requested a review from DaveCTurner July 27, 2020 09:56
Copy link
Contributor

@DaveCTurner DaveCTurner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@ywelsch ywelsch merged commit 2078509 into elastic:master Jul 27, 2020
ywelsch added a commit to ywelsch/elasticsearch that referenced this pull request Jul 28, 2020
…stic#59278)

keepalives tell any intermediate devices that the connection remains alive, which helps with overzealous firewalls that are
killing idle connections. keepalives are enabled by default in Elasticsearch, but use system defaults for their
configuration, which often times do not have reasonable defaults (e.g. 7200s for TCP_KEEP_IDLE) in the context of
distributed systems such as Elasticsearch.

This PR sets the socket-level keep_alive options for network.tcp.{keep_idle,keep_interval} to 5 minutes on configurations
that support it (>= Java 11 & (MacOS || Linux)) and where the system defaults are set to something higher than 5
minutes. This helps keep the connections alive while not interfering with system defaults or user-specified settings
unless they are deemed to be set too high by providing better out-of-the-box defaults.
ywelsch added a commit that referenced this pull request Jul 28, 2020
)

keepalives tell any intermediate devices that the connection remains alive, which helps with overzealous firewalls that are
killing idle connections. keepalives are enabled by default in Elasticsearch, but use system defaults for their
configuration, which often times do not have reasonable defaults (e.g. 7200s for TCP_KEEP_IDLE) in the context of
distributed systems such as Elasticsearch.

This PR sets the socket-level keep_alive options for network.tcp.{keep_idle,keep_interval} to 5 minutes on configurations
that support it (>= Java 11 & (MacOS || Linux)) and where the system defaults are set to something higher than 5
minutes. This helps keep the connections alive while not interfering with system defaults or user-specified settings
unless they are deemed to be set too high by providing better out-of-the-box defaults.
DaveCTurner added a commit to DaveCTurner/elasticsearch that referenced this pull request Feb 11, 2021
The default interval for keepalives today is (mostly) 300s. This value
was chosen to be short enough to keep connections alive even if the
network drops idle connections after 600s, which is the most
unreasonable idle-connection timeout we have seen in practice.

This nice round number happens to coincide with the Linux kernel default
for the apparently-unrelated `nf_conntrack_tcp_timeout_unacknowledged`
timeout. We have observed that in rare cases if the keepalive timer
expires within a few milliseconds of the conntrack unacknowledged timer
then the connection may erroneously be dropped. Although the ultimate
fix for this lies elsewhere, to reduce the probability of spurious
connection drops in affected environments this commit extends the
default keepalive interval to a more unusual 325s, avoiding the
problematic coincidence.

Relates elastic#59278
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>breaking :Distributed/Network Http and internode communication implementations >enhancement Team:Distributed Meta label for distributed team v7.10.0 v8.0.0-alpha1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants