[SPARK-23182][CORE] Allow enabling TCP keep alive on the RPC connections #20512

peshopetrov · 2018-02-05T17:46:11Z

What changes were proposed in this pull request?

Make it possible for the master to enable TCP keep alive on the RPC connections with clients.

How was this patch tested?

Manually tested.

Added the following:

spark.rpc.io.enableTcpKeepAlive  true

to spark-defaults.conf.

Observed the following on the Spark master:

$ netstat -town | grep 7077
tcp6       0      0 10.240.3.134:7077       10.240.1.25:42851       ESTABLISHED keepalive (6736.50/0/0)
tcp6       0      0 10.240.3.134:44911      10.240.3.134:7077       ESTABLISHED keepalive (4098.68/0/0)
tcp6       0      0 10.240.3.134:7077       10.240.3.134:44911      ESTABLISHED keepalive (4098.68/0/0)

Which proves that the keep alive setting is taking effect.

It's currently possible to enable TCP keep alive on the worker / executor, but is not possible to configure on other RPC connections. It's unclear to me why this could be the case. Keep alive is more important for the master to protect it against suddenly departing workers / executors, thus I think it's very important to have it. Particularly this makes the master resilient in case of using preemptible worker VMs in GCE. GCE has the concept of shutdown scripts, which it doesn't guarantee to execute. So workers often don't get shutdown gracefully and the TCP connections on the master linger as there's nothing to close them. Thus the need of enabling keep alive.

This enables keep-alive on connections besides the master's connections, but that shouldn't cause harm.

…onnections.

jerryshao · 2018-02-08T08:34:36Z

Is it possible that TCP keepalive is disable by kernel, so that your approach cannot be worked? I was thinking if it is better to add application level heartbeat msg to detect lost workers?

peshopetrov · 2018-02-08T10:40:19Z

For completeness it should be possible to enable OS-level TCP keep alives. The client does enable TCP keepalive on its side and it should be possible on the server too.

However, independent of that it perhaps makes sense to also have application level heartbeats because in the JVM it seems it's not possible to tune the timeouts of TCP keepalive.

peshopetrov · 2018-02-15T14:01:39Z

Any update?
We have rolled out our Spark clusters with this change and it seems to be working great. We see no lingering connections on the masters.

vundela · 2018-04-24T14:10:57Z

cc @squito @vanzin
Can you please comment on this PR?

squito · 2018-04-24T15:56:28Z

this is just far enough outside my expertise I don't have an opinion -- but @zsxwing might have some thoughts

srowen · 2018-12-16T23:50:48Z

Is there any downside to enabling keepalive? should it be on by default, or always? seems OK to me.

SparkQA · 2018-12-17T03:42:03Z

Test build #4475 has finished for PR 20512 at commit c5e2d98.

This patch fails Spark unit tests.
This patch does not merge cleanly.
This patch adds no public classes.

peshopetrov · 2018-12-17T09:10:14Z

I don't think there is, but didn't want to change the default behavior.

srowen · 2018-12-17T11:13:57Z

Ignore the test failure for now, I think it's unrelated.
For Spark 3, I think we can make the default 'true' and leave this as a safety valve, if this is probably what people want to set always.
For back-porting to Spark 2.x, we could then open an identical PR that sets it to false.

SparkQA · 2018-12-22T19:53:29Z

Test build #4483 has finished for PR 20512 at commit c5e2d98.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

srowen

I took another look at the code and as you say it's already enabled in TransportClientFactory.createClient, always. It also sets things like TCP_NODELAY. I think at least this should be on by default, or could be? and I am not sure if it's worth another config. I'm thinking of consistency here. @rxin wrote the code for that part in the client a looong time ago.

squito

I took another look at this again. Its worth noting that this change is much broader than what is suggested in the summary -- its changing the setting for every server, eg. driver's rpc server, external shuffle server, etc.

I'm not a real expert here, but I think turning on keep-alive is probably fine. The downside is a few more msgs sent over idle connections to make sure they're alive, and perhaps false positives in cases where one end is just under heavy load.

@srowen you mentioned NODELAY as well -- that seems like a bigger change to me, one I'd investigate more before changing (maybe others have enough experience to say more definitively whether its OK to flip, just saying that I myself would proceed more cautiously on that one).

squito · 2019-01-02T20:45:51Z

common/network-common/src/main/java/org/apache/spark/network/util/TransportConf.java

@@ -172,6 +174,14 @@ public boolean verboseMetrics() {
    return conf.getBoolean(SPARK_NETWORK_VERBOSE_METRICS, false);
  }

+  /**
+   * Whether to enable TCP keep-alive. If true, the TCP keep-alives are enabled, which removes
+   * connections that are idle for too long.


not really accurate -- it sends a msg when the connection is idle, but if the connection is still up then it leaves the connection up. Its only if the keep alive msg doesnt' get ack'ed that the connection is removed. I'd either say "... to detect broken connections" OR just leave it at "whether to enable TCP keep-alive", as its perhaps best explained by easily searchable external references.

rxin · 2019-01-02T21:15:05Z

I pinged some people offline who know more about networking to take a look at this. It might take a while before they can due to holiday schedule though.

coderplay · 2019-01-08T05:13:41Z

Clarification: I am not a spark expert, just got an invitation from @rxin because he think I have some knowledge about linux TCP.

Generally, the patch looks good to me for the purpose of preventing inactivity from disconnecting the channel. But from the diff, looks like this commit will impact other RPCs or shuffling transport as well. It's not only for master RPCs as the title declared.

Min

srowen · 2019-01-08T05:37:17Z

I see, @peshopetrov , could you maybe narrow the scope of the change to only affect master RPCs? that seems OK, and seems like your intent.

peshopetrov · 2019-01-08T22:13:04Z

TCP keepalive will be disabled unless explicitly set per transport type. E.g.:

spark.rpc.io.enableTcpKeepAlive      true
spark.shuffle.io.enableTcpKeepAlive  true

We actually set both, but if only spark.rpc.io.enableTcpKeepAlive is set then it only applies for RPCs, or am I missing something?

srowen · 2019-01-09T01:16:41Z

@peshopetrov it looks like the TransportServer is used in two implementations of the shuffle service as well as the block transfer service, in addition to the RPC server. I think that's what people are getting at.

I don't see an objection to setting keep-alive on all of these connections, so one resolution is just to change the JIRA/PR to reflect the fact that it's going to affect more than RPCs.

Another resolution is to add new arguments or configs to make only the RPC server enable this new setting. That could be fine too.

I personally would favor the first option; just update the description. I don't see an argument that there's substantial downside to enabling this for all of these server connections.

rxin · 2019-01-09T01:36:03Z

If it is almost never the case that this change would bring benefits, why would we want to merge the option in?

…

On Jan 8, 2019, at 5:16 PM, Sean Owen ***@***.***> wrote: @peshopetrov it looks like the TransportServer is used in two implementations of the shuffle service as well as the block transfer service, in addition to the RPC server. I think that's what people are getting at. I don't see an objection to setting keep-alive on all of these connections, so one resolution is just to change the JIRA/PR to reflect the fact that it's going to affect more than RPCs. Another resolution is to add new arguments or configs to make only the RPC server enable this new setting. That could be fine too. I personally would favor the first option; just update the description. I don't see an argument that there's substantial downside to enabling this for all of these server connections. — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub, or mute the thread.

srowen · 2019-01-09T01:39:37Z

My understanding is that there is a benefit for the master's RPC connections, as described in the PR description. The benefit or cost for other server connections is unclear; probably no harm other than a few extra messages on slow connection; probably no real upside.

The argument for this change as-is is that it addresses the RPC master connection and just applies the setting consistently elsewhere, where it's probably neither positive nor negative.

There is also an argument for making this change a little more complex to affect only the master RPCs, as described in the PR currently.

I could live with either one.

srowen · 2019-01-12T15:31:14Z

@rxin just to finish this off, which argument is more compelling to you? I tend to favor making this change, but could live with a variation that is narrower too.

coderplay · 2019-01-12T18:10:56Z

TCP keep alive is non-invasive, the only minor downside is that it would generate extra network packets. But thinking about spark's big data exchange use cases, these packets are negligible. I am good with this one config ruling all transport types, just please change the title/commit messages.

rxin · 2019-01-12T21:31:13Z

Let’s go ahead and do it then!

…

On Sat, Jan 12, 2019 at 10:11 AM Min Zhou ***@***.***> wrote: TCP keep alive is non-invasive, the only minor downside is that it would generate extra network packets. But thinking about spark's big data exchange use cases, these packets are negligible. I am good with this cone config ruling all transport types, just please change the title/commit messages. — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#20512 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AATvPK3gL73BnB_rjCzVdWXB-zZ31yU6ks5vCiU0gaJpZM4R51bN> .

SparkQA · 2019-01-13T01:35:48Z

Test build #4506 has finished for PR 20512 at commit c5e2d98.

This patch fails PySpark unit tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2019-01-13T18:53:30Z

Test build #4509 has finished for PR 20512 at commit c5e2d98.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

srowen · 2019-01-13T19:39:37Z

Merged to master

## What changes were proposed in this pull request? Make it possible for the master to enable TCP keep alive on the RPC connections with clients. ## How was this patch tested? Manually tested. Added the following: ``` spark.rpc.io.enableTcpKeepAlive true ``` to spark-defaults.conf. Observed the following on the Spark master: ``` $ netstat -town | grep 7077 tcp6 0 0 10.240.3.134:7077 10.240.1.25:42851 ESTABLISHED keepalive (6736.50/0/0) tcp6 0 0 10.240.3.134:44911 10.240.3.134:7077 ESTABLISHED keepalive (4098.68/0/0) tcp6 0 0 10.240.3.134:7077 10.240.3.134:44911 ESTABLISHED keepalive (4098.68/0/0) ``` Which proves that the keep alive setting is taking effect. It's currently possible to enable TCP keep alive on the worker / executor, but is not possible to configure on other RPC connections. It's unclear to me why this could be the case. Keep alive is more important for the master to protect it against suddenly departing workers / executors, thus I think it's very important to have it. Particularly this makes the master resilient in case of using preemptible worker VMs in GCE. GCE has the concept of shutdown scripts, which it doesn't guarantee to execute. So workers often don't get shutdown gracefully and the TCP connections on the master linger as there's nothing to close them. Thus the need of enabling keep alive. This enables keep-alive on connections besides the master's connections, but that shouldn't cause harm. Closes apache#20512 from peshopetrov/master. Authored-by: Petar Petrov <petar.petrov@leanplum.com> Signed-off-by: Sean Owen <sean.owen@databricks.com>

[SPARK-23182][CORE] Allow enabling TCP keep alive on the master RPC c…

c5e2d98

…onnections.

srowen reviewed Dec 27, 2018

View reviewed changes

squito reviewed Jan 2, 2019

View reviewed changes

srowen changed the title ~~[SPARK-23182][CORE] Allow enabling TCP keep alive on the master RPC connections.~~ [SPARK-23182][CORE] Allow enabling TCP keep alive on the RPC connections Jan 12, 2019

srowen closed this in c01152d Jan 13, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[SPARK-23182][CORE] Allow enabling TCP keep alive on the RPC connections #20512

[SPARK-23182][CORE] Allow enabling TCP keep alive on the RPC connections #20512

peshopetrov commented Feb 5, 2018 •

edited by srowen

Loading

jerryshao commented Feb 8, 2018

peshopetrov commented Feb 8, 2018

peshopetrov commented Feb 15, 2018

vundela commented Apr 24, 2018

squito commented Apr 24, 2018

srowen commented Dec 16, 2018

SparkQA commented Dec 17, 2018

peshopetrov commented Dec 17, 2018

srowen commented Dec 17, 2018

SparkQA commented Dec 22, 2018

srowen left a comment

squito left a comment

squito Jan 2, 2019

rxin commented Jan 2, 2019

coderplay commented Jan 8, 2019 •

edited

Loading

srowen commented Jan 8, 2019

peshopetrov commented Jan 8, 2019 •

edited

Loading

srowen commented Jan 9, 2019

rxin commented Jan 9, 2019 via email

srowen commented Jan 9, 2019

srowen commented Jan 12, 2019

coderplay commented Jan 12, 2019 •

edited

Loading

rxin commented Jan 12, 2019 via email

SparkQA commented Jan 13, 2019

SparkQA commented Jan 13, 2019

srowen commented Jan 13, 2019

[SPARK-23182][CORE] Allow enabling TCP keep alive on the RPC connections #20512

[SPARK-23182][CORE] Allow enabling TCP keep alive on the RPC connections #20512

Conversation

peshopetrov commented Feb 5, 2018 • edited by srowen Loading

What changes were proposed in this pull request?

How was this patch tested?

jerryshao commented Feb 8, 2018

peshopetrov commented Feb 8, 2018

peshopetrov commented Feb 15, 2018

vundela commented Apr 24, 2018

squito commented Apr 24, 2018

srowen commented Dec 16, 2018

SparkQA commented Dec 17, 2018

peshopetrov commented Dec 17, 2018

srowen commented Dec 17, 2018

SparkQA commented Dec 22, 2018

srowen left a comment

Choose a reason for hiding this comment

squito left a comment

Choose a reason for hiding this comment

squito Jan 2, 2019

Choose a reason for hiding this comment

rxin commented Jan 2, 2019

coderplay commented Jan 8, 2019 • edited Loading

srowen commented Jan 8, 2019

peshopetrov commented Jan 8, 2019 • edited Loading

srowen commented Jan 9, 2019

rxin commented Jan 9, 2019 via email

srowen commented Jan 9, 2019

srowen commented Jan 12, 2019

coderplay commented Jan 12, 2019 • edited Loading

rxin commented Jan 12, 2019 via email

SparkQA commented Jan 13, 2019

SparkQA commented Jan 13, 2019

srowen commented Jan 13, 2019

peshopetrov commented Feb 5, 2018 •

edited by srowen

Loading

coderplay commented Jan 8, 2019 •

edited

Loading

peshopetrov commented Jan 8, 2019 •

edited

Loading

coderplay commented Jan 12, 2019 •

edited

Loading