store/tikv: recycle idle connection in tikv client #10616

Merged
merged 2 commits into from
May 29, 2019


tiancaiamao
Contributor

What problem does this PR solve?

If a tikv addr has been idle for a while, recycle its connection.

Fix goroutine leak when tikv is offline:

goroutine 4619 [chan receive, 1145 minutes]:
github.com/pingcap/tidb/store/tikv.fetchAllPendingRequests(0xc00245ac60, 0x80, 0xc0026e2e20, 0xc0026e2e08)
/home/jenkins/workspace/release_tidb_3.0/go/src/github.com/pingcap/tidb/store/tikv/client.go:339 +0x42
github.com/pingcap/tidb/store/tikv.(*connArray).batchSendLoop(0xc00245acc0, 0x10, 0xa, 0x3, 0xc0000479b8, 0x3, 0x24e, 0x80, 0xc8, 0x0, ...)
/home/jenkins/workspace/release_tidb_3.0/go/src/github.com/pingcap/tidb/store/tikv/client.go:434 +0x4db
created by github.com/pingcap/tidb/store/tikv.(*connArray).Init
/home/jenkins/workspace/release_tidb_3.0/go/src/github.com/pingcap/tidb/store/tikv/client.go:293 +0xb05

The leak happens when TiKV is offline: the background batch send loop goroutine never exits, even though the connection is idle.

What is changed and how it works?

In the batch send loop, add an idle-detection timer. If the batch commands channel receives no message before the timer fires,
mark this connArray as idle and notify rpcClient to recycle it.

Check List

Tests

  • Integration test

Unit tests don't use the TiKV client, so this change must be covered by the integration test.

Side effects

  • Increased code complexity

Related changes

  • Need to cherry-pick to the release branch

This should be cherry-picked to 3.0.

if a tikv addr has been idle for a while, recycle its connection
@tiancaiamao
Contributor Author

PTAL @hicqu @lysu

@codecov

codecov bot commented May 28, 2019

Codecov Report

Merging #10616 into master will decrease coverage by 0.0384%.
The diff coverage is 25%.

@@               Coverage Diff                @@
##             master     #10616        +/-   ##
================================================
- Coverage   77.7226%   77.6842%   -0.0385%     
================================================
  Files           413        413                
  Lines         87515      87548        +33     
================================================
- Hits          68019      68011         -8     
- Misses        14351      14383        +32     
- Partials       5145       5154         +9

@codecov

codecov bot commented May 28, 2019

Codecov Report

Merging #10616 into master will decrease coverage by 0.0182%.
The diff coverage is 25%.

@@               Coverage Diff                @@
##             master     #10616        +/-   ##
================================================
- Coverage   77.7027%   77.6844%   -0.0183%     
================================================
  Files           413        413                
  Lines         87522      87540        +18     
================================================
- Hits          68007      68005         -2     
- Misses        14361      14383        +22     
+ Partials       5154       5152         -2

@tiancaiamao
Contributor Author

/run-all-tests

@hicqu
Contributor

hicqu commented May 29, 2019

LGTM. Thank you!

@hicqu hicqu added the status/LGT1 Indicates that a PR has LGTM 1. label May 29, 2019
@disksing
Contributor

What if batch message is disabled?

@tiancaiamao
Contributor Author

@disksing
Contributor

LGTM

@tiancaiamao tiancaiamao added status/LGT2 Indicates that a PR has LGTM 2. component/tikv and removed status/LGT1 Indicates that a PR has LGTM 1. labels May 29, 2019
@tiancaiamao tiancaiamao merged commit 403991b into pingcap:master May 29, 2019
@tiancaiamao tiancaiamao deleted the idle-recycle branch May 29, 2019 05:35
tiancaiamao added a commit to tiancaiamao/tidb that referenced this pull request May 29, 2019
if a tikv addr has been idle for a while, recycle its connection
@guo-shaoge guo-shaoge mentioned this pull request Jul 27, 2021
Labels
component/tikv status/LGT2 Indicates that a PR has LGTM 2.
3 participants