Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tests: Migrate watch test to common framework #14345

Merged
merged 1 commit into from
Aug 25, 2022

Conversation

nic-chen
Copy link
Contributor

Part of #13637

@serathius
Copy link
Member

Left some small comments, but overall looks great! Good job.

@nic-chen nic-chen force-pushed the tests/watch branch 2 times, most recently from ce9eab1 to 69d854b Compare August 20, 2022 01:04
@codecov-commenter
Copy link

Codecov Report

Merging #14345 (69d854b) into main (a1405e9) will decrease coverage by 0.36%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##             main   #14345      +/-   ##
==========================================
- Coverage   75.60%   75.24%   -0.37%     
==========================================
  Files         457      457              
  Lines       37084    37116      +32     
==========================================
- Hits        28038    27927     -111     
- Misses       7312     7424     +112     
- Partials     1734     1765      +31     
Flag Coverage Δ
all 75.24% <ø> (-0.37%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
server/etcdserver/api/membership/store.go 50.00% <0.00%> (-10.00%) ⬇️
server/auth/simple_token.go 80.00% <0.00%> (-8.47%) ⬇️
client/pkg/v3/tlsutil/tlsutil.go 83.33% <0.00%> (-8.34%) ⬇️
client/pkg/v3/fileutil/purge.go 66.03% <0.00%> (-7.55%) ⬇️
server/storage/mvcc/watchable_store.go 85.86% <0.00%> (-6.89%) ⬇️
api/v3rpc/rpctypes/error.go 84.61% <0.00%> (-5.87%) ⬇️
server/lease/lease.go 94.87% <0.00%> (-5.13%) ⬇️
server/etcdserver/api/v3election/election.go 66.66% <0.00%> (-5.13%) ⬇️
raft/rafttest/node.go 95.00% <0.00%> (-5.00%) ⬇️
server/proxy/grpcproxy/watch.go 92.48% <0.00%> (-4.05%) ⬇️
... and 24 more

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

Signed-off-by: nic-chen <chenjunxu6@gmail.com>
}

ch := make(chan clientv3.WatchResponse)
go func() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you considered to wait for the goroutine to exit when the test case finish?

Copy link
Contributor Author

@nic-chen nic-chen Aug 24, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's possible (although with low possibility) that the test case finish before the goroutine exit, eventually the test might report goroutine leak errors. Our pipeline fails due to this kind of error from time to time. So for safety, we need to make sure the goroutine exits before the the test case returns/exits.

Copy link
Member

@serathius serathius Aug 24, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is a problem. We should be good as long as we propagate signal to goroutine that causes it to exit. It's up to leak detection to wait for this signal to propagate. If you read the etcd test goroutine leak implementation, it gives test up to 5 seconds to cleanup its goroutines and uses runtime.Gosched() to give goroutine time to execute and exit.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we could merge it first. If there is a real problem, I will fix it. What do you think? @serathius @ahrtr

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think @ahrtr is worried that the problem will not just imminently show up after merging. Leak goroutine issues like this one are very subtle (for example I cannot reproduce this issue locally at all), so I understand his position.

I still think my argument holds and this should not be a problem, but I would prefer to wait for confirmation from @ahrtr

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you read the etcd test goroutine leak implementation, it gives test up to 5 seconds to cleanup its goroutines and uses runtime.Gosched() to give goroutine time to execute and exit.

The goroutine leak checker indeed waits up to 1 second in the afterTest check.

This is just a mitigation solution instead of best practice to me. We should always make sure gracefully shutdown of all threads/goroutines in almost all cases, including production code and test code, unless the cases that we intentionally create.

Previously we also saw a situation that a test case (in test code) or the main goroutine (in production code) already finishes/exits, but a goroutine it creates still prints log, and it isn't allowed by golang. Of course, I do not see such issue in this PR.

I just searched the repo, there are lots of similar cases which create goroutines but do not gracefully shutdown them. It's OK to merge this PR for now. But I'd like to revisit all such cases afterwards in separate PR(s). WDYT? @serathius @spzala

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree the is a problem worth looking into.

@serathius
Copy link
Member

Merging as we agreed to follow up with separate PRs.

@serathius serathius merged commit f36a878 into etcd-io:main Aug 25, 2022
@nic-chen nic-chen deleted the tests/watch branch August 26, 2022 03:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

4 participants