Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

grpc/grpc-go panics on concurrent Watch SendMsg #8627

Closed
gyuho opened this issue Sep 29, 2017 · 1 comment
Closed

grpc/grpc-go panics on concurrent Watch SendMsg #8627

gyuho opened this issue Sep 29, 2017 · 1 comment

Comments

@gyuho
Copy link
Contributor

gyuho commented Sep 29, 2017

Turns out #8595 still happens.
Root cause is grpc/grpc-go#1111.

EXPECT_DEBUG=1 go test -v -race -run TestCtlV3WatchInteractiveClientTLS
=== RUN   TestCtlV3WatchInteractiveClientTLS
../bin/etcd-83037: 2017-09-28 18:58:16.514531 W | pkg/flags: unrecognized environment variable ETCD_UNSUPPORTED_ARCH=amd64
../bin/etcd-83037: 2017-09-28 18:58:16.514776 I | etcdmain: etcd Version: 3.2.0+git
../bin/etcd-83037: 2017-09-28 18:58:16.514788 I | etcdmain: Git SHA: 2cfe0d677
../bin/etcd-83037: 2017-09-28 18:58:16.514792 I | etcdmain: Go Version: go1.9
../bin/etcd-83037: 2017-09-28 18:58:16.514800 I | etcdmain: Go OS/Arch: darwin/amd64
../bin/etcd-83037: 2017-09-28 18:58:16.514805 I | etcdmain: setting maximum number of CPUs to 4, total number of available CPUs is 4
../bin/etcd-83037: 2017-09-28 18:58:16.514816 N | etcdmain: failed to detect default host (default host not supported on darwin_amd64)
../bin/etcd-83037: 2017-09-28 18:58:16.515331 I | embed: listening for peers on http://localhost:20001
../bin/etcd-83037: 2017-09-28 18:58:16.515516 I | embed: listening for client requests on localhost:20000
../bin/etcd-83037: 2017-09-28 18:58:16.517176 I | etcdserver: name = testname0
../bin/etcd-83037: 2017-09-28 18:58:16.517195 I | etcdserver: data dir = /var/folders/1x/0sf85stn6sb01sx6ncs7x_ph0000gn/T/testname0.etcd760111403
../bin/etcd-83037: 2017-09-28 18:58:16.517205 I | etcdserver: member dir = /var/folders/1x/0sf85stn6sb01sx6ncs7x_ph0000gn/T/testname0.etcd760111403/member
../bin/etcd-83037: 2017-09-28 18:58:16.517212 I | etcdserver: heartbeat = 100ms
../bin/etcd-83037: 2017-09-28 18:58:16.517218 I | etcdserver: election = 1000ms
../bin/etcd-83037: 2017-09-28 18:58:16.517224 I | etcdserver: snapshot count = 100000
../bin/etcd-83037: 2017-09-28 18:58:16.517237 I | etcdserver: advertise client URLs = https://localhost:20000
../bin/etcd-83037: 2017-09-28 18:58:16.517247 I | etcdserver: initial advertise peer URLs = http://localhost:20001
../bin/etcd-83037: 2017-09-28 18:58:16.517260 I | etcdserver: initial cluster = testname0=http://localhost:20001
../bin/etcd-83037: 2017-09-28 18:58:16.658485 I | etcdserver: starting member ca50e9357181d758 in cluster 34f27e83b3bc2ff
../bin/etcd-83037: 2017-09-28 18:58:16.658545 I | raft: ca50e9357181d758 became follower at term 0
../bin/etcd-83037: 2017-09-28 18:58:16.658566 I | raft: newRaft ca50e9357181d758 [peers: [], term: 0, commit: 0, applied: 0, lastindex: 0, lastterm: 0]
../bin/etcd-83037: 2017-09-28 18:58:16.658575 I | raft: ca50e9357181d758 became follower at term 1
../bin/etcd-83037: 2017-09-28 18:58:16.666461 W | auth: simple token is not cryptographically signed
../bin/etcd-83037: 2017-09-28 18:58:16.672107 I | etcdserver: starting server... [version: 3.2.0+git, cluster version: to_be_decided]
../bin/etcd-83037: 2017-09-28 18:58:16.673085 I | embed: ClientTLS: cert = ../integration/fixtures/server.crt, key = ../integration/fixtures/server.key.insecure, ca = ../integration/fixtures/ca.crt, trusted-ca = , client-cert-auth = false, crl-file =
../bin/etcd-83037: 2017-09-28 18:58:16.675047 E | etcdserver: cannot monitor file descriptor usage (cannot get FDUsage on darwin)
../bin/etcd-83037: 2017-09-28 18:58:16.677483 I | etcdserver/membership: added member ca50e9357181d758 [http://localhost:20001] to cluster 34f27e83b3bc2ff
../bin/etcd-83037: 2017-09-28 18:58:16.759832 I | raft: ca50e9357181d758 is starting a new election at term 1
../bin/etcd-83037: 2017-09-28 18:58:16.759917 I | raft: ca50e9357181d758 became candidate at term 2
../bin/etcd-83037: 2017-09-28 18:58:16.759936 I | raft: ca50e9357181d758 received MsgVoteResp from ca50e9357181d758 at term 2
../bin/etcd-83037: 2017-09-28 18:58:16.759952 I | raft: ca50e9357181d758 became leader at term 2
../bin/etcd-83037: 2017-09-28 18:58:16.759960 I | raft: raft.node: ca50e9357181d758 elected leader ca50e9357181d758 at term 2
../bin/etcd-83037: 2017-09-28 18:58:16.760137 I | etcdserver: setting up the initial cluster version to 3.2
../bin/etcd-83037: 2017-09-28 18:58:16.765780 N | etcdserver/membership: set the initial cluster version to 3.2
../bin/etcd-83037: 2017-09-28 18:58:16.766030 I | etcdserver/api: enabled capabilities for version 3.2
../bin/etcd-83037: 2017-09-28 18:58:16.766061 I | embed: ready to serve client requests
../bin/etcd-83037: 2017-09-28 18:58:16.766118 I | etcdserver: published {Name:testname0 ClientURLs:[https://localhost:20000]} to cluster 34f27e83b3bc2ff
PUT: [../bin/etcdctl3 --endpoints=https://localhost:20000 --dial-timeout=7s --cacert=../integration/fixtures/ca.crt --cert=../integration/fixtures/server.crt --key=../integration/fixtures/server.key.insecure put sample value]
../bin/etcd-83037: 2017-09-28 18:58:16.771967 I | embed: serving client requests on 127.0.0.1:20000
watch interactive: [../bin/etcdctl3 --cert=../integration/fixtures/server.crt --key=../integration/fixtures/server.key.insecure --endpoints=https://localhost:20000 --dial-timeout=7s --cacert=../integration/fixtures/ca.crt watch --interactive]
../bin/etcdctl-83038: watch sample --rev 1
../bin/etcdctl-83039: OK
../bin/etcdctl-83038: PUT
../bin/etcdctl-83038: sample
../bin/etcdctl-83038: value
PUT: [../bin/etcdctl3 --endpoints=https://localhost:20000 --dial-timeout=7s --cacert=../integration/fixtures/ca.crt --cert=../integration/fixtures/server.crt --key=../integration/fixtures/server.key.insecure put key1 val1]
watch interactive: [../bin/etcdctl3 --endpoints=https://localhost:20000 --dial-timeout=7s --cacert=../integration/fixtures/ca.crt --cert=../integration/fixtures/server.crt --key=../integration/fixtures/server.key.insecure watch --interactive]
../bin/etcdctl-83040: watch key --rev 1 --prefix
../bin/etcdctl-83041: OK
PUT: [../bin/etcdctl3 --endpoints=https://localhost:20000 --dial-timeout=7s --cacert=../integration/fixtures/ca.crt --cert=../integration/fixtures/server.crt --key=../integration/fixtures/server.key.insecure put key2 val2]
../bin/etcdctl-83040: PUT
../bin/etcdctl-83040: key1
../bin/etcdctl-83040: val1
../bin/etcdctl-83040: PUT
../bin/etcdctl-83040: key2
../bin/etcdctl-83040: val2
../bin/etcdctl-83042: OK
PUT: [../bin/etcdctl3 --endpoints=https://localhost:20000 --dial-timeout=7s --cacert=../integration/fixtures/ca.crt --cert=../integration/fixtures/server.crt --key=../integration/fixtures/server.key.insecure put key3 val3]
../bin/etcdctl-83040: PUT
../bin/etcdctl-83040: key3
../bin/etcdctl-83040: val3
../bin/etcdctl-83043: OK
PUT: [../bin/etcdctl3 --endpoints=https://localhost:20000 --dial-timeout=7s --cacert=../integration/fixtures/ca.crt --cert=../integration/fixtures/server.crt --key=../integration/fixtures/server.key.insecure put etcd revision_1]
watch interactive: [../bin/etcdctl3 --endpoints=https://localhost:20000 --dial-timeout=7s --cacert=../integration/fixtures/ca.crt --cert=../integration/fixtures/server.crt --key=../integration/fixtures/server.key.insecure watch --interactive]
../bin/etcdctl-83044: watch etcd --rev 2
../bin/etcdctl-83045: OK
PUT: [../bin/etcdctl3 --endpoints=https://localhost:20000 --dial-timeout=7s --cacert=../integration/fixtures/ca.crt --cert=../integration/fixtures/server.crt --key=../integration/fixtures/server.key.insecure put etcd revision_2]
../bin/etcdctl-83044: PUT
../bin/etcdctl-83044: etcd
../bin/etcdctl-83044: revision_1
../bin/etcdctl-83046: OK
../bin/etcdctl-83044: PUT
../bin/etcdctl-83044: etcd
../bin/etcdctl-83044: revision_2
PUT: [../bin/etcdctl3 --endpoints=https://localhost:20000 --dial-timeout=7s --cacert=../integration/fixtures/ca.crt --cert=../integration/fixtures/server.crt --key=../integration/fixtures/server.key.insecure put etcd revision_3]
../bin/etcdctl-83044: PUT
../bin/etcdctl-83044: etcd
../bin/etcdctl-83044: revision_3
../bin/etcdctl-83047: OK
PUT: [../bin/etcdctl3 --dial-timeout=7s --cacert=../integration/fixtures/ca.crt --cert=../integration/fixtures/server.crt --key=../integration/fixtures/server.key.insecure --endpoints=https://localhost:20000 put key1 val1]
watch interactive: [../bin/etcdctl3 --endpoints=https://localhost:20000 --dial-timeout=7s --cacert=../integration/fixtures/ca.crt --cert=../integration/fixtures/server.crt --key=../integration/fixtures/server.key.insecure watch --interactive]
../bin/etcdctl-83048: watch key key3 --rev 1
../bin/etcdctl-83049: OK
PUT: [../bin/etcdctl3 --cert=../integration/fixtures/server.crt --key=../integration/fixtures/server.key.insecure --endpoints=https://localhost:20000 --dial-timeout=7s --cacert=../integration/fixtures/ca.crt put key3 val3]
../bin/etcdctl-83048: PUT
../bin/etcdctl-83048: key1
../bin/etcdctl-83048: val1
../bin/etcdctl-83048: PUT
../bin/etcdctl-83048: key2
../bin/etcdctl-83048: val2
../bin/etcdctl-83048: PUT
../bin/etcdctl-83048: key1
../bin/etcdctl-83048: val1
../bin/etcd-83037: panic: send on closed channel
../bin/etcd-83037:
../bin/etcd-83037: goroutine 387 [running]:
../bin/etcd-83037: github.com/coreos/etcd/cmd/vendor/google.golang.org/grpc/transport.(*serverHandlerTransport).do(0xc42025ad90, 0xc4201ce040, 0x0, 0x0)
../bin/etcd-83037: 	/Users/gyuho/go/src/github.com/coreos/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/google.golang.org/grpc/transport/handler_server.go:170 +0x115
../bin/etcd-83037: github.com/coreos/etcd/cmd/vendor/google.golang.org/grpc/transport.(*serverHandlerTransport).WriteStatus(0xc42025ad90, 0xc420246b00, 0xc42000c028, 0xc420180120, 0x1bae3d8)
../bin/etcd-83037: 	/Users/gyuho/go/src/github.com/coreos/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/google.golang.org/grpc/transport/handler_server.go:187 +0xbd
../bin/etcd-83037: github.com/coreos/etcd/cmd/vendor/google.golang.org/grpc.(*serverStream).RecvMsg.func1(0xc42014acf0, 0xc420689d00, 0x1ada540, 0xc420047d20)
../bin/etcd-83037: 	/Users/gyuho/go/src/github.com/coreos/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/google.golang.org/grpc/stream.go:654 +0xb0
../bin/etcd-83037: github.com/coreos/etcd/cmd/vendor/google.golang.org/grpc.(*serverStream).RecvMsg(0xc42014acf0, 0x1ada540, 0xc420047d20, 0x2119ee0, 0xc420180120)
../bin/etcd-83037: 	/Users/gyuho/go/src/github.com/coreos/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/google.golang.org/grpc/stream.go:668 +0x170
../bin/etcd-83037: github.com/coreos/etcd/cmd/vendor/github.com/grpc-ecosystem/go-grpc-prometheus.(*monitoredServerStream).RecvMsg(0xc420141fe0, 0x1ada540, 0xc420047d20, 0x180184a, 0xc420630e01)
../bin/etcd-83037: 	/Users/gyuho/go/src/github.com/coreos/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/github.com/grpc-ecosystem/go-grpc-prometheus/server.go:69 +0x4b
../bin/etcd-83037: github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver/etcdserverpb.(*watchWatchServer).Recv(0xc420047cc0, 0x0, 0x0, 0x0)
../bin/etcd-83037: 	/Users/gyuho/go/src/github.com/coreos/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver/etcdserverpb/rpc.pb.go:3499 +0x62
../bin/etcd-83037: github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver/api/v3rpc.(*serverWatchStream).recvLoop(0xc4206baf00, 0xc42067a7c0, 0x12a0b7e)
../bin/etcd-83037: 	/Users/gyuho/go/src/github.com/coreos/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver/api/v3rpc/watch.go:175 +0x4c
../bin/etcd-83037: github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver/api/v3rpc.(*watchServer).Watch.func2(0xc4206baf00, 0xc42018c900)
../bin/etcd-83037: 	/Users/gyuho/go/src/github.com/coreos/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver/api/v3rpc/watch.go:142 +0x2b
../bin/etcd-83037: created by github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver/api/v3rpc.(*watchServer).Watch
../bin/etcd-83037: 	/Users/gyuho/go/src/github.com/coreos/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver/api/v3rpc/watch.go:141 +0x2e5
--- FAIL: TestCtlV3WatchInteractiveClientTLS (15.29s)
	testutil.go:55: goroutine 69 [running]:
		github.com/coreos/etcd/pkg/testutil.FatalStack(0xc4202840f0, 0xc4200ab500, 0x18)
			/Users/gyuho/go/src/github.com/coreos/etcd/pkg/testutil/testutil.go:54 +0x82
		github.com/coreos/etcd/e2e.testCtl(0xc4202840f0, 0x1c8cc98, 0xc420036f58, 0x2, 0x2)
			/Users/gyuho/go/src/github.com/coreos/etcd/e2e/ctl_v3_test.go:164 +0x625
		github.com/coreos/etcd/e2e.TestCtlV3WatchInteractiveClientTLS(0xc4202840f0)
			/Users/gyuho/go/src/github.com/coreos/etcd/e2e/ctl_v3_watch_test.go:36 +0x135
		testing.tRunner(0xc4202840f0, 0x1c8c8a8)
			/usr/local/go/src/testing/testing.go:746 +0x16d
		created by testing.(*T).Run
			/usr/local/go/src/testing/testing.go:789 +0x569

		goroutine 1 [chan receive]:
		testing.(*T).Run(0xc420284000, 0x1c7253e, 0x22, 0x1c8c8a8, 0xc420253c01)
			/usr/local/go/src/testing/testing.go:790 +0x59b
		testing.runTests.func1(0xc420284000)
			/usr/local/go/src/testing/testing.go:1004 +0xa8
		testing.tRunner(0xc420284000, 0xc420253d48)
			/usr/local/go/src/testing/testing.go:746 +0x16d
		testing.runTests(0xc42020a260, 0x21725e0, 0x96, 0x96, 0xc420253e30)
			/usr/local/go/src/testing/testing.go:1002 +0x522
		testing.(*M).Run(0xc420253f20, 0x1c69949)
			/usr/local/go/src/testing/testing.go:921 +0x207
		github.com/coreos/etcd/e2e.TestMain(0xc420253f20)
			/Users/gyuho/go/src/github.com/coreos/etcd/e2e/main_test.go:46 +0x54d
		main.main()
			github.com/coreos/etcd/e2e/_test/_testmain.go:340 +0x1ce

		goroutine 12 [chan receive]:
		github.com/coreos/etcd/pkg/logutil.(*MergeLogger).outputLoop(0xc42000d2c0)
			/Users/gyuho/go/src/github.com/coreos/etcd/pkg/logutil/merge_logger.go:174 +0x5bb
		created by github.com/coreos/etcd/pkg/logutil.NewMergeLogger
			/Users/gyuho/go/src/github.com/coreos/etcd/pkg/logutil/merge_logger.go:92 +0x133

		goroutine 72 [chan receive]:
		github.com/coreos/etcd/e2e.watchTest(0xc4202840f0, 0xc42023c040, 0xb, 0x0, 0x0, 0x0, 0x1, 0x0, 0x0, 0x4e20, ...)
			/Users/gyuho/go/src/github.com/coreos/etcd/e2e/ctl_v3_watch_test.go:86 +0x853
		github.com/coreos/etcd/e2e.testCtl.func2(0xc4202828a0, 0x1c8cc98, 0xc4202841e0)
			/Users/gyuho/go/src/github.com/coreos/etcd/e2e/ctl_v3_test.go:155 +0xb1
		created by github.com/coreos/etcd/e2e.testCtl
			/Users/gyuho/go/src/github.com/coreos/etcd/e2e/ctl_v3_test.go:153 +0x3db

		goroutine 49 [IO wait]:
		internal/poll.runtime_pollWait(0x4220d30, 0x72, 0x2122700)
			/usr/local/go/src/runtime/netpoll.go:173 +0x5e
		internal/poll.(*pollDesc).wait(0xc4201ac928, 0x72, 0x2122701, 0x211d720, 0xffffffffffffffff)
			/usr/local/go/src/internal/poll/fd_poll_runtime.go:85 +0xfb
		internal/poll.(*pollDesc).waitRead(0xc4201ac928, 0xc4202e7001, 0x1000, 0x1000)
			/usr/local/go/src/internal/poll/fd_poll_runtime.go:90 +0x4b
		internal/poll.(*FD).Read(0xc4201ac910, 0xc4202e7000, 0x1000, 0x1000, 0x0, 0x0, 0x0)
			/usr/local/go/src/internal/poll/fd_unix.go:125 +0x23d
		os.(*File).read(0xc4201f24d8, 0xc4202e7000, 0x1000, 0x1000, 0x0, 0x211f480, 0xc42006e050)
			/usr/local/go/src/os/file_unix.go:216 +0x71
		os.(*File).Read(0xc4201f24d8, 0xc4202e7000, 0x1000, 0x1000, 0x0, 0x0, 0x0)
			/usr/local/go/src/os/file.go:103 +0x8f
		bufio.(*Reader).fill(0xc4202e0f70)
			/usr/local/go/src/bufio/bufio.go:97 +0x1ac
		bufio.(*Reader).ReadSlice(0xc4202e0f70, 0x386240a, 0x2199b58, 0x0, 0x0, 0x106a797, 0x106a80f)
			/usr/local/go/src/bufio/bufio.go:338 +0x4b
		bufio.(*Reader).ReadBytes(0xc4202e0f70, 0x10a960a, 0xc4201b2620, 0xc4201f9e08, 0x10abba6, 0xc4201b2620, 0x100000000)
			/usr/local/go/src/bufio/bufio.go:416 +0x83
		bufio.(*Reader).ReadString(0xc4201f9f70, 0xa, 0x1000, 0xc4202e7000, 0x1000, 0x1000)
			/usr/local/go/src/bufio/bufio.go:456 +0x46
		github.com/coreos/etcd/pkg/expect.(*ExpectProcess).read(0xc4201b2600)
			/Users/gyuho/go/src/github.com/coreos/etcd/pkg/expect/expect.go:80 +0x269
		created by github.com/coreos/etcd/pkg/expect.NewExpectWithEnv
			/Users/gyuho/go/src/github.com/coreos/etcd/pkg/expect/expect.go:70 +0x415

		goroutine 47 [semacquire]:
		sync.runtime_notifyListWait(0xc4201b9c50, 0x0)
			/usr/local/go/src/runtime/sema.go:507 +0x11f
		sync.(*Cond).Wait(0xc4201b9c40)
			/usr/local/go/src/sync/cond.go:56 +0x8e
		github.com/coreos/etcd/pkg/expect.(*ExpectProcess).ExpectFunc(0xc4201b2600, 0x1c8cbe0, 0x10, 0xc4201b2600, 0x0, 0x0)
			/Users/gyuho/go/src/github.com/coreos/etcd/pkg/expect/expect.go:104 +0x69
		github.com/coreos/etcd/e2e.spawnWithExpects(0xc4201b0300, 0x9, 0x10, 0xc4202bbbf0, 0x1, 0x1, 0xdf, 0x0)
			/Users/gyuho/go/src/github.com/coreos/etcd/e2e/util.go:57 +0xe6
		github.com/coreos/etcd/e2e.spawnWithExpect(0xc4201b0300, 0x9, 0x10, 0x1c5bba9, 0x2, 0x0, 0x8)
			/Users/gyuho/go/src/github.com/coreos/etcd/e2e/util.go:41 +0xaa
		github.com/coreos/etcd/e2e.ctlV3Put(0xc4202840f0, 0xc42023c040, 0xb, 0x0, 0x0, 0x0, 0x1, 0x0, 0x0, 0x4e20, ...)
			/Users/gyuho/go/src/github.com/coreos/etcd/e2e/ctl_v3_kv_test.go:312 +0x452
		github.com/coreos/etcd/e2e.watchTest.func1(0xc4202842d0, 0xc4201bc600, 0x3, 0xc42021a4e0, 0x3, 0x3)
			/Users/gyuho/go/src/github.com/coreos/etcd/e2e/ctl_v3_watch_test.go:75 +0x195
		created by github.com/coreos/etcd/e2e.watchTest
			/Users/gyuho/go/src/github.com/coreos/etcd/e2e/ctl_v3_watch_test.go:73 +0x982

	testutil.go:56: test timed out after 15s
	ctl_v3_test.go:148: error closing etcd processes (exit status 2)
FAIL
exit status 1
FAIL	github.com/coreos/etcd/e2e	15.336s
@gyuho gyuho changed the title grpc/grpc-go panics on concurrent SendMsg grpc/grpc-go panics on concurrent Watch SendMsg Sep 29, 2017
@gyuho
Copy link
Contributor Author

gyuho commented Oct 4, 2017

Should be fixed via grpc/grpc-go#1546.
Will bump up the grpc-go to 1.7 next week.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

No branches or pull requests

1 participant