
grpc-go/transport: "panic: send on closed channel" #8595

Closed
zbindenren opened this issue Sep 22, 2017 · 7 comments
Closed

grpc-go/transport: "panic: send on closed channel" #8595

zbindenren opened this issue Sep 22, 2017 · 7 comments
Assignees
Labels

Comments

@zbindenren
Contributor

Hi

Since upgrading to the latest version, 3.2.7, our server crashes every 1-2 days and we have to restart it manually.

The error and stack:

Sep 21 13:58:33 p1-linux-mlsu008 etcd[110657]: purged file /appl/etcd/data/p1-linux-mlsu008/member/snap/0000000000002aef-000000010aa32f1b.snap successfully
Sep 21 13:58:33 p1-linux-mlsu008 etcd[110657]: purged file /appl/etcd/data/p1-linux-mlsu008/member/snap/0000000000002aef-000000010aa342a5.snap successfully
Sep 21 13:58:39 p1-linux-mlsu008 etcd[110657]: apply entries took too long [4.026007176s for 11 entries]
Sep 21 13:58:39 p1-linux-mlsu008 etcd[110657]: avoid queries with large range/delete range!
Sep 21 13:58:45 p1-linux-mlsu008 etcd[110657]: start to snapshot (applied: 4473469142, lastsnap: 4473464141)
Sep 21 13:58:45 p1-linux-mlsu008 etcd[110657]: saved snapshot at index 4473469142
Sep 21 13:58:45 p1-linux-mlsu008 etcd[110657]: compacted raft log at 4473464142
Sep 21 13:58:45 p1-linux-mlsu008 etcd[110657]: panic: send on closed channel
Sep 21 13:58:45 p1-linux-mlsu008 etcd[110657]: goroutine 1025069877 [running]:
Sep 21 13:58:45 p1-linux-mlsu008 etcd[110657]: github.com/coreos/etcd/cmd/vendor/google.golang.org/grpc/transport.(*serverHandlerTransport).do(0xc459e1a2a0, 0xc47f4c2240, 0xe0fc00, 0xc49395bd01)
Sep 21 13:58:45 p1-linux-mlsu008 etcd[110657]: /home/gyuho/go/src/github.com/coreos/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/google.golang.org/grpc/transport/handler_server.go:177 +0x13d
Sep 21 13:58:45 p1-linux-mlsu008 etcd[110657]: github.com/coreos/etcd/cmd/vendor/google.golang.org/grpc/transport.(*serverHandlerTransport).Write(0xc459e1a2a0, 0xc4365c9c20, 0xc45de8b8f0, 0x30, 0x30, 0xc472ca5d88, 0x0, 0x0)
Sep 21 13:58:45 p1-linux-mlsu008 etcd[110657]: /home/gyuho/go/src/github.com/coreos/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/google.golang.org/grpc/transport/handler_server.go:250 +0xcb
Sep 21 13:58:45 p1-linux-mlsu008 etcd[110657]: github.com/coreos/etcd/cmd/vendor/google.golang.org/grpc.(*serverStream).SendMsg(0xc453bc60a0, 0xe62c20, 0xc462ee3300, 0x0, 0x0)
Sep 21 13:58:45 p1-linux-mlsu008 etcd[110657]: /home/gyuho/go/src/github.com/coreos/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/google.golang.org/grpc/stream.go:584 +0x29a
Sep 21 13:58:45 p1-linux-mlsu008 etcd[110657]: github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver/api/v3rpc.(*serverStreamWithCtx).SendMsg(0xc4780204e0, 0xe62c20, 0xc462ee3300, 0x0, 0x1e)
Sep 21 13:58:45 p1-linux-mlsu008 etcd[110657]: <autogenerated>:5 +0x5d
Sep 21 13:58:45 p1-linux-mlsu008 etcd[110657]: github.com/coreos/etcd/cmd/vendor/github.com/grpc-ecosystem/go-grpc-prometheus.(*monitoredServerStream).SendMsg(0xc464b321c0, 0xe62c20, 0xc462ee3300, 0xc494256fc0, 0xc462728bd0)
Sep 21 13:58:45 p1-linux-mlsu008 etcd[110657]: /home/gyuho/go/src/github.com/coreos/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/github.com/grpc-ecosystem/go-grpc-prometheus/server.go:61 +0x4b
Sep 21 13:58:45 p1-linux-mlsu008 etcd[110657]: github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver/etcdserverpb.(*leaseLeaseKeepAliveServer).Send(0xc44b611dd0, 0xc462ee3300, 0xc437ab8240, 0x46d95e9a6f4e52a5)
Sep 21 13:58:45 p1-linux-mlsu008 etcd[110657]: /home/gyuho/go/src/github.com/coreos/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver/etcdserverpb/rpc.pb.go:2688 +0x49
Sep 21 13:58:45 p1-linux-mlsu008 etcd[110657]: github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver/api/v3rpc.(*LeaseServer).leaseKeepAlive(0xc42c91c400, 0x1416400, 0xc44b611dd0, 0x0, 0x0)
Sep 21 13:58:45 p1-linux-mlsu008 etcd[110657]: /home/gyuho/go/src/github.com/coreos/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver/api/v3rpc/lease.go:118 +0x1b7
Sep 21 13:58:45 p1-linux-mlsu008 etcd[110657]: github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver/api/v3rpc.(*LeaseServer).LeaseKeepAlive.func1(0xc459e1a4e0, 0xc42c91c400, 0x1416400, 0xc44b611dd0)
Sep 21 13:58:45 p1-linux-mlsu008 etcd[110657]: /home/gyuho/go/src/github.com/coreos/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver/api/v3rpc/lease.go:74 +0x3f
Sep 21 13:58:45 p1-linux-mlsu008 etcd[110657]: created by github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver/api/v3rpc.(*LeaseServer).LeaseKeepAlive
Sep 21 13:58:45 p1-linux-mlsu008 etcd[110657]: /home/gyuho/go/src/github.com/coreos/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver/api/v3rpc/lease.go:75 +0x96
Sep 21 13:58:45 p1-linux-mlsu008 systemd[1]: etcd.service: main process exited, code=exited, status=2/INVALIDARGUMENT
Sep 21 13:58:45 p1-linux-mlsu008 systemd[1]: Unit etcd.service entered failed state.
Sep 21 13:58:45 p1-linux-mlsu008 systemd[1]: etcd.service failed.

After a while the second server may crash as well, leaving only one cluster member available.

We have a three-node setup with around 2000 clients, roughly 6000 watch streams, and around 3000 lease streams. I think the grpc proxy might help, but because of #8162 we do not use it yet.
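
The panic in the trace above is standard Go behavior: any send on a channel that has been closed, including a send that was already blocked when the close happened, crashes the whole process. A minimal standalone sketch of that race, using only language semantics and none of the grpc-go internals:

```go
// Minimal sketch of the hazard behind the trace above: a goroutine blocked
// sending on a channel panics as soon as another goroutine closes it.
package main

import "time"

func main() {
	writes := make(chan func()) // stands in for the transport's write queue

	go func() {
		time.Sleep(10 * time.Millisecond)
		close(writes) // racing close, as when the transport shuts down
	}()

	writes <- func() {} // blocks, then panics: "send on closed channel"
}
```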

@fanminshi
Member

@zbindenren could you reproduce this issue with a simpler script? That would help me debug it for you.

@xiang90 xiang90 self-assigned this Sep 24, 2017
@xiang90
Contributor

xiang90 commented Sep 24, 2017

I am going to look into this soon.

@zbindenren
Contributor Author

zbindenren commented Sep 25, 2017

@fanminshi unfortunately it is not easy to reproduce. We have been seeing this issue since upgrading to the latest version, and it only happens when we have a lot of clients. We have a similar setup with a small number of clients, and there we do not see the issue.
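
Since a deterministic repro is elusive, a load generator is probably the closest approximation. A hypothetical sketch (the endpoint, TTL, and stream count are placeholders, and this is not the reporter's actual workload) that opens thousands of LeaseKeepAlive streams with the era's clientv3 API:

```go
// Hypothetical load generator approximating the reported workload: thousands
// of concurrent lease keepalive streams against a test cluster.
package main

import (
	"context"
	"log"
	"time"

	"github.com/coreos/etcd/clientv3"
)

func main() {
	const leaseStreams = 3000 // roughly the lease-stream count reported above

	cli, err := clientv3.New(clientv3.Config{
		Endpoints:   []string{"127.0.0.1:2379"}, // placeholder test endpoint
		DialTimeout: 5 * time.Second,
	})
	if err != nil {
		log.Fatal(err)
	}
	defer cli.Close()

	for i := 0; i < leaseStreams; i++ {
		go func() {
			lease, err := cli.Grant(context.Background(), 10) // 10s TTL
			if err != nil {
				log.Println("grant:", err)
				return
			}
			// Each KeepAlive call holds one server-side lease stream open.
			ka, err := cli.KeepAlive(context.Background(), lease.ID)
			if err != nil {
				log.Println("keepalive:", err)
				return
			}
			for range ka {
				// Drain keepalive responses until the stream closes.
			}
		}()
	}
	select {} // keep the streams alive until the process is killed
}
```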

@gyuho
Contributor

gyuho commented Sep 25, 2017

@xiang90 @fanminshi @zbindenren This has been partially fixed in upstream grpc/grpc-go#1115. The fix is available since grpc-go v1.4.x, which means we would have to wait for the etcd v3.3 release.

The root cause, grpc/grpc-go#1111, hasn't been addressed, though.
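
For context, the usual shape of a fix for this class of panic (a sketch of the pattern, not the verbatim grpc-go patch) is to never close the channel that senders use, and instead signal shutdown through a separate channel that every send selects against:

```go
// Sketch of the defensive pattern, not the actual grpc-go change: shutdown is
// signaled via a dedicated channel, and every send selects against it, so a
// racing close yields an error instead of a panic.
package main

import "errors"

var errConnClosing = errors.New("transport is closing")

type handlerTransport struct {
	writes   chan func()   // work queue drained by the serving goroutine
	closedCh chan struct{} // closed exactly once on shutdown
}

func (t *handlerTransport) do(fn func()) error {
	select {
	case <-t.closedCh:
		return errConnClosing // already shut down: report, don't panic
	case t.writes <- fn:
		return nil
	}
}

func (t *handlerTransport) close() {
	close(t.closedCh) // t.writes itself is never closed under active senders
}

func main() {
	t := &handlerTransport{
		writes:   make(chan func()), // unbuffered, no receiver in this demo
		closedCh: make(chan struct{}),
	}
	t.close()
	if err := t.do(func() {}); err != nil {
		println(err.Error()) // "transport is closing" instead of a crash
	}
}
```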

@gyuho gyuho modified the milestone: v3.3.0 Sep 25, 2017
@gyuho gyuho changed the title panic: send on closed channel grpc-go/transport: "panic: send on closed channel " Sep 25, 2017
@gyuho gyuho changed the title grpc-go/transport: "panic: send on closed channel " grpc-go/transport: "panic: send on closed channel" Sep 25, 2017
@xiang90
Contributor

xiang90 commented Sep 25, 2017

@gyuho Thanks for the investigation. I marked this as a dependency issue.

@zbindenren
Contributor Author

Thanks for the info. Are there any plans for when 3.3 is going to be released?

@xiang90
Contributor

xiang90 commented Sep 28, 2017

Closing. We already bumped gRPC to 1.6 in master to get this fixed.
