Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove resign DDL owner logic to keep things simple #1225

Closed
aylei opened this issue Nov 23, 2019 · 4 comments · Fixed by #1239
Closed

Remove resign DDL owner logic to keep things simple #1225

aylei opened this issue Nov 23, 2019 · 4 comments · Fixed by #1239
Assignees

Comments

@aylei
Copy link
Contributor

aylei commented Nov 23, 2019

Feature Request

AFAIK, TiDB server will clean up the etcd session upon shutdown(either gracefully or not), which release the lock of DDL owner path and trigger a leader election immediately. Therefore, we can remove the resign DDL owner logic and leave it to TiDB server.

Example:

  • Send SIGTERM (shutdown but not gracefully) to tidb-2, which is the current DDL owner:
➜  ~ kubectl exec -it wuyelei-tidb-2 -c tidb -- sh
/ # kill -s TERM 1
/ # command terminated with exit code 137
  • Log of tidb-1 and tidb-2 shows that tidb-2 clean up the session and tidb-1 get the owner immediately:
wuyelei-tidb-2 tidb [2019/11/23 14:28:29.000 +00:00] [INFO] [signal_posix.go:54] ["got signal to exit"] [signal=terminated]
wuyelei-tidb-2 tidb [2019/11/23 14:28:29.000 +00:00] [INFO] [server.go:572] ["[server] graceful shutdown."]
wuyelei-tidb-2 tidb [2019/11/23 14:28:29.001 +00:00] [WARN] [manager.go:281] ["is not the owner"] ["owner info"="[ddl] /tidb/ddl/fg/owner ownerManager 1f2b2e8c-2e3a-4913-940c-8c37617e06d7"]
wuyelei-tidb-2 tidb [2019/11/23 14:28:29.001 +00:00] [INFO] [http_status.go:263] ["listen failed"] [error="http: Server closed"]
wuyelei-tidb-2 tidb [2019/11/23 14:28:29.001 +00:00] [INFO] [manager.go:239] ["etcd session is done, creates a new one"] ["owner info"="[ddl] /tidb/ddl/fg/owner ownerManager 1f2b2e8c-2e3a-4913-940c-8c37617e06d7"]
wuyelei-tidb-2 tidb [2019/11/23 14:28:29.001 +00:00] [INFO] [manager.go:243] ["break campaign loop, NewSession failed"] ["owner info"="[ddl] /tidb/ddl/fg/owner ownerManager 1f2b2e8c-2e3a-4913-940c-8c37617e06d7"] [error="context canceled"] [errorVerbose="context canceled\ngithub.com/pingcap/errors.AddStack\n\tgithub.com/pingcap/errors@v0.11.4/errors.go:174\ngithub.com/pingcap/errors.Trace\n\tgithub.com/pingcap/errors@v0.11.4/juju_adaptor.go:15\ngithub.com/pingcap/tidb/owner.contextDone\n\tgithub.com/pingcap/tidb@/owner/manager.go:371\ngithub.com/pingcap/tidb/owner.NewSession\n\tgithub.com/pingcap/tidb@/owner/manager.go:142\ngithub.com/pingcap/tidb/owner.(*ownerManager).campaignLoop\n\tgithub.com/pingcap/tidb@/owner/manager.go:241\nruntime.goexit\n\truntime/asm_amd64.s:1357"]
wuyelei-tidb-2 tidb [2019/11/23 14:28:29.003 +00:00] [INFO] [manager.go:292] ["revoke session"] ["owner info"="[ddl] /tidb/ddl/fg/owner ownerManager 1f2b2e8c-2e3a-4913-940c-8c37617e06d7"] []
wuyelei-tidb-2 tidb [2019/11/23 14:28:29.003 +00:00] [INFO] [ddl_worker.go:113] ["[ddl] DDL worker closed"] [worker="worker 1, tp general"] ["take time"=16.577µs]
wuyelei-tidb-2 tidb [2019/11/23 14:28:29.003 +00:00] [INFO] [ddl_worker.go:113] ["[ddl] DDL worker closed"] [worker="worker 2, tp add index"] ["take time"=4.853µs]
wuyelei-tidb-2 tidb [2019/11/23 14:28:29.003 +00:00] [INFO] [session_pool.go:85] ["[ddl] closing sessionPool"]
wuyelei-tidb-1 tidb [2019/11/23 14:28:29.008 +00:00] [INFO] [manager.go:316] ["get owner"] ["owner info"="[ddl] /tidb/ddl/fg/owner ownerManager 16bb2bf4-6c29-4229-ba91-4208ca04a4fa"] [ownerID=16bb2bf4-6c29-4229-ba91-4208ca04a4fa]

cc @tennix @weekface @zimulala

@zimulala
Copy link

@aylei
If we don't use "kill -9" to kill TiDB server, we will revoke the DDL owner path immediately. So we can remove the resign DDL owner logic in this case.

@aylei
Copy link
Contributor Author

aylei commented Nov 25, 2019

@zimulala Thanks!

@weekface @tennix how do you think?

@weekface
Copy link
Contributor

LGTM

@tennix
Copy link
Member

tennix commented Nov 25, 2019

AFAIK, resigning ddl owner when upgrading is suggested for some corner cases that may not be handled correctly in tidb-server.

There are some bugs that when we don't resign ddl owner the data may be not consistent in early 2.1.x versions. But it's fixed now and hasn't appeared till now. Besides, there's no such operation logic in tidb-ansible. So I think we can remove this logic now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants