Gracefully stop tidb pod #2597
maybe we can add a preStop lifecycle hook to guarantee tidb-server has enough time to close all SQL connections gracefully:

```yaml
lifecycle:
  preStop:
    exec:
      command: ["/bin/bash", "-c", "sleep 15"]
```
It seems like a good solution. In addition to the built-in 15-second wait time, users can configure extra wait time in
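A minimal sketch of what a configurable extra wait could look like (the 30s value is purely illustrative and assumed to be user-configured; it is not a tidb-operator default):

```yaml
containers:
  - name: tidb
    lifecycle:
      preStop:
        exec:
          # Hypothetical user-configured extra wait. preStop runs before the
          # TERM signal, so it gives load balancers / endpoints time to drop
          # the pod before tidb-server's own 15s shutdown wait begins.
          command: ["/bin/bash", "-c", "sleep 30"]
```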
When a pod is going to be deleted (DeletionTimestamp != nil), its IP is ignored by the endpoints controller and will be completely removed from endpoints.
When we start to update TiDB, the StatefulSet controller sends a command to delete the Pod. The Pod is removed from the Service's endpoints list and is no longer considered part of the set of running Pods for replication controllers, so the LB (internal or external) can remove the backend. Next, we should wait for the number of open connections to drop to 0 and then stop the pod; a sketch of such a drain hook follows below.
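A minimal sketch of that drain step, assuming tidb-server's status port exposes a Prometheus-style connection-count metric (the `tidb_server_connections` name, the `:10080/metrics` endpoint, and the availability of `curl`/`awk` in the container are all assumptions to verify):

```yaml
lifecycle:
  preStop:
    exec:
      command:
        - "/bin/bash"
        - "-c"
        - |
          # Poll the (assumed) connection-count metric on the status port
          # until it reaches 0 or a timeout expires, then let TERM proceed.
          for i in $(seq 1 60); do
            conns=$(curl -s http://127.0.0.1:10080/metrics \
              | awk '/^tidb_server_connections /{print int($2)}')
            [ "${conns:-0}" -eq 0 ] && exit 0
            sleep 1
          done
```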
maybe we can just allow users to specify container lifecycle hooks for our components. We can provide an example (and docs, etc.) for this scenario, but we shouldn't limit the possibilities: if the application tolerates connection failures, it may not need extra wait time, or may not need to wait for all connections to be closed.
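Purely as an illustration (the script path and its behavior are hypothetical, not something shipped by tidb-operator), a user whose application tolerates connection failures could plug in a much lighter hook, or omit it entirely:

```yaml
lifecycle:
  preStop:
    exec:
      # Hypothetical user-provided script; applications that tolerate
      # connection failures might skip draining altogether.
      command: ["/bin/bash", "-c", "/scripts/custom-drain.sh || true"]
```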
Agree. For the BTW, for
Feature Request
When stopping a tidb pod, kubelet will first send a `TERM` signal to tidb-server; tidb-server will wait for 15s before closing all SQL connections and then do all the cleanup. If it fails to finish the cleanup in time, kubelet will send a `KILL` signal and tidb-server will exit immediately. When tidb-server fails to respond to the client, the client will time out or fail, thus increasing latency.
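As a generic Kubernetes note (not anything specific to tidb-operator), the window between `TERM` and `KILL` is the pod's `terminationGracePeriodSeconds`, so it has to cover the 15s wait plus the time needed to close connections. A sketch:

```yaml
spec:
  # The Kubernetes default is 30s; it must exceed tidb-server's 15s wait
  # plus the time needed to close SQL connections and finish cleanup.
  # 60 is an illustrative value, not a recommended setting.
  terminationGracePeriodSeconds: 60
```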
We might consider using a readiness probe to let the load balancer (internal or external) remove the backend before the server stops responding to clients. This would help reduce latency during tidb pod shutdown.
The key point is letting the readiness probe fail while the server is still able to receive requests.
ref: https://github.com/pingcap/tidb/blob/v4.0.1/server/server.go#L553-L557
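A sketch of such a probe, assuming tidb-server's HTTP status endpoint on port 10080 (the `/status` path is an assumption, and making it report unhealthy during shutdown while SQL connections stay open would be new behavior, as discussed above):

```yaml
readinessProbe:
  httpGet:
    # Assumed status endpoint; during shutdown it would need to start
    # returning a non-2xx code while the SQL port keeps serving requests.
    path: /status
    port: 10080
  periodSeconds: 2
  failureThreshold: 1
```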