proposal: dynamic configuration change #13660

Closed

rleungx wants to merge 2 commits into pingcap:master from rleungx:dynamic-configuration

Member

rleungx commented Nov 21, 2019

Summary

This proposal describes a unified way to manage the configuration options of TiDB, TiKV, and PD: store them in PD and support changing them dynamically in the same way across all components, which can greatly improve usability.

Motivation

Here are some reasons why we need to do it:

Currently, each component in a TiDB cluster has its own configuration file, which is hard to manage. We need a unified way to manage the configuration options of all components.
Although some configuration options can be modified dynamically, operators need to learn a lot to use them properly because there are multiple entry points, e.g., pd-ctl, tikv-ctl, and SQL, which results in poor usability. A unified way to modify them dynamically would be easier to use.

rleungx changed the title ~~proposal: dynamic configuration change proposal~~ proposal: dynamic configuration change

rleungx mentioned this pull request

Incubating Program: Dynamic Configuration Change pingcap/community#89

Open

codecov bot commented Nov 21, 2019

Codecov Report

Merging #13660 into master will decrease coverage by 0.1506%.
The diff coverage is n/a.

@@               Coverage Diff                @@
##             master     #13660        +/-   ##
================================================
- Coverage   80.0635%   79.9129%   -0.1507%     
================================================
  Files           473        473                
  Lines        116440     115801       -639     
================================================
- Hits          93226      92540       -686     
- Misses        15924      15973        +49     
+ Partials       7290       7288         -2


          dynamic configuration change proposal

76a534d

Signed-off-by: Ryan Leung <rleungx@gmail.com>

rleungx force-pushed the dynamic-configuration branch from 9c70ce2 to 76a534d Compare

November 21, 2019 06:38

rleungx added the component/docs label

Contributor

SunRunAway commented Nov 21, 2019

There are also several scenarios to consider, and I recommend adding them to the workflow:

Let's say I want to upgrade the binary and change the configuration at the same time. What should I do in this scenario?
A certain TiDB instance needs to debug a configuration parameter.
Grayscale upgrade scenario
Users run their own etcd for configuration management.
Should an instance keep its local copy of a configuration file? When the remote configuration server is down, the instance can still start up.

In addition, it would be best to describe what the workflow looks like both on ordinary machines and in a Kubernetes environment.

tennix reviewed

docs/design/2019-11-21-dynamic-configuration-change.md

+              - New cluster. Both TiDB and TiKV use the default configuration and send it to
+              PD to complete the registration. The registration needs to establish the mapping
+              relationship between the component ID, version and local configuration. For
+              customized requirements, such as modifying the size of block cache. It needs

Member

tennix Nov 21, 2019

When does the customization happen? I think it should happen before TiKV and TiDB register themselves, because some configurations won't take effect after a restart.

docs/design/2019-11-21-dynamic-configuration-change.md

+                      Global global = 2;
+                  }
+                  string name = 3;
+                  string value = 4;

Member

tennix Nov 21, 2019

Are all values strings?

docs/design/2019-11-21-dynamic-configuration-change.md

+              files of those components can be removed and don't need to learn how those
+              tools works. It reduces administrative costs. For configuration options that
+              cannot be modified dynamically, we still can change it using this unified way,
+              but we need to wait for the next restart after modification to take effect.

Member

tennix Nov 21, 2019

We need a mechanism to tell users whether they need to restart the cluster for the modification to take effect.
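
One way such a mechanism could look, as a rough sketch: the update response carries a flag saying whether the change applies immediately or only after a restart. Everything here is an assumption for illustration; `DYNAMIC_OPTIONS`, `UpdateResult`, `apply_update`, and the option names are hypothetical and not part of the proposal.

```python
from dataclasses import dataclass

# Hypothetical set of option names that can be applied without a restart.
DYNAMIC_OPTIONS = {"log-level"}


@dataclass
class UpdateResult:
    name: str
    value: str
    requires_restart: bool  # True when the new value only applies after a restart


def apply_update(name: str, value: str) -> UpdateResult:
    """Record an update and report whether a restart is needed for it to apply."""
    return UpdateResult(name, value, requires_restart=name not in DYNAMIC_OPTIONS)


result = apply_update("block-cache-size", "4GB")
if result.requires_restart:
    print(f"{result.name} was saved; restart the component to apply it")
```

A client tool could surface `requires_restart` directly in its output, so the user does not have to look up which options are dynamic.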

aylei reviewed

docs/design/2019-11-21-dynamic-configuration-change.md

+              The functions of each interface are as follows:
+              *Create* is used to register a configuration to PD when the components start.
+              *Get* is used to get the complete configuration of the component periodically

aylei Nov 21, 2019

Would a Watch API be helpful? A client could watch the configuration version and only call Get when the version changes.

Member Author

rleungx Nov 25, 2019

We can directly use the version to decide if we need to return the configuration?
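
A minimal sketch of what this version-gated Get could look like, assuming PD keeps one monotonically increasing version per configuration. `StoredConfig` and `get_if_newer` are hypothetical names, not the proposal's API.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class StoredConfig:
    version: int              # version currently stored in PD (assumed monotonic)
    options: Dict[str, str]   # option name -> value, all strings as in the proto


def get_if_newer(stored: StoredConfig, client_version: int) -> Optional[Dict[str, str]]:
    """Return the full configuration only if PD holds a newer version.

    Returning None tells the caller that its local copy is already up to date.
    """
    if stored.version > client_version:
        return dict(stored.options)
    return None
```

With this shape, periodic polling stays cheap when nothing has changed, which is the trade-off against a Watch API: polling adds latency up to one polling interval, while a watch pushes changes immediately.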

docs/design/2019-11-21-dynamic-configuration-change.md

+              *Create* is used to register a configuration to PD when the components start.
+              *Get* is used to get the complete configuration of the component periodically
+              from PD and decide if the component configuration need to update. *Update* is

aylei Nov 21, 2019

How does Update perform concurrency control?
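
The thread does not answer this question. One common option, shown here purely as an assumption, is optimistic concurrency: the caller sends the version it last read, and the update is rejected if the stored version has moved on. `ConfigStore` and its methods are hypothetical.

```python
import threading
from typing import Dict, Tuple


class ConfigStore:
    """In-memory stand-in for the configuration kept in PD (illustrative only)."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._version = 0
        self._options: Dict[str, str] = {}

    def update(self, expected_version: int, name: str, value: str) -> Tuple[bool, int]:
        """Apply the update only if the caller saw the latest version.

        Returns (applied, current_version). On a conflict nothing is written
        and the caller is expected to re-read and retry.
        """
        with self._lock:
            if expected_version != self._version:
                return False, self._version
            self._options[name] = value
            self._version += 1
            return True, self._version
```

Two operators who both read version 0 and then update would see one success and one conflict, rather than one silently overwriting the other.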

docs/design/2019-11-21-dynamic-configuration-change.md

+              These two types are used to distinguish whether the configuration is shared by
+              components. For example, the label configuration of TiKV is individual for each
+              TiKV instance. So the type should be local. Each instance here is uniquely identified
+              by the *component_id*, which can be obtained by hashing *IP: port*.

aylei Nov 21, 2019

I know little about the behavior convention of components, but the IP of a logical instance would change in many scenarios, wouldn't it?

Member

tennix Nov 21, 2019

I think this is the registered address. In k8s, the instance IP may change when the instance is upgraded, but we can register a persistent address for each component.
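
To illustrate the point, a small sketch of deriving the component ID from a registered address rather than from the current pod IP. The proposal only says the ID is obtained by hashing *IP: port*; the use of SHA-256, the 16-character truncation, and the example address are assumptions.

```python
import hashlib


def component_id(registered_address: str) -> str:
    """Derive a stable ID from the address a component registers with PD.

    The ID stays the same across restarts as long as the registered address
    (for example a stable DNS name plus port) does not change.
    """
    digest = hashlib.sha256(registered_address.encode("utf-8")).hexdigest()
    return digest[:16]  # truncated for readability; the length is arbitrary


# A hypothetical persistent address keeps the same ID even if the pod IP changes.
print(component_id("tikv-0.tikv-peer.example:20160"))
```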

docs/design/2019-11-21-dynamic-configuration-change.md Outdated

+              registers or queries the component ID. By comparing the version carried by the
+              request with the version stored in PD, it determines whether to return the
+              configuration of the component in the response. After receiving the reply, TiDB
+              or TiKV decides whether to update the configuration or not after comparing with

aylei Nov 21, 2019

Suppose PD loses the configurations in a disaster. Can operators still make configuration changes after recovery? It seems the changes would be ignored by the components, because the version in PD would be lower than the version in the component.

Member

tennix Nov 21, 2019

IIRC, the PD recover tool recovers the cluster-id and also picks a big enough alloc-id, which needs to be bigger than the currently allocated id. So I guess when storing configuration in PD, the corresponding config version needs to be handled the same way.
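
A sketch of that idea applied to config versions, assuming the recovery step can collect the versions that components still hold. `recovered_version` and the `safety_margin` value are hypothetical; this is not how the PD recover tool is implemented.

```python
from typing import Iterable


def recovered_version(component_versions: Iterable[int], safety_margin: int = 1000) -> int:
    """Pick a config version for a recovered PD that exceeds every known version.

    The margin covers components that could not be contacted during recovery.
    """
    return max(component_versions, default=0) + safety_margin
```

Because the recovered version is higher than anything a component has persisted, later updates made through PD would not be ignored by the version comparison.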

docs/design/2019-11-21-dynamic-configuration-change.md Outdated

+              or TiKV decides whether to update the configuration or not after comparing with
+              the version stored in the component.
+              - Delete the node. PD can directly delete the corresponding component ID, the

aylei Nov 21, 2019

What is the API of deleting a node?

Member Author

rleungx Nov 25, 2019

I think there are two ways to delete a node. One is to add a delete API; the other is to use a TTL mechanism.
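
The two options can be sketched side by side. This is an illustration only: `Registry`, `heartbeat`, `delete`, and `expire` are hypothetical names, and timestamps are passed in explicitly to keep the example deterministic.

```python
from typing import Dict, List


class Registry:
    """Tracks registered components by the time of their last heartbeat."""

    def __init__(self, ttl_seconds: float) -> None:
        self.ttl = ttl_seconds
        self.last_seen: Dict[str, float] = {}

    def heartbeat(self, component_id: str, now: float) -> None:
        """Record that the component is alive; refreshes its TTL."""
        self.last_seen[component_id] = now

    def delete(self, component_id: str) -> None:
        """Option 1: an explicit delete API; removing an unknown ID is a no-op."""
        self.last_seen.pop(component_id, None)

    def expire(self, now: float) -> List[str]:
        """Option 2: TTL; drop components whose last heartbeat is too old."""
        dead = [cid for cid, seen in self.last_seen.items() if now - seen > self.ttl]
        for cid in dead:
            del self.last_seen[cid]
        return dead
```

The explicit API removes a node immediately but relies on someone calling it; the TTL approach cleans up crashed nodes automatically at the cost of a delay of up to one TTL.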

docs/design/2019-11-21-dynamic-configuration-change.md Outdated

+              - Add a new component or restart the component. The initialization of
+              configuration calls the *Create* method. After receiving the request, PD first
+              registers or queries the component ID. By comparing the version carried by the

aylei Nov 21, 2019

How does a new component or a restarted component determine which version to send in the request?

Member Author

rleungx Nov 25, 2019

I think each component also needs to persist its own version.


          address comments

d142f37

Signed-off-by: Ryan Leung <rleungx@gmail.com>

rleungx mentioned this pull request

Tracking issue for Dynamic Configuration Change #13795

Open

30 tasks

qw4990 mentioned this pull request

config: introduce a config client to support load configs from PD online #14303

Merged

qw4990 mentioned this pull request

config: support to dynamically update some config items read from PD #14393

Closed

SunRunAway mentioned this pull request

Call For Participation: SIG-Exec 2020/Q1 Plan #14541

Closed

55 tasks

zz-jason added the status/PTAL label

Contributor

sre-bot commented Feb 7, 2020

@tennix, @aylei, PTAL.

Contributor

sre-bot commented Feb 9, 2020

@tennix, @aylei, PTAL.

DanielZhangQD mentioned this pull request

Integrate with TiDB v4.0 for configuration online update pingcap/tidb-operator#1662

Closed

Contributor

sre-bot commented Feb 12, 2020

@tennix, @aylei, PTAL.

qw4990 mentioned this pull request

config: support to dynamically update some config items read from PD #14750

Merged

This was referenced Mar 6, 2020

config: bug fix for the config-check mode #15190

Merged

Support set config ... and show config ... syntaxes pingcap/parser#768

Closed

qw4990 mentioned this pull request

support 'SHOW CONFIG' syntax to show configs of PD and TiKV instances #16229

Closed

qw4990 mentioned this pull request

support 'SET CONFIG' syntax to change configs of TiKV/PD instances #16479

Closed

Contributor

AndreMouche commented Jun 19, 2020

/label test

Contributor

ti-srebot commented Jun 19, 2020

These labels are not found: test.

Contributor

sre-bot commented Jun 19, 2020

No release note. Please follow https://github.com/pingcap/community/blob/master/contributors/release-note-checker.md

Contributor

AndreMouche commented Jun 19, 2020

/label test,wip

Contributor

ti-srebot commented Jun 19, 2020

These labels are not found: test, wip.

Contributor

ti-srebot commented Jun 21, 2020

@tennix, @aylei, PTAL.

Contributor

ti-srebot commented Jun 23, 2020

@tennix, @aylei, PTAL.

Contributor

ti-srebot commented Jun 26, 2020

@tennix, @aylei, PTAL.

Member

zz-jason commented Feb 9, 2021

I'm going to close this PR since it hasn't been updated for a long time. Feel free to reopen it if you plan to continue this PR in the future. Thanks for your contribution.

zz-jason closed this
