Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use 'ADMIN SHOW DDL JOB QUERIES LIMIT m OFFSET n' to retrieve DDL commands' content within a certain range (n+1, n+m) #36198

Closed
lyzx2001 opened this issue Jul 13, 2022 · 2 comments · Fixed by #36480
Assignees
Labels
feature/accepted This feature request is accepted by product managers type/feature-request Categorizes issue or PR as related to a new feature.

Comments

@lyzx2001
Copy link
Contributor

lyzx2001 commented Jul 13, 2022

Feature Request

Is your feature request related to a problem? Please describe:
The feature is related to an issue from TiFlow: pingcap/tiflow#4689
dm-worker keeps retrying to execute ddl when encounter "invalid connection" error, which is caused by the DDL costs a long duration or the DDL is queuing.

Proposed solution:
When encoutering "invalid connection", execute the 'ADMIN SHOW DDL' command every 10 seconds to check the status of the required DDL command in TiDB, and use the retrieved status to carry on the corresponding operation:

  • If the status is synced, that means the downstream has finished synchronizing this DDL. The downstream can return a message to the upstream that tells him not to re-execute the DDL, and can continue to execute the following commands.
  • If the status is cancelled, that means the user has cancelled the DDL or something went wrong during the synchronization. We cannot obtain the specific error, but we need to return an appropriate error to the user. Then the downstream can return a message to the upstream that tells him not to re-execute the DDL, and can continue to execute the following commands.
  • If the status is something other than synced/cancelled (like running/none...), that means the downstream has not executed or is currently executing the DDL. The downstream can return a message to the upstream that tells him the downstream has received his message and he do not need to re-execute the DDL, he just need to wait until the status turns to synced and can carry on the following commands.
  • If the DDL cannot be found in the DDL history job queue, that means the DDL was lost due to some reason during the synchronization, and the downstream has not received the DDL. The downstream needs to return a message to the upstream that tells him to re-execute the DDL.

To check the the current state of the specified DDL command (running/none/synced...), we use 'ADMIN SHOW DDL JOBS' to retrive the JOB_ID corresponding to each DDL command. Then use 'ADMIN SHOW DDL JOB QUERIES JOB_ID' to find the content of each DDL command. By matching the database name, table name, and content with those of the specified DDL that we want to query, we can retrieve its state.

Describe the feature you'd like:
https://docs.pingcap.com/tidb/stable/sql-statement-admin-show-ddl#examples
Now the 'ADMIN SHOW DDL JOB QUERIES JOB_ID' command can only use JOB_ID to retrieve the last ten rows of DDL commands' content. Just like described in the website: 'You can only search the running DDL job corresponding to job_id within the last ten results in the DDL history job queue.'

I would like to request a feature that the users can use commands like 'ADMIN SHOW DDL JOB QUERIES LIMIT m OFFSET n' to retrieve DDL commands' content within a certain range (n+1, n+m) that the users can assign themselves instead of only the last ten results.

Example:

mysql> ADMIN SHOW DDL JOBS;
+--------+---------+--------------------+--------------+----------------------+-----------+----------+-----------+-----------------------------------------------------------------+---------+
| JOB_ID | DB_NAME | TABLE_NAME         | JOB_TYPE     | SCHEMA_STATE         | SCHEMA_ID | TABLE_ID | ROW_COUNT | CREATE_TIME         | START_TIME          | END_TIME            | STATE   |
+--------+---------+--------------------+--------------+----------------------+-----------+----------+-----------+---------------------+-------------------------------------------+---------+
|     59 | test    | t1                 | add index    | write reorganization |         1 |       55 |     88576 | 2020-08-17 07:51:58 | 2020-08-17 07:51:58 | NULL                | running |
|     60 | test    | t2                 | add index    | none                 |         1 |       57 |         0 | 2020-08-17 07:51:59 | 2020-08-17 07:51:59 | NULL                | none    |
|     58 | test    | t2                 | create table | public               |         1 |       57 |         0 | 2020-08-17 07:41:28 | 2020-08-17 07:41:28 | 2020-08-17 07:41:28 | synced  |
|     56 | test    | t1                 | create table | public               |         1 |       55 |         0 | 2020-08-17 07:41:02 | 2020-08-17 07:41:02 | 2020-08-17 07:41:02 | synced  |
|     54 | test    | t1                 | drop table   | none                 |         1 |       50 |         0 | 2020-08-17 07:41:02 | 2020-08-17 07:41:02 | 2020-08-17 07:41:02 | synced  |
|     53 | test    | t1                 | drop index   | none                 |         1 |       50 |         0 | 2020-08-17 07:35:44 | 2020-08-17 07:35:44 | 2020-08-17 07:35:44 | synced  |
|     52 | test    | t1                 | add index    | public               |         1 |       50 |    451010 | 2020-08-17 07:34:43 | 2020-08-17 07:34:43 | 2020-08-17 07:35:16 | synced  |
|     51 | test    | t1                 | create table | public               |         1 |       50 |         0 | 2020-08-17 07:34:02 | 2020-08-17 07:34:02 | 2020-08-17 07:34:02 | synced  |
|     49 | test    | t1                 | drop table   | none                 |         1 |       47 |         0 | 2020-08-17 07:34:02 | 2020-08-17 07:34:02 | 2020-08-17 07:34:02 | synced  |
|     48 | test    | t1                 | create table | public               |         1 |       47 |         0 | 2020-08-17 07:33:37 | 2020-08-17 07:33:37 | 2020-08-17 07:33:37 | synced  |
|     46 | mysql   | stats_extended     | create table | public               |         3 |       45 |         0 | 2020-08-17 06:42:38 | 2020-08-17 06:42:38 | 2020-08-17 06:42:38 | synced  |
|     44 | mysql   | opt_rule_blacklist | create table | public               |         3 |       43 |         0 | 2020-08-17 06:42:38 | 2020-08-17 06:42:38 | 2020-08-17 06:42:38 | synced  |
+--------+---------+--------------------+--------------+----------------------+-----------+----------+-----------+---------------------+---------------------+-------------------------------+
12 rows in set (0.00 sec)
mysql> ADMIN SHOW DDL JOB QUERIES LIMIT 3 OFFSET 2;
+--------+------------------------------------------------------------------+
| JOB_ID | QUERY                                                            | 
+--------+------------------------------------------------------------------+
|     58 | CREATE TABLE t2 (id INT NOT NULL PRIMARY KEY auto_increment)     | 
|     56 | CREATE TABLE t1 (id INT NOT NULL PRIMARY KEY auto_increment)     | 
|     54 | DROP TABLE IF EXISTS t3                                          | 
+--------+------------------------------------------------------------------+
3 rows in set (0.00 sec)

Details and possible outcomes of this request can refer to this link:
https://docs.google.com/document/d/1shMk7UdczcWEdvd5YCPN7Ywr4L4JCbIWKtLEGZYSKD8/edit#heading=h.lj3nhy183z6n

Describe alternatives you've considered:
Alternative solution:
When encoutering "invalid connection", execute the 'ADMIN SHOW DDL' command every 10 seconds to check the status of the required DDL command in TiDB.
Select the current time of TiDB everytime a DDL command is executed.
Use 'ADMIN SHOW DDL JOBS' command to retrieve CREATE_TIME of every DDL command ('ADMIN SHOW DDL JOBS' can show more than 10 results), then retrieve those DDL commands whose CREATE_TIME are later than the create time of the required DDL command.
Then match the DB_NAME, TABLE_NAME, JOB_TYPE of those retrieved DDLs with the corresponding variables of the required DDL, if all the corresponding variables are equal, then return the state of the DDL that meets this requirement.

Obviously, we can see that this solution is not quite rigorous. If the content of the selected DDL cannot be guaranteed to be exactly the same as that of the required DDL, only matching other variables would probably return the wrong state.

Besides, since dm does not use TiDB HTTP API, the solution related to it does not work.

Teachability, Documentation, Adoption, Migration Strategy:
The users can use commands like 'ADMIN SHOW DDL JOB QUERIES LIMIT m OFFSET n' to retrieve DDL commands' content within a certain range (n+1, n+m) that the users can assign themselves instead of only the last ten results.

@lyzx2001 lyzx2001 added the type/feature-request Categorizes issue or PR as related to a new feature. label Jul 13, 2022
@lyzx2001
Copy link
Contributor Author

/assign @lyzx2001

@lance6716
Copy link
Contributor

@bb7133 @zimulala do you have comment for the syntax and output of ADMIN SHOW DDL JOB QUERIES LIMIT 3 OFFSET 2;?

@lyzx2001 lyzx2001 changed the title Use 'ADMIN SHOW DDL JOB QUERIES LIMIT m OFFSET n' to retrieve DDL commands' content within a certain range (n, n+m) Use 'ADMIN SHOW DDL JOB QUERIES LIMIT m OFFSET n' to retrieve DDL commands' content within a certain range (n+1, n+m) Jul 18, 2022
@yudongusa yudongusa added the feature/accepted This feature request is accepted by product managers label Jul 20, 2022
ti-chi-bot pushed a commit that referenced this issue Jul 22, 2022
ti-chi-bot pushed a commit that referenced this issue Jul 25, 2022
… retrieve DDL commands within a certain range (#36480)

close #36198
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature/accepted This feature request is accepted by product managers type/feature-request Categorizes issue or PR as related to a new feature.
Projects
None yet
3 participants