store/tikv: Fix the issue that context canceled error during commit RPC is not treated as undetermined error (#20857) #20925

Merged
merged 2 commits into from
Nov 11, 2020

ti-srebot
Contributor

cherry-pick #20857 to release-4.0


What problem does this PR solve?

Issue Number: close #20733

#20030 introduced an issue: when a transaction's commit request to the primary gets a context canceled error during the RPC, the sender returns early without setting its rpcError field. However, the 2PC committer needs to check that field to decide whether the transaction should be in the undetermined state. In this case, we don't know whether TiKV has received and handled the request successfully, so the result should be undetermined. The problem also exists when context canceled occurs during prewrite requests of async commit transactions.

What is changed and how it works?

This PR moves the s.rpcError = err statement before the point where the context error is returned, and adds some failpoint tests. However, I think my tests are too tricky, and I didn't find a better way.

Related changes

  • Need to cherry-pick to the release branch
    • release-4.0

Check List

Tests

  • Unit test

Side effects

-

Release note

  • Fix the issue that a transaction with an undetermined result may sometimes be treated as failed.

@ti-srebot
Contributor Author

/run-all-tests

Signed-off-by: MyonKeminta <MyonKeminta@users.noreply.github.com>
@cfzjywxk
Contributor

cfzjywxk commented Nov 9, 2020

LGTM

@ti-srebot ti-srebot added the status/LGT1 Indicates that a PR has LGTM 1. label Nov 9, 2020
Contributor

@sticnarf sticnarf left a comment

LGTM

@ti-srebot ti-srebot added status/LGT2 Indicates that a PR has LGTM 2. and removed status/LGT1 Indicates that a PR has LGTM 1. labels Nov 9, 2020
@cfzjywxk
Contributor

cfzjywxk commented Nov 9, 2020

/merge

@ti-srebot
Contributor Author

Sorry @cfzjywxk, this branch cannot be merged without an approval of release maintainers

@sticnarf
Contributor

sticnarf commented Nov 10, 2020

/lgtm cancel
In my other tests, it causes lots of transactions that do not hit an RPC error to become undetermined. I need to take another look.

Never mind, my issue is related to this TODO:

// TODO: Now we return an undetermined error as long as one of the prewrite

@zz-jason
Member

/approve

@zz-jason
Member

/merge

@ti-srebot ti-srebot added status/can-merge Indicates a PR has been approved by a committer. status/LGT3 The PR has already had 3 LGTM. and removed status/LGT2 Indicates that a PR has LGTM 2. labels Nov 11, 2020
@ti-srebot
Contributor Author

/run-all-tests

@ti-srebot ti-srebot merged commit 5866391 into pingcap:release-4.0 Nov 11, 2020
@tiancaiamao tiancaiamao deleted the release-4.0-af8dee160e11 branch November 30, 2020 06:21
Labels
component/tikv status/can-merge Indicates a PR has been approved by a committer. status/LGT3 The PR has already had 3 LGTM. type/bugfix This PR fixes a bug. type/4.0-cherry-pick

5 participants