-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sql: rollback to savepoint after schema change can lead to deadlock #24885
Comments
…epoint Prior to this patch, the following sequence would write a row that's then unaccessible: begin; savepoint cockroach_restart; create table t(x int primary key); rollback to savepoint cockroach_restart; insert into t(x) values(1); release savepoint cockroach_restart; commit; The insert manages to resolve the table because we weren't clearing the "uncommitted descriptors" set that the session maintains. Since the table descriptor doesn't end up being committed, the row will be essentially unreachable. This patch is original work on the 1.1 branch. This bug is not present in the 2.0 release, where we're better about reseting state on transaction retries. I believe on 1.1 there's still a similar issue present when doing automatic retries (as opposed to user-directed ones through rollback to savepoint). I think fixing those would be more involved because it's less clear where to stick the cleanup done in this patch; I'd rather not do anything this point. Also note that, with the fix, the sequence above doesn't actually work. Instead, it deadlocks, just like it does on 2.0, because of cockroachdb#24885. However, the following works: begin; savepoint cockroach_restart; create table t(x int primary key); rollback to savepoint cockroach_restart; create table t(x int primary key); insert into t(x) values(1); release savepoint cockroach_restart; commit; Fixes cockroachdb#24785
A similar deadlock happens when you create a table in an open transaction and run some commands that require that schema in other transactions. |
Note to self: #24888 added a test that (when ported to 2.0) can be simplified if we fix this issue. |
@vivekmenezes that's not a deadlock, is it? That's normal reader blocks for writer behavior. |
@andreimatei yeah you're right that's just requests blocking on an uncommitted intent. Please add the test in #24888 to 2.0 so we don't regress on it. Thanks! |
24888: sql: Fix "dangling rows" left over by schema change + rollback to savepoint r=andreimatei a=andreimatei Prior to this patch, the following sequence would write a row that's then unaccessible: begin; savepoint cockroach_restart; create table t(x int primary key); rollback to savepoint cockroach_restart; insert into t(x) values(1); release savepoint cockroach_restart; commit; The insert manages to resolve the table because we weren't clearing the "uncommitted descriptors" set that the session maintains. Since the table descriptor doesn't end up being committed, the row will be essentially unreachable. This patch is original work on the 1.1 branch. This bug is not present in the 2.0 release, where we're better about reseting state on transaction retries. I believe on 1.1 there's still a similar issue present when doing automatic retries (as opposed to user-directed ones through rollback to savepoint). I think fixing those would be more involved because it's less clear where to stick the cleanup done in this patch; I'd rather not do anything this point. Also note that, with the fix, the sequence above doesn't actually work. Instead, it deadlocks, just like it does on 2.0, because of #24885. However, the following works: begin; savepoint cockroach_restart; create table t(x int primary key); rollback to savepoint cockroach_restart; create table t(x int primary key); insert into t(x) values(1); release savepoint cockroach_restart; commit; Fixes #24785 cc @cockroachdb/release @gpaul Co-authored-by: Andrei Matei <andrei@cockroachlabs.com>
@andreimatei this isn't 2.1, is it? |
It's not the worst thing in the world. In any case I'll pass it along to @vivekmenezes , thanks for the ping.
deadlocks, but
doesn't. |
@andreimatei I've not studied this issue. It is my intention to move all schema change writes to commit time, which will also fix #20526 I've been holding off on that because of the worry about leases that keep renewing themselves pushing the schema change writer. But I do plan on doing that as soon as epoch based table leases is merged, because epoch based table leases is going to considerably reduce the read demands on the table descriptors |
moving this to the backlog. This is still an extant issue |
Hey @andreimatei, would it be possible to make this happen as part of your SAVEPOINT work this release? I talked with @ajwerner, who suggests that the right way to move forward is to have DDL-containing transactions who get their epoch bumped use a new transaction (which functionally drops all of their outstanding intents) in the conn executor state machine. |
That's what the original message in this thread suggests too. But how is it related to savepoints? Nice try, but I'd rather let someone else do it. I can review the PR, though. |
I thought it was related enough because you are already working heavily in the conn executor state machine - I figured it wouldn't be too much extra trouble to add this too. But maybe I figured wrong. |
I played around a bit. There's a couple of weird things going on.
client 2: close client 1. Client 2 is still blocked! Any new client will also block. Only restarting the server seems to fix it. Closing the connection is supposed to rollback the respective txn, and it generally does. The 2nd interesting thing is that depending on exactly how I issue the operations (or their timing?), I don't always get the deadlock. When the thing deadlocks, there's a TableCollection.getTableVersion() which is deadlocked. The stack looks like this When the final select doesn't deadlock, its trace looks like this:
I'll understand what's going on. |
I don't see any evidence that points to that. The lock issue was related to lease transfers, which don't seem to be at play here. Also, I tested this with the fix and could still reproduce. Doesn't it seem more likely that whatever is deadlocking the single transaction on itself is also blocking the other one? |
Yes, but my concern is with the rollback that's supposed to happen on connection close. The canceling of the connection is supposed to get out of the deadlock by canceling various contexts, and then everything is supposed to be rolled back - and so other clients are supposed to be unblocked. |
Fixing the connection close bug in #45835 |
Back to why we sometimes deadlock and sometimes don't, the difference seems to come down to |
When we don't deadlock because reading Line 380 in 1007293
@otan , could you please tell me what the deal is around here? I thought that the way things work is that once a txn has created some tables, those tables are resolved locally by the tableCollection without needing to go to the store. Is that not the case?
|
Not quite sure, I haven't done too much with table collections. I can try uncover it tomorrow. |
I'm trying to understand Lines 1476 to 1485 in f5fe582
which seems to be the reason we fall back to Lines 376 to 381 in 1007293
it seems we have to fallback to uncached accessor (with no txn context) as the p.Tables() makes it look like it's disappeared, but we want to see if it's actually gone. in that case, it does appear as if using the same txn for that is broken. i'm not sure if not using the txn is the behaviour we always want, though - i.e. does it break anything if we use a fresh transaction for this? |
Oliver, don't worry about it if you've got nothing to do with this. I thought you had introduced that fallback since I saw your name on the blame. I'll bring it to the attention of @ajwerner when I get back to the office; he seems to want to own this zoo. The fallback is only peripheral to this issue anyway. |
I've figured out why the sequence doesn't always deadlock. If the writes to the namespace table happen to run into the tscache, then the respective intents will be written at a pushed timestamp. I've implemented the fix for the deadlock that we were talking about: having the connExecutor transparently switch the KV txn to a new one in certain cases. It's not much code, but it's kinda unfortunate and there's no great place to put it. It makes me a bit unhappy. I'm now thinking that perhaps it'd be better to deal with this problem in KV and have a feature there that says that locks on a particular span of keys must be released when restarting or rolling back to savepoints. I want to discuss it with my peeps. |
This patch makes reading the system.namespace table and reading table descriptors in high-priority transactions. The transactions will push locks owned by schema changes out of their way. The idea is that regular SQL transactions reading schema don't want to wait for the transactions performing DDL statements. Instead, those DDLs will be pushed and forced to refresh. Besides the benefit to regular transactions, this patch also prevents deadlocks for the transactions performing DDL. Before this patch, the final select in the following sequence would deadlock: begin; savepoint s; create table t(x int); rollback to savepoint s; select * from t; The select is reading the namespace table, using a different txn from its own. That read would block on the intent laid down by the prior create. With this patch, the transaction effectively pushes itself, but gets to otherwise run. Fixes cockroachdb#24885 Release justification: Fix for an old bug that became more preminent when we introduced SAVEPOINTs recently. Release note: A rare bug causing transactions that have performed schema changes to deadlock after they restart has been fixed.
46170: sql: resolve schema in high-priority transactions r=andreimatei a=andreimatei This patch makes reading the system.namespace table and reading table descriptors in high-priority transactions. The transactions will push locks owned by schema changes out of their way. The idea is that regular SQL transactions reading schema don't want to wait for the transactions performing DDL statements. Instead, those DDLs will be pushed and forced to refresh. Besides the benefit to regular transactions, this patch also prevents deadlocks for the transactions performing DDL. Before this patch, the final select in the following sequence would deadlock: begin; savepoint s; create table t(x int); rollback to savepoint s; select * from t; The select is reading the namespace table, using a different txn from its own. That read would block on the intent laid down by the prior create. With this patch, the transaction effectively pushes itself, but gets to otherwise run. Fixes #24885 Release justification: Fix for an old bug that became more preminent when we introduced SAVEPOINTs recently. Release note: A rare bug causing transactions that have performed schema changes to deadlock after they restart has been fixed. Co-authored-by: Andrei Matei <andrei@cockroachlabs.com>
interesting, i can still repro #44385 pretty easily with this change in -- now i don't even have to run the "SHOW TABLES" query to get the deadlock. repro:
edit: nvm they just take a minute now:
|
I don't know what's up with with the delay. Mind filing it separately for Lucy or Werner? |
This patch makes reading the system.namespace table and reading table descriptors in high-priority transactions. The transactions will push locks owned by schema changes out of their way. The idea is that regular SQL transactions reading schema don't want to wait for the transactions performing DDL statements. Instead, those DDLs will be pushed and forced to refresh. Besides the benefit to regular transactions, this patch also prevents deadlocks for the transactions performing DDL. Before this patch, the final select in the following sequence would deadlock: begin; savepoint s; create table t(x int); rollback to savepoint s; select * from t; The select is reading the namespace table, using a different txn from its own. That read would block on the intent laid down by the prior create. With this patch, the transaction effectively pushes itself, but gets to otherwise run. Fixes cockroachdb#24885 Release justification: Fix for an old bug that became more preminent when we introduced SAVEPOINTs recently. Release note: A rare bug causing transactions that have performed schema changes to deadlock after they restart has been fixed.
The following deadlocks:
The reason why it deadlocks is that the create left an intent on something (probably a table name record); the intent will never commit because txn's epoch is incremented by the rollback to savepoint, but it still exists. The select then tries to resolve that name in order to get a lease on that table - and that name resolution process uses a different txn which gets blocked on the intent.
Funny enough, if you do another
create
after the rollback, that works - probably because all reads/writes are done using the main txn in that case.Found while investigating #24785
One way to fix this is for transactions that perform schema changes to employ a policy that says that they don't get to keep intents from old epochs - or rather, whenever we would bump the epoch, we'd create a new txn instead. This would mean that they don't get to benefit from the "reservation" effect of leaving intents around, but who cares. The moral justification is that these transactions touched "special" keys.
Another way is to do all lease acquisitions and name resolution and such by using the transaction running the statement that caused the lease. But I don't like that at all; it goes against divorcing the table descriptor caching layer from SQL transactions which I think is generally a great thing.
Opinions?
cc @vivekmenezes @bdarnell
The text was updated successfully, but these errors were encountered: