-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
rfc: add an RFC describing an improved client.Txn interface #42766
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @andreimatei, @bdarnell, @knz, @nvanbenschoten, and @tbg)
docs/RFCS/20191125_txn_interface_improvements.md, line 53 at r1 (raw file):
requesting a refresh * the `TxnCoordSender` successfully refreshes all the requests prior to *r* and *r1* (note that r*'s read spans have not yet been recorded) and bumps the
note that r
docs/RFCS/20191125_txn_interface_improvements.md, line 188 at r1 (raw file):
An alternative to the concurrency part is to simply dissallow concurrency. We have write pipelining on the server-side, which is already a form of
Write pipelining is a much less effective form of concurrency when the leaseholder is in a remote region.
I think it’s worth looking at different cases where write concurrency on top of pipelining might occur.
Two that come to mind are
- The implementation of some schema changes
Currently they perform multiple batches to write to different system tables. These tables rarely have a collocated leaseholder in global deployments.
- CTEs with multiple mutations
These will be executed sequentially IIUC.
In both these cases it might be even better to fuse separate write batches into a single batch than to run batches in parallel (up to some point). However, especially for the first case, fusing batches makes it difficult to craft abstractions which operate on a client.Txn
.
I’m not sure I have a concrete proposal re: fusing batches as it feels like a radically different programming model. Enabling parallel writes seems more approachable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for putting these thoughts in writing. I recognize your proposal from a month or two ago. I like the API you propose too, and 💯 on aligning the proposal with the savepoints RFC.
Reviewed 1 of 1 files at r1.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @andreimatei, @bdarnell, @nvanbenschoten, and @tbg)
docs/RFCS/20191125_txn_interface_improvements.md, line 19 at r1 (raw file):
concurrent actors before restarting. # Motivation
I think some xrefs with the TCS tech note would be useful in this section.
docs/RFCS/20191125_txn_interface_improvements.md, line 128 at r1 (raw file):
### Refreshes
I think the alternatives in this section would benefit from sequence diagrams akin to those in the tech note. The visual sequencing would clarify the prosaic explanation.
docs/RFCS/20191125_txn_interface_improvements.md, line 183 at r1 (raw file):
allow further operations on the handle that encountered the retriable error. However, this seems likely to confuse users.
The TCS tech note identifies various kinds of errors, not all of which are retry errors. What are the non-retry errors that would keep a txnhandle able to process further requests?
docs/RFCS/20191125_txn_interface_improvements.md, line 187 at r1 (raw file):
## Rationale and Alternatives An alternative to the concurrency part is to simply dissallow concurrency. We
nit: disallow
docs/RFCS/20191125_txn_interface_improvements.md, line 200 at r1 (raw file):
behavior of concurrent overlapping reads and writes is undefined. Is there a reason to allow a particular handle to be associated with a different read sequence number so it can be isolated from concurrent overlapping writes?
I could see the constraint checks for multiple previous CTEs in a single statement run concurrently with each other at different timestamps? This would be an optimization of course, but maybe it's desirable. I'd ask Radu.
docs/RFCS/20191125_txn_interface_improvements.md, line 206 at r1 (raw file):
extant concurrent actors , whereas the point of leaves is to allow txn state to be disseminated to remote nodes, updated , and collected back. So I don't think there's much to unify, but perhaps someone can open my eyes.
I'd say that a Leaf is a particular kind of txnhandle, one that can do fewer things. But it probably should implement the interface.
docs/RFCS/20191125_txn_interface_improvements.md, line 208 at r1 (raw file):
there's much to unify, but perhaps someone can open my eyes. 3. Should we take the opportunity and make a `client.Txn` unusable after a `TransactionAbortedError`? The current code is pretty surprising - the
yes!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @andreimatei, @nvanbenschoten, and @tbg)
docs/RFCS/20191125_txn_interface_improvements.md, line 142 at r1 (raw file):
recorded and their response is returned to the client), then we can refresh. We’d also block new requests from starting. The problem here is that, if one of these concurrent requests causes the refresh to fail, it’s too to retry that
s/too/too late/
docs/RFCS/20191125_txn_interface_improvements.md, line 144 at r1 (raw file):
these concurrent requests causes the refresh to fail, it’s too to retry that request at the new read timestamp (since the result was already delivered to the client). This seems suboptimal, but it’s also analogous the handling of all the
Just to be explicit, the downside here is that we'd return a user-visible retry error, right? Would option 2 or 3 be able to avoid that?
... allowing for concurrent use of a txn. Release note: None
... allowing for concurrent use of a txn.
Release note: None