Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Push must not use header.Txn #2300

Closed
tbg opened this issue Aug 31, 2015 · 1 comment
Closed

Push must not use header.Txn #2300

tbg opened this issue Aug 31, 2015 · 1 comment

Comments

@tbg
Copy link
Member

tbg commented Aug 31, 2015

since that means that it acts on behalf of the transaction. This has two issues:

  • it means that the transaction can have multiple things going on in parallel (async intent resolution pushes and the client's main control thread can have the Writing flag race), which can mess with the continuity of the txn record
  • we don't allow multiple coordinators to take part in a transaction, but the push doesn't know which coordinator the client is on and always uses its local one.
@tbg
Copy link
Member Author

tbg commented Aug 31, 2015

side note: @spencerkimball already did this for ResolveIntent in his WIP.

tbg added a commit to tbg/cockroach that referenced this issue Sep 4, 2015
spencerkimball added a commit that referenced this issue Sep 12, 2015
This removes the dependency on RequestHeader.Txn, which was a bad practice
and potentially problematic as it's not semantically correct. The request
header txn value is meant always to indicate direct client operations using
a txn, and this isn't the case for pushes. Amongst other things, the request
to push will go through the coordinator local to the node encountering the
intent, which is not necessarily the coordinator which the client is bound
to for the txn.

Fixes #2300.
tbg added a commit to tbg/cockroach that referenced this issue Sep 15, 2015
see cockroachdb#2130, cockroachdb#1998 for context.

* change `(*DistSender).Send` to wrap all requests in a single-
  element Batch (TxnCoordSender unwrapped any `Batch` previously).
* `(*Store).ExecuteCmd` unwraps this single-element `Batch`.
* verifyPermissions: check only the inside of BatchRequest.
* introduce `Reverse` flag on `BatchRequest` to use instead of checking for
  `ReverseScanRequest`. Goal: whole `Batch` is either `Reverse` or not, and
  any range-op in it is simply going to be using the reverse order. Not being
  able to mix reverse/non-reverse requests stems from the change in descriptor
  lookup logic.
* implemented `proto.Combinable` for `BatchRequest`; changed it to return an
`error`.
* implement logic to deal with `proto.Countable` and `proto.Bounded` in `Send`.
* batch expansion at (*Store.ExecuteCmd)
  updated ExecuteCmd so that it properly unrolls a Batch.
  Note that it has to do hacky things with `Key` and `EndKey`
  to make sure they are truncated appropriately if the `BatchRequest`
  is (after all, that's all `DistSender` adapts for cross-range ops).
  Other than that, we mostly propagate the whole header from the Batch
  and the Txn (if any) from each response to the next request.

  `TxnCoordSender` still unrolls Batches and `(*DistSender).Send` still
  wraps each individual request in a new Batch.

  Some store tests use batches "properly", and so does `resolveWriteIntentError`,
  so the new functionality is actually tested and it took a bit to get all
  tests to pass again.

  Note also that the first simplification was made: the test sender in ./storage
  lost its fake batch handling.
* complete Is{Read,Write,...} for BatchRequest
* add batch-compatible abstractions
  checking for EndTransaction and intents uses a (likely temp)
  helper that produces correct results for `BatchRequest`.
* proper truncation of batch at DistSender
  this change removes the provisional truncation code in
  `(*Store).ExecuteCmd` in favor of cleaned up code in
  `(*DistSender).Send/sendAttempt`, which has also seen
  some refactoring and many new TODOs.
* workaround for non-atomicity of batch on Store
  a retry half-way through a batch may leave an earlier CPut
  fail upon retry because it reads the intent it previously
  wrote.
* refactor range descriptor cache
  keep the recursion in LookupRange and adapt the
* retryableLocalSender -> DistSender
  remove retryableLocalSender and plug in a
  DistSender which uses a LocalSender as its
  rangeDescriptorDB.

  KV tests still fail occasionally, but this
  is likely due to lack of non-atomicity on
  Store.
  Now after rebase, they fail but I'm sure
  I can fix that.
  ./kv and ./storage pass with a few tests disabled (which are expected to fail)
  ./sql passes in 17s, have to check why
* {client->batch}.Sender
* refactor DistSender and TxnCoordSender
* send only batches from client
* don't send replies on errors (at least in {TxnCoord,Dist}Sender)
  we're now pretty close to req->(resp,err) everywhere.
* re-enable most storage tests
  changed multiTestContext and some other test contexts used
  to use a DistSender attached to a LocalSender (instead of
  previously only LocalSender). This allows re-enabling all
  of the tests which had to be disabled because they would
  involve batches which required multi-range logic not
  available without a DistSender inbetween.
* fix server.TestIntentResolution
* expand kv.TestTruncate, refactor truncate() slightly
* re-add auto txn wrap for cross-range reqs
* CmdID generation in ChunkingSender
* various test fixes
* fix RangeLookup when in reverse mode looking up EndKey of range with intent
* tweak RangeLookup logic so that empty results are never returned. Panic if it
  does happen.
* client workaround for request errors
  the client code wants to know which request failed, but
  with the (response, error) pattern no reply is returned
  on error, so no reply header can be used to find out the
  location of the error.
  this needs further fixing, see cockroachdb#1891.
* deactivate some assertions pending cockroachdb#2300
* disabled TestTree
* feedback and shape-up for merge
  attempt to remove issues slowing down tests.
  those likely arose since sending of batches
  is more sensitive to intent errors (since
  a single "command" now covers more keys)
* descNext -> bool in DistSender
* fixed various comments/TODOs
* fix rds<->kvs correspondence in RangeLookup when
  scanning extra descriptor in Reverse mode
* disable txn auto gc'ing (b/c store atomicity missing)
* make trace.Finalize() panic-aware (prints active trace to stdout, re-panics).
* carry an error over in the temporary batch unwrapping logic
* clarify the ConditionalPut hack
* disable two tests temporarily
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant