storage: measure/optimize intent resolution #7503
When operations on a Range encounter intents which they manage to push (see also #627), they propose `ResolveIntent` operations to Raft (which in turn either proposes them directly, or relays to the leader). It's wasteful to propose individual commands like that, especially if there are a lot of them. There are two immediate optimizations that come to mind:
This should be possible because there's no client waiting at the sending end of an intent resolution, and so it can be carried out as a "best effort", the result of the Raft command being whatever it would be without the attempt at (separately) resolving the intents.
This is written for a pre-leader-proposed-Raft world, but translates over.
We don't have definitive evidence that intent resolution is a problem, but it's obvious that it clogs Raft for certain transactional workloads: in some scenarios one write means one intent (or even an unbounded number of intents for range deletes). Timeouts during intent resolution are commonly observed on even moderately busy clusters, and since we put back-pressure in this path, it may lead to all sorts of issues.
Measuring the amount of intent resolution traffic appropriately is a good first step.