[poc,dnm]: storage: use learner replicas
This is a PR just to show some code to the interested parties. The real thing
to look at here is the explanation and suggested strategy below. Don't
review the code.

----

Learner replicas are full Replicas minus the right to vote (i.e., they
don't count toward quorum). They are interesting to us for two reasons:
they allow us to phase out preemptive snapshots (via [delayed preemptive
snaps] as a migration strategy; see there for all the things we don't
like about preemptive snapshots), and they can help us avoid spending
more time in vulnerable configurations than we need to.

To see an example of the latter, assume we're trying to upreplicate from
three replicas to five replicas. As is, we need to add a fourth replica,
wait for a preemptive snapshot, and add the fifth. We spend
approximately the duration of one preemptive snapshot in an even replica
configuration (four voters need three for quorum, so the range tolerates
no more failures than it did with three voters). In theory, we could
send two preemptive snapshots first and then carry out two replica
additions, but doing this with learners would be much more
straightforward and less error-prone. This doesn't solve the problems in
cockroachdb#12768, but it helps avoid them.
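
The quorum arithmetic behind "vulnerable configurations" can be sketched as follows. This is only an illustration of majority-quorum math; the function names `quorum` and `tolerated` are made up for this example and are not part of the code in this commit.

```go
package main

import "fmt"

// quorum returns the number of votes needed for a majority among the
// given number of voting replicas. Learners are not voters and are
// therefore not counted here.
func quorum(voters int) int { return voters/2 + 1 }

// tolerated returns how many voters can fail while a majority remains.
func tolerated(voters int) int { return voters - quorum(voters) }

func main() {
	for _, n := range []int{3, 4, 5} {
		fmt.Printf("voters=%d quorum=%d tolerated failures=%d\n",
			n, quorum(n), tolerated(n))
	}
}
```

With four voters the range still tolerates only one failure, but there is one more replica that can fail, which is why time spent in the even configuration is worth minimizing.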

This PR shows the bare minimum of code changes to upreplicate using
learner replicas and suggests further steps to make them a reality.

In principle, using learner replicas requires very few changes. Our
current up-replication code does this:

1. send a preemptive snapshot
2. run the ChangeReplicas txn, which adds a `ReplicaDescriptor` to the
replicated range descriptor and, on commit, induces a Raft configuration
change (`raftpb.ConfChangeAddNode`).

The new up-replication code (note that the old one has to stick around
for compatibility):

1. run the ChangeReplicas txn, which adds a `ReplicaDescriptor` to the
replicated range descriptor with the `Learner` flag set to true and, on
commit, induces a Raft configuration change
(`raftpb.ConfChangeAddLearnerNode`).
2. wait for the learner to catch up, or proactively send it a Raft
snapshot (either works; we just have to make sure not to duplicate work).
3. run a ChangeReplicas txn which removes the `Learner` flag from the
`ReplicaDescriptor` and induces a Raft conf change of
`raftpb.ConfChangeAddNode` (upgrading the learner).
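
The learner-based steps above can be sketched as a toy state machine. This is not the code in this commit: `rangeDescriptor`, `replicaDescriptor`, `addLearner`, `promoteLearner`, and the `confChange` constants are hypothetical stand-ins defined locally (the constants only mirror the names of the `raftpb` conf change types mentioned above), and the `caughtUp` flag stands in for whatever check step 2 ends up using.

```go
package main

import (
	"errors"
	"fmt"
)

// confChange is a local stand-in for the Raft conf change type that a
// committed ChangeReplicas txn would induce.
type confChange string

const (
	confChangeAddLearnerNode confChange = "ConfChangeAddLearnerNode"
	confChangeAddNode        confChange = "ConfChangeAddNode"
)

// replicaDescriptor is a simplified, hypothetical replica entry.
type replicaDescriptor struct {
	NodeID  int
	Learner bool
}

// rangeDescriptor is a simplified, hypothetical range descriptor.
type rangeDescriptor struct {
	Replicas []replicaDescriptor
}

// addLearner models step 1: add a descriptor with Learner set to true.
func (d *rangeDescriptor) addLearner(nodeID int) (confChange, error) {
	for _, r := range d.Replicas {
		if r.NodeID == nodeID {
			return "", errors.New("replica already present")
		}
	}
	d.Replicas = append(d.Replicas, replicaDescriptor{NodeID: nodeID, Learner: true})
	return confChangeAddLearnerNode, nil
}

// promoteLearner models step 3: clear the Learner flag once the learner
// has caught up (step 2), turning it into a voter.
func (d *rangeDescriptor) promoteLearner(nodeID int, caughtUp bool) (confChange, error) {
	if !caughtUp {
		return "", errors.New("learner has not caught up")
	}
	for i := range d.Replicas {
		if d.Replicas[i].NodeID == nodeID && d.Replicas[i].Learner {
			d.Replicas[i].Learner = false
			return confChangeAddNode, nil
		}
	}
	return "", errors.New("no such learner")
}

func main() {
	desc := &rangeDescriptor{Replicas: []replicaDescriptor{{NodeID: 1}, {NodeID: 2}, {NodeID: 3}}}
	cc, err := desc.addLearner(4)
	fmt.Println(cc, err)
	cc, err = desc.promoteLearner(4, true)
	fmt.Println(cc, err)
}
```

Note that the voter set only changes in the final step, so the range never waits on a snapshot while in an even voter configuration.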

The existence of learners will require updates throughout the allocator,
so that it realizes that they don't count toward quorum and makes sure
they are either upgraded or removed in a timely manner. None of that is
in this POC.
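
As a rough idea of the kind of allocator bookkeeping this implies, here is a minimal sketch. Everything in it is an assumption made for illustration: the `replica` type, the `LearnerFor` field, the `voterCount` and `staleLearners` helpers, and the five-minute threshold do not exist in this POC or in the allocator.

```go
package main

import (
	"fmt"
	"time"
)

// replica is a hypothetical view of a replica as the allocator might see it.
type replica struct {
	NodeID     int
	Learner    bool
	LearnerFor time.Duration // how long the replica has been a learner
}

// voterCount counts only voting replicas; learners don't count toward quorum.
func voterCount(rs []replica) int {
	n := 0
	for _, r := range rs {
		if !r.Learner {
			n++
		}
	}
	return n
}

// staleLearners returns the node IDs of learners that have been learners
// for longer than maxAge and should be either upgraded or removed.
func staleLearners(rs []replica, maxAge time.Duration) []int {
	var ids []int
	for _, r := range rs {
		if r.Learner && r.LearnerFor > maxAge {
			ids = append(ids, r.NodeID)
		}
	}
	return ids
}

func main() {
	replicas := []replica{
		{NodeID: 1}, {NodeID: 2}, {NodeID: 3},
		{NodeID: 4, Learner: true, LearnerFor: 10 * time.Minute},
		{NodeID: 5, Learner: true, LearnerFor: 30 * time.Second},
	}
	fmt.Println("voters:", voterCount(replicas))
	fmt.Println("needs attention:", staleLearners(replicas, 5*time.Minute))
}
```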

[delayed preemptive snaps]: cockroachdb#35786

Release note: None
tbg committed Mar 15, 2019
1 parent 70e3468 commit 412c9b8
Showing 12 changed files with 460 additions and 306 deletions.
1 change: 1 addition & 0 deletions c-deps/libroach/protos/roachpb/data.pb.cc
3 changes: 2 additions & 1 deletion c-deps/libroach/protos/roachpb/data.pb.h
44 changes: 35 additions & 9 deletions c-deps/libroach/protos/roachpb/metadata.pb.cc
32 changes: 32 additions & 0 deletions c-deps/libroach/protos/roachpb/metadata.pb.h