Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

storage: excessively large raft logs #27772

Closed
petermattis opened this issue Jul 19, 2018 · 8 comments · Fixed by #27774
Closed

storage: excessively large raft logs #27772

petermattis opened this issue Jul 19, 2018 · 8 comments · Fixed by #27774
Assignees
Labels
A-kv-client Relating to the KV client and the KV interface. C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. S-1-stability Severe stability issues that can be fixed by upgrading, but usually don’t resolve by restarting
Milestone

Comments

@petermattis
Copy link
Collaborator

We've seen a handful of occurrences of excessively large Raft logs in clusters in the wild without being able to identify the root cause. A large Raft log should normally not occur as the proposal quota mechanism should limit the size of the Raft log if a range is healthy. If one node in the range is down the proposal quota mechanism will drop that replica from consideration, but at the same time allow Raft log truncation to occur, so the size of the Raft log should be limited to whatever Raft log truncation dictates (default 4MiB).

While each follower in a range maintains a Raft log, control of what gets written to the Raft log is in the hands of the leader. No leader and there are no Raft log writes. What happens if there is a leader but the Raft log entries it proposes are never getting applied? This can happen when the range is below quorum. Prior to 2.0, the Raft CheckQuorum mechanism would kick in and the leader would quickly step down. In 2.0 we turned on PreVote and disabled CheckQuorum (@bdarnell are they incompatible?). With CheckQuorum disabled and PreVote enabled a leader can remain the leader forever when the range is below quorum. The PreVote mechanism prevents another range from calling an election and without CheckQuorum enabled the leader won't step down.

Note that the proposal quota mechanism still applies in this scenario and incoming operations will eventually block waiting for quota. There is another mechanism at work. The periodic Raft proposal refresh (which is necessary to deal with dropped proposals) interacts badly with this scenario. In particular, we refresh pending proposals every leader election timeout period (3s?). Refreshing a proposal only results in reproposing if the lease index is still compatible, but if a range is below quorum no other commands will be applied so it seems like we would see reproposals.

The above is a theory based on a reading of the code. It matches the conditions in the clusters that have experienced problems with large Raft logs (i.e. ranges that have gone through long periods of unavailability). @nvanbenschoten is going to work on writing a test to reproduce which should be straightforward to do if the above is correct.

Fixing this problem shouldn't be too difficult. @bdarnell says:

we want to stop refreshRaftProposals(reasonTicks) if liveness indicates that we don't have a quorum of live followers

and

i think we might also want to only do reasonTicks refreshes if we are a follower

the reason that path exists is due to raft's unacknowledged proposal forwarding

if we handled ErrProposalDropped (#21849) and did our own forwarding, i think we might be able to get rid of reasonTicks refreshes completely

@petermattis petermattis added S-1-stability Severe stability issues that can be fixed by upgrading, but usually don’t resolve by restarting A-kv-client Relating to the KV client and the KV interface. labels Jul 19, 2018
@petermattis petermattis added this to the 2.0.x milestone Jul 19, 2018
@nvanbenschoten
Copy link
Member

I wrote up a test case that proves this theory to be valid. See #27774.

@nvanbenschoten
Copy link
Member

Because we use a raft.RawNode instead of the standard raft.node, the locking should be sufficient for us to determine whether we're a leader, a candidate, or a follower whenever we propose a command. One solution to this would then be to mark ProposalData as "non-reproposable" immediately before they have been proposed if the local raft group is the leader. We could then just ignore these proposals in refreshProposalsLocked whenver the reason is reasonTicks.

On a side note, paging this back in/learning new stuff about re-proposals has made me much more excited about #21849. I think we should begin handling raft.ErrProposalDropped soon even if we don't go the extra step of doing our own reliable proposal forwarding. It should be really simple and I fear we're leaving latency on the table for no real reason by ignoring these errors.

@nvanbenschoten
Copy link
Member

nvanbenschoten commented Jul 20, 2018

The following is enough to get the new test (and all others in the storage package) to pass:

--- a/pkg/storage/replica.go
+++ b/pkg/storage/replica.go
@@ -3663,6 +3663,7 @@ func defaultSubmitProposalLocked(r *Replica, p *ProposalData) error {
                // We're proposing a command so there is no need to wake the leader if we
                // were quiesced.
                r.unquiesceLocked()
+               p.proposedByLeader = raftGroup.Status().RaftState == raft.StateLeader
                return false /* unquiesceAndWakeLeader */, raftGroup.Propose(encode(p.idKey, data))
        })
 }
@@ -4595,6 +4596,8 @@ func (r *Replica) refreshProposalsLocked(refreshAtDelta int, reason refreshRaftR
                } else if reason == reasonTicks && p.proposedAtTicks > r.mu.ticks-refreshAtDelta {
                        // The command was proposed too recently, don't bother reproprosing it
                        // yet.
+               } else if reason == reasonTicks && p.proposedByLeader {
+                       // The command could not have been lost.
                } else {
                        // The proposal can still apply according to its MaxLeaseIndex. But it's
                        // likely that the proposal was dropped, so we're going to repropose it
diff --git a/pkg/storage/replica_proposal.go b/pkg/storage/replica_proposal.go
index 9cd31af752..4f69e2a886 100644
--- a/pkg/storage/replica_proposal.go
+++ b/pkg/storage/replica_proposal.go
@@ -65,6 +65,10 @@ type ProposalData struct {
        // last (re-)proposed.
        proposedAtTicks int

+       // proposedByLeader indicates that the node that created the proposal was
+       // the leader when the proposal was proposed.
+       proposedByLeader bool
+
        // command is serialized and proposed to raft. In the event of
        // reproposals its MaxLeaseIndex field is mutated.
        command *storagebase.RaftCommand

@bdarnell
Copy link
Contributor

In 2.0 we turned on PreVote and disabled CheckQuorum (@bdarnell are they incompatible?).

No, PreVote and CheckQuorum are compatible. We treated them as mutually exclusive because they are partially redundant (our original interest in CheckQuorum was for its ability to minimize disruption when recovering from partitions, which is also addressed in a better way by PreVote). The reason we disabled CheckQuorum when we enabled PreVote is that CheckQuorum is not quite compatible with quiesced replicas, and required hacks like the (now removed) tickQuiesced method.

I think we should begin handling raft.ErrProposalDropped soon even if we don't go the extra step of doing our own reliable proposal forwarding. It should be really simple and I fear we're leaving latency on the table for no real reason by ignoring these errors.

What specifically would you do with this error to improve latency? We can't do anything until we know the leader, and when we know we'll generate a reasonNewLeader refresh.

As for the fix, I wasn't thinking we'd need to track whether a particular proposal had been proposed by a leader. All we need is to check whether this replica is currently the leader, and if so ignore reasonTicks. The theory is that if we were not the leader at the time of the previous proposal, we'd get a reasonNewLeader when we became leader, and are therefore not relying on reasonTicks.

Alternately, we could use NodeLiveness, and ignore reasonTicks (regardless of our current state) whenever NodeLiveness indicates that there is not a quorum of live nodes. This is a little more expensive and a little more code than checking the state, but it feels like a narrower fix and might be more suitable to backport (I'm vaguely concerned about missing an edge case if we base everything on raft leadership).

@petermattis
Copy link
Collaborator Author

No, PreVote and CheckQuorum are compatible. We treated them as mutually exclusive because they are partially redundant (our original interest in CheckQuorum was for its ability to minimize disruption when recovering from partitions, which is also addressed in a better way by PreVote). The reason we disabled CheckQuorum when we enabled PreVote is that CheckQuorum is not quite compatible with quiesced replicas, and required hacks like the (now removed) tickQuiesced method.

Ah, so CheckQuorum was providing an unexpected benefit in preventing quorum-less leaders. We should all do some day dreaming about whether there are other scenarios where a quorum-less leader can cause a problem.

@nvanbenschoten
Copy link
Member

What specifically would you do with this error to improve latency? We can't do anything until we know the leader, and when we know we'll generate a reasonNewLeader refresh.

I didn't fully understand the interactions here when posting that. I think I do a little better now. Is it safe to think about ErrProposalDropped errors as an indication that proposals are queueing and that we're waiting for a reasonNewLeader to send them all out?

Alternately, we could use NodeLiveness, and ignore reasonTicks (regardless of our current state) whenever NodeLiveness indicates that there is not a quorum of live nodes. This is a little more expensive and a little more code than checking the state, but it feels like a narrower fix and might be more suitable to backport (I'm vaguely concerned about missing an edge case if we base everything on raft leadership).

I think this would also adress the 5x replication factor issue discussion mentioned in #27774 (comment). Do you see any downsides to tying this to node liveness though? For instance, I'm worried we might run into issues on the node liveness range itself when a node that is currently down is trying to indicate that it is now live. Maybe that's fine because node liveness updates are in their own retry loop anyway.

@bdarnell
Copy link
Contributor

Is it safe to think about ErrProposalDropped errors as an indication that proposals are queueing and that we're waiting for a reasonNewLeader to send them all out?

There is no queuing at the raft level. The raft Propose method does one of three things:

  • If leader, append to log
  • If follower of known leader, send (enqueue) MsgProp to the leader
  • If leader unknown, return ErrProposalDropped

ErrProposalDropped tells us that the leader is unknown, so we could infer that no proposals will succeed until we learn of the new leader.

Do you see any downsides to tying this to node liveness though? For instance, I'm worried we might run into issues on the node liveness range itself when a node that is currently down is trying to indicate that it is now live. Maybe that's fine because node liveness updates are in their own retry loop anyway.

Yeah, as I said on the other issue, there might be trouble with the liveness range itself. The fact that liveness updates have short timeouts and get retried themselves may mitigate it, though.

@nvanbenschoten
Copy link
Member

There is no queuing at the raft level.

Right, I didn't mean actual queuing, I was just making an analogy for the sake of my own understanding. It is safe to ignore an ErrProposalDropped because we are still holding the proposal in Replica.mu.proposals and waiting until we learn about a new leader, at which point we re-propose in MaxLeaseIndex order.

nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this issue Jul 21, 2018
Fixes cockroachdb#27772.

This change adds safeguards to prevent cases where a raft log
would grow without bound during loss of quorum scenarios. It
also adds a new test that demonstrates that the raft log does
not grow without bound in these cases.

There are two cases that need to be handled to prevent the
unbounded raft log growth observed in cockroachdb#27772.
1. When the leader proposes a command and cannot establish a
   quorum. In this case, we know the leader has the entry in
   its log, so there's no need to refresh it with `reasonTicks`.
   To avoid this, we no longer use `refreshTicks` as a leader.
2. When a follower proposes a command that is forwarded to the
   leader who cannot establish a quorum. In this case, the
   follower can't be sure (currently) that the leader got the
   proposal, so it needs to refresh using `reasonTicks`. However,
   the leader now detects duplicate forwarded proposals and
   avoids appending redundant entries to its log. It does so
   by maintaining a set of in-flight forwarded proposals that
   it has received during its term as leader. This set is reset
   after every leadership change.

Both of these cases are tested against in the new
TestLogGrowthWhenRefreshingPendingCommands. Without both of
the safeguards introduced in this commit, the test fails.

Release note (bug fix): Prevent loss of quorum situations from
allowing unbounded growth of a Range's Raft log.
nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this issue Jul 21, 2018
Fixes cockroachdb#27772.

This change adds safeguards to prevent cases where a raft log
would grow without bound during loss of quorum scenarios. It
also adds a new test that demonstrates that the raft log does
not grow without bound in these cases.

There are two cases that need to be handled to prevent the
unbounded raft log growth observed in cockroachdb#27772.
1. When the leader proposes a command and cannot establish a
   quorum. In this case, we know the leader has the entry in
   its log, so there's no need to refresh it with `reasonTicks`.
   To avoid this, we no longer use `refreshTicks` as a leader.
2. When a follower proposes a command that is forwarded to the
   leader who cannot establish a quorum. In this case, the
   follower can't be sure (currently) that the leader got the
   proposal, so it needs to refresh using `reasonTicks`. However,
   the leader now detects duplicate forwarded proposals and
   avoids appending redundant entries to its log. It does so
   by maintaining a set of in-flight forwarded proposals that
   it has received during its term as leader. This set is reset
   after every leadership change.

Both of these cases are tested against in the new
TestLogGrowthWhenRefreshingPendingCommands. Without both of
the safeguards introduced in this commit, the test fails.

Release note (bug fix): Prevent loss of quorum situations from
allowing unbounded growth of a Range's Raft log.
@tbg tbg added the C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. label Jul 22, 2018
nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this issue Jul 23, 2018
Fixes cockroachdb#27772.

This change adds safeguards to prevent cases where a raft log
would grow without bound during loss of quorum scenarios. It
also adds a new test that demonstrates that the raft log does
not grow without bound in these cases.

There are two cases that need to be handled to prevent the
unbounded raft log growth observed in cockroachdb#27772.
1. When the leader proposes a command and cannot establish a
   quorum. In this case, we know the leader has the entry in
   its log, so there's no need to refresh it with `reasonTicks`.
   To avoid this, we no longer use `refreshTicks` as a leader.
2. When a follower proposes a command that is forwarded to the
   leader who cannot establish a quorum. In this case, the
   follower can't be sure (currently) that the leader got the
   proposal, so it needs to refresh using `reasonTicks`. However,
   the leader now detects duplicate forwarded proposals and
   avoids appending redundant entries to its log. It does so
   by maintaining a set of in-flight forwarded proposals that
   it has received during its term as leader. This set is reset
   after every leadership change.

Both of these cases are tested against in the new
TestLogGrowthWhenRefreshingPendingCommands. Without both of
the safeguards introduced in this commit, the test fails.

Release note (bug fix): Prevent loss of quorum situations from
allowing unbounded growth of a Range's Raft log.
nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this issue Jul 23, 2018
Fixes cockroachdb#27772.

This change adds safeguards to prevent cases where a raft log
would grow without bound during loss of quorum scenarios. It
also adds a new test that demonstrates that the raft log does
not grow without bound in these cases.

There are two cases that need to be handled to prevent the
unbounded raft log growth observed in cockroachdb#27772.
1. When the leader proposes a command and cannot establish a
   quorum. In this case, we know the leader has the entry in
   its log, so there's no need to refresh it with `reasonTicks`.
   To avoid this, we no longer use `refreshTicks` as a leader.
2. When a follower proposes a command that is forwarded to the
   leader who cannot establish a quorum. In this case, the
   follower can't be sure (currently) that the leader got the
   proposal, so it needs to refresh using `reasonTicks`. However,
   the leader now detects duplicate forwarded proposals and
   avoids appending redundant entries to its log. It does so
   by maintaining a set of in-flight forwarded proposals that
   it has received during its term as leader. This set is reset
   after every leadership change.

Both of these cases are tested against in the new
TestLogGrowthWhenRefreshingPendingCommands. Without both of
the safeguards introduced in this commit, the test fails.

Release note (bug fix): Prevent unbounded growth of the raft log
caused by a loss of quorum.
craig bot pushed a commit that referenced this issue Jul 23, 2018
27774: storage: prevent unbounded raft log growth without quorum r=nvanbenschoten a=nvanbenschoten

Fixes #27772.

This change adds safeguards to prevent cases where a raft log
would grow without bound during loss of quorum scenarios. It
also adds a new test that demonstrates that the raft log does
not grow without bound in these cases.

There are two cases that need to be handled to prevent the
unbounded raft log growth observed in #27772.
1. When the leader proposes a command and cannot establish a
   quorum. In this case, we know the leader has the entry in
   its log, so there's no need to refresh it with `reasonTicks`.
   To avoid this, we no longer use `refreshTicks` as a leader.
2. When a follower proposes a command that is forwarded to the
   leader who cannot establish a quorum. In this case, the
   follower can't be sure (currently) that the leader got the
   proposal, so it needs to refresh using `reasonTicks`. However,
   the leader now detects duplicate forwarded proposals and
   avoids appending redundant entries to its log. It does so
   by maintaining a set of in-flight forwarded proposals that
   it has received during its term as leader. This set is reset
   after every leadership change.

Both of these cases are tested against in the new
TestLogGrowthWhenRefreshingPendingCommands. Without both of
the safeguards introduced in this commit, the test fails.

Release note (bug fix): Prevent loss of quorum situations from
allowing unbounded growth of a Range's Raft log.

Co-authored-by: Nathan VanBenschoten <nvanbenschoten@gmail.com>
@craig craig bot closed this as completed in #27774 Jul 23, 2018
nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this issue Aug 6, 2018
Fixes cockroachdb#27772.

This change adds safeguards to prevent cases where a raft log
would grow without bound during loss of quorum scenarios. It
also adds a new test that demonstrates that the raft log does
not grow without bound in these cases.

There are two cases that need to be handled to prevent the
unbounded raft log growth observed in cockroachdb#27772.
1. When the leader proposes a command and cannot establish a
   quorum. In this case, we know the leader has the entry in
   its log, so there's no need to refresh it with `reasonTicks`.
   To avoid this, we no longer use `refreshTicks` as a leader.
2. When a follower proposes a command that is forwarded to the
   leader who cannot establish a quorum. In this case, the
   follower can't be sure (currently) that the leader got the
   proposal, so it needs to refresh using `reasonTicks`. However,
   the leader now detects duplicate forwarded proposals and
   avoids appending redundant entries to its log. It does so
   by maintaining a set of in-flight forwarded proposals that
   it has received during its term as leader. This set is reset
   after every leadership change.

Both of these cases are tested against in the new
TestLogGrowthWhenRefreshingPendingCommands. Without both of
the safeguards introduced in this commit, the test fails.

Release note (bug fix): Prevent unbounded growth of the raft log
caused by a loss of quorum.
nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this issue Aug 6, 2018
Fixes cockroachdb#27772.

This change adds safeguards to prevent cases where a raft log
would grow without bound during loss of quorum scenarios. It
also adds a new test that demonstrates that the raft log does
not grow without bound in these cases.

There are two cases that need to be handled to prevent the
unbounded raft log growth observed in cockroachdb#27772.
1. When the leader proposes a command and cannot establish a
   quorum. In this case, we know the leader has the entry in
   its log, so there's no need to refresh it with `reasonTicks`.
   To avoid this, we no longer use `refreshTicks` as a leader.
2. When a follower proposes a command that is forwarded to the
   leader who cannot establish a quorum. In this case, the
   follower can't be sure (currently) that the leader got the
   proposal, so it needs to refresh using `reasonTicks`. However,
   the leader now detects duplicate forwarded proposals and
   avoids appending redundant entries to its log. It does so
   by maintaining a set of in-flight forwarded proposals that
   it has received during its term as leader. This set is reset
   after every leadership change.

Both of these cases are tested against in the new
TestLogGrowthWhenRefreshingPendingCommands. Without both of
the safeguards introduced in this commit, the test fails.

Release note (bug fix): Prevent unbounded growth of the raft log
caused by a loss of quorum.
craig bot pushed a commit that referenced this issue Aug 6, 2018
27868: backport-2.0: storage: prevent unbounded raft log growth without quorum r=nvanbenschoten a=nvanbenschoten

Backport 2/2 commits from #27774.

/cc @cockroachdb/release

---

Fixes #27772.

This change adds safeguards to prevent cases where a raft log
would grow without bound during loss of quorum scenarios. It
also adds a new test that demonstrates that the raft log does
not grow without bound in these cases.

There are two cases that need to be handled to prevent the
unbounded raft log growth observed in #27772.
1. When the leader proposes a command and cannot establish a
   quorum. In this case, we know the leader has the entry in
   its log, so there's no need to refresh it with `reasonTicks`.
   To avoid this, we no longer use `refreshTicks` as a leader.
2. When a follower proposes a command that is forwarded to the
   leader who cannot establish a quorum. In this case, the
   follower can't be sure (currently) that the leader got the
   proposal, so it needs to refresh using `reasonTicks`. However,
   the leader now detects duplicate forwarded proposals and
   avoids appending redundant entries to its log. It does so
   by maintaining a set of in-flight forwarded proposals that
   it has received during its term as leader. This set is reset
   after every leadership change.

Both of these cases are tested against in the new
TestLogGrowthWhenRefreshingPendingCommands. Without both of
the safeguards introduced in this commit, the test fails.

Release note (bug fix): Prevent loss of quorum situations from
allowing unbounded growth of a Range's Raft log.


28225: release-2.0: importccl: Preserve '\r\n' during CSV import r=dt a=dt

Backport 1/1 commits from #28181.

/cc @cockroachdb/release

---

See #25344.


Co-authored-by: Nathan VanBenschoten <nvanbenschoten@gmail.com>
Co-authored-by: neeral <neeral@users.noreply.github.com>
Co-authored-by: David Taylor <tinystatemachine@gmail.com>
nvanbenschoten added a commit to nvanbenschoten/etcd that referenced this issue Oct 9, 2018
The suggested pattern for Raft proposals is that they be retried
periodically until they succeed. This turns out to be an issue
when a leader cannot commit entries because the leader will continue
to append re-proposed entries to its log without committing anything.
This can result in the uncommitted tail of a leader's log growing
without bound until it is able to commit entries.

This change add a safeguard to protect against this case where a
leader's log can grow without bound during loss of quorum scenarios.
It does so by introducing a new, optional `MaxUncommittedEntries`
configuration. This config limits the max number of uncommitted
entries that may be appended to a leader's log. Once this limit
is exceeded, proposals will begin to return ErrProposalDropped
errors.

See cockroachdb/cockroach#27772
nvanbenschoten added a commit to nvanbenschoten/etcd that referenced this issue Oct 9, 2018
The suggested pattern for Raft proposals is that they be retried
periodically until they succeed. This turns out to be an issue
when a leader cannot commit entries because the leader will continue
to append re-proposed entries to its log without committing anything.
This can result in the uncommitted tail of a leader's log growing
without bound until it is able to commit entries.

This change add a safeguard to protect against this case where a
leader's log can grow without bound during loss of quorum scenarios.
It does so by introducing a new, optional `MaxUncommittedEntries`
configuration. This config limits the max number of uncommitted
entries that may be appended to a leader's log. Once this limit
is exceeded, proposals will begin to return ErrProposalDropped
errors.

See cockroachdb/cockroach#27772
nvanbenschoten added a commit to nvanbenschoten/etcd that referenced this issue Oct 11, 2018
The suggested pattern for Raft proposals is that they be retried
periodically until they succeed. This turns out to be an issue
when a leader cannot commit entries because the leader will continue
to append re-proposed entries to its log without committing anything.
This can result in the uncommitted tail of a leader's log growing
without bound until it is able to commit entries.

This change add a safeguard to protect against this case where a
leader's log can grow without bound during loss of quorum scenarios.
It does so by introducing a new, optional ``MaxUncommittedEntriesSize
configuration. This config limits the max aggregate size of uncommitted
entries that may be appended to a leader's log. Once this limit
is exceeded, proposals will begin to return ErrProposalDropped
errors.

See cockroachdb/cockroach#27772
nvanbenschoten added a commit to nvanbenschoten/etcd that referenced this issue Oct 12, 2018
The suggested pattern for Raft proposals is that they be retried
periodically until they succeed. This turns out to be an issue
when a leader cannot commit entries because the leader will continue
to append re-proposed entries to its log without committing anything.
This can result in the uncommitted tail of a leader's log growing
without bound until it is able to commit entries.

This change add a safeguard to protect against this case where a
leader's log can grow without bound during loss of quorum scenarios.
It does so by introducing a new, optional ``MaxUncommittedEntriesSize
configuration. This config limits the max aggregate size of uncommitted
entries that may be appended to a leader's log. Once this limit
is exceeded, proposals will begin to return ErrProposalDropped
errors.

See cockroachdb/cockroach#27772
nvanbenschoten added a commit to nvanbenschoten/etcd that referenced this issue Oct 12, 2018
The suggested pattern for Raft proposals is that they be retried
periodically until they succeed. This turns out to be an issue
when a leader cannot commit entries because the leader will continue
to append re-proposed entries to its log without committing anything.
This can result in the uncommitted tail of a leader's log growing
without bound until it is able to commit entries.

This change add a safeguard to protect against this case where a
leader's log can grow without bound during loss of quorum scenarios.
It does so by introducing a new, optional ``MaxUncommittedEntriesSize
configuration. This config limits the max aggregate size of uncommitted
entries that may be appended to a leader's log. Once this limit
is exceeded, proposals will begin to return ErrProposalDropped
errors.

See cockroachdb/cockroach#27772
nvanbenschoten added a commit to nvanbenschoten/etcd that referenced this issue Oct 14, 2018
The suggested pattern for Raft proposals is that they be retried
periodically until they succeed. This turns out to be an issue
when a leader cannot commit entries because the leader will continue
to append re-proposed entries to its log without committing anything.
This can result in the uncommitted tail of a leader's log growing
without bound until it is able to commit entries.

This change add a safeguard to protect against this case where a
leader's log can grow without bound during loss of quorum scenarios.
It does so by introducing a new, optional ``MaxUncommittedEntriesSize
configuration. This config limits the max aggregate size of uncommitted
entries that may be appended to a leader's log. Once this limit
is exceeded, proposals will begin to return ErrProposalDropped
errors.

See cockroachdb/cockroach#27772
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-kv-client Relating to the KV client and the KV interface. C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. S-1-stability Severe stability issues that can be fixed by upgrading, but usually don’t resolve by restarting
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants