Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

raft: don't allocate slice and sort on every commit #8689

Merged
merged 1 commit into from
Jul 27, 2018

Conversation

nvanbenschoten
Copy link
Contributor

No description provided.

@xiang90
Copy link
Contributor

xiang90 commented Oct 12, 2017

will this bring in an observable perf improvement for cockroachdb?

@nvanbenschoten
Copy link
Contributor Author

I doubt it will be observable, but saving 2 allocations (the slice and the sort.reverse) and often avoiding the need to Sort for each committed entry seems like a win.

@nvanbenschoten
Copy link
Contributor Author

Here's a benchstat diff of this change running on Proposal3Nodes:

name              old time/op    new time/op    delta
Proposal3Nodes-4    5.04µs ±26%    4.62µs ±32%   -8.35%  (p=0.033 n=20+20)

name              old alloc/op   new alloc/op   delta
Proposal3Nodes-4    4.58kB ±42%    4.34kB ±69%     ~     (p=0.192 n=20+20)

name              old allocs/op  new allocs/op  delta
Proposal3Nodes-4      11.5 ±39%       9.7 ±34%  -15.65%  (p=0.013 n=20+20)

The change is also addressing a TODO, so we should remove that TODO if we don't want to optimize this.

@xiang90
Copy link
Contributor

xiang90 commented Jan 3, 2018

@nvanbenschoten can you rebase?

@xiang90
Copy link
Contributor

xiang90 commented Jan 3, 2018

@siddontang can you take a look?

@nvanbenschoten
Copy link
Contributor Author

@xiang90 I just rebased this off master.

@siddontang
Copy link
Contributor

@xiang90 ok

@siddontang
Copy link
Contributor

I suggest using a new PR for the change of Raft learner.

Rest LGTM

@codecov-io
Copy link

codecov-io commented Jan 3, 2018

Codecov Report

Merging #8689 into master will increase coverage by 6.77%.
The diff coverage is 92.3%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #8689      +/-   ##
==========================================
+ Coverage   69.17%   75.94%   +6.77%     
==========================================
  Files         386      359      -27     
  Lines       35891    29961    -5930     
==========================================
- Hits        24827    22755    -2072     
+ Misses       9264     5625    -3639     
+ Partials     1800     1581     -219
Impacted Files Coverage Δ
raft/raft.go 91.64% <92.3%> (-0.28%) ⬇️
main.go 0% <0%> (-100%) ⬇️
wal/file_pipeline.go 75% <0%> (-20.46%) ⬇️
pkg/fileutil/preallocate_unix.go 42.85% <0%> (-14.29%) ⬇️
etcdserver/api/v3rpc/grpc.go 85.71% <0%> (-14.29%) ⬇️
proxy/grpcproxy/register.go 69.44% <0%> (-11.12%) ⬇️
pkg/adt/interval_tree.go 81.38% <0%> (-9.31%) ⬇️
etcdctl/ctlv3/command/snapshot_command.go 64.91% <0%> (-7.31%) ⬇️
pkg/fileutil/lock_linux.go 76.66% <0%> (-6.67%) ⬇️
proxy/grpcproxy/auth.go 94% <0%> (-6%) ⬇️
... and 275 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update c5bef4f...0a415cf. Read the comment docs.

@nvanbenschoten
Copy link
Contributor Author

@siddontang TFTR. I pulled the second commit out into #9088.

raft/raft.go Outdated
@@ -547,13 +548,23 @@ func (r *raft) bcastHeartbeatWithCtx(ctx []byte) {
// the commit index changed (in which case the caller should call
// r.bcastAppend).
func (r *raft) maybeCommit() bool {
// TODO(bmizerany): optimize.. Currently naive
mis := make(uint64Slice, 0, len(r.prs))
if cap(r.matchBuf) < len(r.prs) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

probably document this optimization for avoiding allocations on slice creating and sorting.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

raft/raft.go Outdated
}
mis := r.matchBuf[:len(r.prs)]
idx := 0
sorted := true
for _, p := range r.prs {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it seems that it is low chance that prs happens to be sorted? not sure if this optimization makes sense.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In TiKV, we still sort it every time. maybe we can write a benchmark for this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah the change that this is sorted is low - 1/(n!) if we assume that map iteration is always random. Probably not worth the extra complexity to this function. Removed.

@nvanbenschoten
Copy link
Contributor Author

Reviving this after seeing this allocation account for .92% of allocations in a production CockroachDB cluster.

@xiang90
Copy link
Contributor

xiang90 commented Jul 26, 2018

lgtm. defer to @siddontang or @gyuho

@gyuho
Copy link
Contributor

gyuho commented Jul 26, 2018

I am on vacation till Friday. Will review by Monday. Thanks!

Copy link
Contributor

@gyuho gyuho left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm. Thanks for optimization @nvanbenschoten!

@gyuho gyuho merged commit d16c8b3 into etcd-io:master Jul 27, 2018
nvanbenschoten added a commit to cockroachdb/vendored that referenced this pull request Jul 28, 2018
nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this pull request Jul 28, 2018
This picks up etcd-io/etcd#8689.

Release note: None
craig bot pushed a commit to cockroachdb/cockroach that referenced this pull request Jul 30, 2018
27998: Pick up etcd-io/etcd#8689 r=nvanbenschoten a=nvanbenschoten

This picks up etcd-io/etcd#8689.

Release note: None

28002: storage: avoid double lock in tryGetOrCreateReplica r=nvanbenschoten a=nvanbenschoten

Before this change, `tryGetOrCreateReplica` was RLocking, checking
an uncommon condition, then Locking. This was unnecessary and made
the code look strange. It also resulted in an extra RLock in the
exceptionally common case where the Replica was not destroyed.

This change fixes this by collapsing the destroyStatus check into
the main critical section. There's no real expected performance
improvement here, this is mostly just for cleaner code.

Release note: None

Co-authored-by: Nathan VanBenschoten <nvanbenschoten@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

5 participants