Optimise memory deallocation for stable log entries #96

Open: wants to merge 2 commits into main

52 changes: 37 additions & 15 deletions in log_unstable.go
@@ -152,30 +152,52 @@ func (u *unstable) stableTo(i, t uint64) {
 			"entry at (%d,%d) in unstable log; ignoring", i, t, i, gt)
 		return
 	}
-	num := int(i + 1 - u.offset)
-	u.entries = u.entries[num:]
+	nextUnstableEntryIndex := int(i + 1 - u.offset)
+	u.shrinkEntriesSlice(nextUnstableEntryIndex)
 	u.offset = i + 1
 	u.offsetInProgress = max(u.offsetInProgress, u.offset)
-	u.shrinkEntriesArray()
 }

-// shrinkEntriesArray discards the underlying array used by the entries slice
+// shrinkEntriesSlice discards the underlying array used by the entries slice
 // if most of it isn't being used. This avoids holding references to a bunch of
 // potentially large entries that aren't needed anymore. Simply clearing the
 // entries wouldn't be safe because clients might still be using them.
-func (u *unstable) shrinkEntriesArray() {
-	// We replace the array if we're using less than half of the space in
-	// it. This number is fairly arbitrary, chosen as an attempt to balance
-	// memory usage vs number of allocations. It could probably be improved
-	// with some focused tuning.
-	const lenMultiple = 2
-	if len(u.entries) == 0 {
+func (u *unstable) shrinkEntriesSlice(nextUnstableEntryIndex int) {
+	if nextUnstableEntryIndex == len(u.entries) { //all log entries have been successfully stored
Contributor (inline comment):

nit: Please add a whitespace between // and the comment. Same below.

 		u.entries = nil
-	} else if len(u.entries)*lenMultiple < cap(u.entries) {
-		newEntries := make([]pb.Entry, len(u.entries))
-		copy(newEntries, u.entries)
-		u.entries = newEntries
+	} else if u.doEntriesNeedCompaction(nextUnstableEntryIndex) {
+		u.compactEntriesSlice(nextUnstableEntryIndex)
+	} else {
+		u.entries = u.entries[nextUnstableEntryIndex:]
 	}
 }

+func (u *unstable) doEntriesNeedCompaction(nextUnstableEntryIndex int) bool {
+	//Eligible for compaction if the stable entries occupy more than (1/lenMultiple) of either:
+	// a) the underlying-array indexes, OR
+	// b) the total memory occupied by the entries (both stable and unstable)
+	const lenMultiple = 2
+
+	countStableEntries := nextUnstableEntryIndex
+	if countStableEntries > cap(u.entries)/lenMultiple {
+		return true
+	}
Comment on lines +181 to 184

pav-kv (Contributor) commented on Sep 15, 2023:
This heuristic is different from the proposed idea in #71. It still honours only "visible" entries in deciding when to compact; and so it is susceptible to the same problem. The idea that was proposed is that we should instead track the full allocated size of the underlying slice.

This heuristic also seems different from the original: it checks the ratio of stable entries vs. cap; previously it checked the ratio of all entries vs. cap. Any particular reason why you went with this heuristic?
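To make the contrast concrete, here is a minimal sketch of the two conditions being compared, with made-up numbers (the helper names and the 100/40/10 state are hypothetical, not from the PR):

```go
package main

import "fmt"

// oldHeuristic mirrors the pre-PR check: reallocate when the live entries
// occupy less than half of the backing array's capacity.
func oldHeuristic(liveLen, capacity int) bool {
	const lenMultiple = 2
	return liveLen*lenMultiple < capacity
}

// prHeuristic mirrors the PR's first condition: compact when the stable
// prefix alone occupies more than half of the backing array's capacity.
func prHeuristic(stableCount, capacity int) bool {
	const lenMultiple = 2
	return stableCount > capacity/lenMultiple
}

func main() {
	// Hypothetical state: array of 100 slots, 40 live entries, 10 newly stable.
	// The old check fires (40*2 < 100); the PR's check does not (10 <= 50).
	fmt.Println(oldHeuristic(40, 100), prHeuristic(10, 100)) // → true false
}
```

The two predicates look at different quantities (all live entries vs. only the stable prefix), so they can disagree on the same slice state, which is the point raised above.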

Author replied:
hi @pavelkalinnikov, thanks for the review
I forgot that cap() measures the backing array only from the slice's first element to the array's end, not from the array's beginning.

I had another thought: at the time of shrinking the entries slice, if we remove references to the shrunk pb.Entry elements, then two things will happen:

  1. with no more references, the shrunk pb.Entry values immediately become eligible for GC (if not in use elsewhere)
  2. when capacity runs low, append() will allocate an entirely new backing array, at which point the preceding unreachable slots (which hold empty pb.Entry values after the above change) also become collectable

this seemed like a much cleaner solution to me

if this approach seems feasible, then we can optimise the design further by storing pointers to pb.Entry instead of values, to further minimise the memory cost of holding shrunk values in the array until append() replaces the backing array
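The release-references idea above can be sketched as follows. This is a standalone illustration, not the PR's code: `Entry` is a stand-in for `pb.Entry`, and `shrinkWithRelease` is a hypothetical helper.

```go
package main

import "fmt"

// Entry is a stand-in for pb.Entry; Data models the potentially large payload.
type Entry struct {
	Index uint64
	Data  []byte
}

// shrinkWithRelease zeroes the stable prefix before re-slicing, so the shared
// backing array drops its references and the old payloads become collectable
// even while the backing array itself is still retained.
func shrinkWithRelease(entries []Entry, nextUnstableEntryIndex int) []Entry {
	for i := 0; i < nextUnstableEntryIndex; i++ {
		entries[i] = Entry{} // release the reference to the payload
	}
	return entries[nextUnstableEntryIndex:]
}

func main() {
	ents := []Entry{
		{Index: 1, Data: make([]byte, 1024)},
		{Index: 2, Data: make([]byte, 1024)},
		{Index: 3, Data: make([]byte, 1024)},
	}
	backing := ents // second view of the same backing array
	ents = shrinkWithRelease(ents, 2)
	// The slice now holds only the unstable tail, and the zeroed slots in the
	// backing array no longer pin the old payloads.
	fmt.Println(len(ents), backing[0].Data == nil, backing[1].Data == nil) // → 1 true true
}
```

Note the trade-off the comment mentions: the empty `Entry` structs still occupy the array slots until `append()` eventually reallocates, which is what motivates the follow-up suggestion to store pointers.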

Author added:
hi @pavelkalinnikov, could you please share your review of the above approach?


+	var totalSize, stableEntriesSize int
+	for ind, val := range u.entries {
+		if ind < nextUnstableEntryIndex {
+			stableEntriesSize += val.Size()
+		}
+		totalSize += val.Size()
+	}
+	return stableEntriesSize > totalSize/lenMultiple
+}

+func (u *unstable) compactEntriesSlice(nextUnstableEntryIndex int) {
+	countUnstableEntries := len(u.entries) - nextUnstableEntryIndex
+	unstableEntries := make([]pb.Entry, countUnstableEntries)
+	copy(unstableEntries, u.entries[nextUnstableEntryIndex:])
+	u.entries = unstableEntries
+}
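A short aside on why compaction copies rather than re-slices (a generic Go illustration, not code from this PR): re-slicing from the front leaves `cap()` spanning to the end of the original backing array, so that whole allocation stays reachable; copying into a fresh slice does not.

```go
package main

import "fmt"

func main() {
	// A slice of 8 live elements inside a 16-slot backing array.
	entries := make([]int, 8, 16)

	// Re-slicing keeps the original backing array alive: the prefix slots are
	// unreachable through the slice but still part of the same allocation.
	resliced := entries[6:] // len 2, cap 16-6 = 10

	// Copying allocates a minimal new backing array, letting the old one go.
	copied := make([]int, len(resliced))
	copy(copied, resliced) // len 2, cap 2

	fmt.Println(cap(resliced), cap(copied)) // → 10 2
}
```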

func (u *unstable) stableSnapTo(i uint64) {