Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Garbage collect event memory #4832

Closed
wants to merge 4 commits into from

Conversation

james7132
Copy link
Member

Objective

As mentioned briefly in discussions in bevyengine/rfcs#53, the memory used for events is never garbage collected despite the data being stored transiently. Bevy should release that memory when possible to avoid said memory from being held for too long.

Solution

Use Vec::shrink_to to halve the capacity of the the backing Vec for events after clearing during updates.

@james7132 james7132 added A-ECS Entities, components, systems, and events C-Performance A change motivated by improving speed, memory usage or compile times labels May 24, 2022
@DJMcNab
Copy link
Member

DJMcNab commented May 24, 2022

The problem with anything like this is that certain workloads will have a ping-pong effect of allocating, deallocating and reallocating. Have we checked what the total amount of capacity being taken up by this is in a reasonable workflow?

For example, moving the mouse will stop for at least a few frames, then as soon as the mouse starts moving again, we'd have to allocate again (potentially reallocating too).

@james7132
Copy link
Member Author

The primary motivator here is RFC linked above, which suggests we move hierarchy alteration detection to an event instead of incurring heavier archetype moves via PreviousParent instead. This will lead to large bursts of events being introduced as scenes are loaded, scaling linearly with the number of entities loaded, which would otherwise never be reclaimed without this.

The actual shrink should be pretty cheap given the Vec is emptied before the reallocation.

Given that events tend to be either streamed in at a consistent pace (roughly the same number of events per frame) or sent in bursts, potentially many frames apart, I think this strategy accounts for both fairly well. In the consistent case, capacity is retained to roughly the current usage, otherwise if it isn't being used it exponentially decays every two frames. In the aforementioned mouse example, the movements per frame will trigger a consistent flow of events frame to frame, requiring only reallocations on the ramp up. As it falls off, the capacity decays, but provides a window for it to more easily ramp up if need be.

If we don't do this automatically, we should at least provide the tools/APIs for allowing users to manually do it themselves. IMO it's irresponsible of us to permanently allocate memory on behalf of the user, and provide them no ways to rein it in.

@cart
Copy link
Member

cart commented May 30, 2022

Yeah I think this is reasonable. I think something like this is good as a starting point to ensure memory is reclaimed (which does feel like table stakes).

That being said, providing a user-configurable "off button" for garbage collection would make me more comfortable merging this. I'm relatively certain that there will be cases that would benefit from that, and it seems pretty easy to add in.

Something like:

enum EventGarbageCollection {
  ExponentialFallOff {
    /// Each internal vec never goes below this value.
    /// There are two internal vecs, so the total minimum capacity is 2 * min_capacity.
    min_capacity: usize,
  },
  None,
}

Note that I added min_capacity because it seems like it would be an easy add and would give users (and us) an additional tool to avoid reallocations.

We can provide additional configuration and tweaks to optimize as needed.

@james7132
Copy link
Member Author

james7132 commented May 31, 2022

Should we have this on every Events<T> resource or as a global resource? and how would we configure this at initialization?

@cart
Copy link
Member

cart commented May 31, 2022

Should we have this on every Events resource or as a global resource? and how would we configure this at initialization?

I think it should be for each Events<T> resource, as each event type will have different allocation patterns.

I'm thinking there are two paths we could take:

  1. Store the config on each Events<T> resource. add_event::<T> just sets it to some default value. Configuring it would involve grabbing the resource and modifying the config on it.
  2. Add an EventSettings<T> resource (like we do for plugins). Pass that in to Events::update inside Events::update_system.

As a riff on (2), we could consider replacing add_event::<T>() with a more standard add_plugin(EventPlugin::<T>::default()) (or keep the extension method, but register a plugin inside add_event::<T>().

I think (2) is more "idiomatic bevy", so that has my preference. But I'm open to suggestions.

@alice-i-cecile
Copy link
Member

alice-i-cecile commented Jun 1, 2022

This 100% needs to be controlled separately for each event-type due to differing usage patterns. User-controllable defaults might be good though.

Approach 3: use an Event trait, and configure it in there. We had previously discussed this for e.g. smallvec storage, and it helps improve type-safety and automatic documentation of code.

The downside to that style of design of course is that it's fixed at the type level and can't be configured by users or dynamically modified. I think those are serious limitations, and will interfere with the ability to automatically explore micro-optimizations down the line in user code.

Of all these options, I think I prefer approach 2, so we're not fetching the config data every single time. And moving this to a more standard plugin model is good, but we should keep the convenience method (or do autoregistration).

}

/// Settings for controlling the garbage collection behavior of [`Events<T>`].
pub enum EventGarbageCollection {
Copy link
Member

@alice-i-cecile alice-i-cecile Jun 1, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really like this flavor of design; there are a number of other settings that I'd like to be able to configure for events (e.g. smallvec usage and cleanup strategy).

As a result, I think we should refactor add_event to use a generic plugin under the hood.


/// Settings for controlling the garbage collection behavior of [`Events<T>`].
pub enum EventGarbageCollection {
/// Exponentially decays the amount of allocated memory for the backing event buffer
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// Exponentially decays the amount of allocated memory for the backing event buffer
/// Halves the amount of allocated memory for the backing event buffer each tick

@@ -448,8 +494,12 @@ impl<E: Event> Events<E> {
}

/// A system that calls [`Events::update`] once per frame.
pub fn update_system(mut events: ResMut<Self>) {
events.update();
pub fn update_system(mut events: ResMut<Self>, settings: Option<Res<EventSettings<E>>>) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a nice use of an optional resource; I like the way it avoids polluting the resource list with default values.

Co-authored-by: Alice Cecile <alice.i.cecile@gmail.com>
Copy link
Contributor

@djeedai djeedai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

my 2c: This strategy although interesting for some scenarios is probably too aggressive to be the default one, and we should consider a smoother version kicking in only after a few frames. And as always, if the caller knows for a fact there's an atypical pattern on one frame, we should offer an escape hatch to manually trigger a GC, as the caller always has more context to decide what's best.

One thing also: here the proposed scheme will fall flat if some recurring bursts occur alternatively on frame A and frame B (e.g. every 3 frames), as we only look at the size of the current buffer to decide if it needs to shrink. We should probably have at least a slightly more complex heuristic that looks at both buffers to determine an expected capacity for next frame.

Also, I'm not sure what order of magnitude of memory allocations we're talking about here, but if this is a strong concern, we should maybe consider using a single circular buffer instead of 2 swapping vecs. The advantage is that there's a single allocation covering 2 frames together, so it will more easily absorb bursts and other ping-pong effects, and it naturally solves the issue of the 2-buffer heuristic mentioned above. The ideal data structure is probably similar to a deque, with blocks of N items (to reduce the number of allocations), and the possibility to add more blocks to handle bursts (here, only at one end, unlike deque).

if let Some(settings) = settings {
events.update(&settings);
} else {
events.update(&Default::default());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does default() works here (and below)? Or we need the extra Default:: qualification?

let new_capacity = match settings.garbage_collection {
EventGarbageCollection::ExponentialFalloff { min_capacity } => self
.events_b
.len()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know how Vec::shrink_to() is implemented, but if the length of the vec decreases slowly frame-to-frame, this will cause a (non no-op) call to Vec::shrink_to() every frame, which potentially will lead to a reallocation every frame depending on the implementation of Vec::shrink_to() (which I imagine has some kind of threshold). This should probably be at least tested / validated with various sizes to understand the behavior, because if the implementation is a bit naive and it ends up doing one allocation per frame then that defeats the purpose of having reusable buffers in the first place.

More generally the exponential decay here is very aggressive. Given the cost of allocations compared the relative benefit of recovering memory, it's probably worth having a smoother heuristic which observes the trend over several frames before kicking in some GC operation. Or at least make that an option, and make that option the default one, and leave that aggressive instant-trigger decay variant for those few Events<T> which we know for a fact are prone to large bursts for which we do not want to pay the cost of a large unused allocation, and instead prefer paying the cost of a reallocation. But I don't think reallocating as soon as possible should be the default.

@alice-i-cecile alice-i-cecile added the S-Needs-Benchmarking This set of changes needs performance benchmarking to double-check that they help label Jun 3, 2022
@JoJoJet JoJoJet mentioned this pull request Jul 17, 2022
@Weibye Weibye added the S-Adopt-Me The original PR author has no intent to complete this work. Pick me up! label Aug 10, 2022
@richchurcher
Copy link
Contributor

Closing due to inactivity as part of backlog cleanup, and treating the linked RFC bevyengine/rfcs#53 as the tracking issue. cc: @alice-i-cecile in case I got this one wrong, bit above my pay grade 🙂

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-ECS Entities, components, systems, and events C-Performance A change motivated by improving speed, memory usage or compile times S-Adopt-Me The original PR author has no intent to complete this work. Pick me up! S-Needs-Benchmarking This set of changes needs performance benchmarking to double-check that they help
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants