Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Smaller extracted sprite struct #3556

Closed
wants to merge 5 commits into from
Closed

Smaller extracted sprite struct #3556

wants to merge 5 commits into from

Conversation

inodentry
Copy link
Contributor

@inodentry inodentry commented Jan 5, 2022

Optimize the ExtractedSprite struct to have smaller size.

With a large number of sprites (like in bevymark/many_sprites) this saves memory and slightly improves perf.

The struct size has been reduced from 176 bytes to 144 bytes.

This is accomplished by:

  • Storing the color as u32 instead of Color, in the format that is later passed to the shader. The (cheap) conversion is moved to the Extract stage.
  • Storing the transform as GlobalTransform instead of Mat4. This moves the expensive matrix computation into the Prepare stage, instead of the Extract stage.

Due to touching the sorting function in prepare_sprites, this PR will have a merge conflict with #3554. One of us will have to rebase, depending on who makes it in first. I used unstable_sort and partial_cmp here, as that PR does (we already know it makes an improvement).

Reduces struct size from 176 to 160.
Reduces struct size from 160 to 144.

Moves expensive matrix computation from Extract stage to Prepare stage.
@inodentry inodentry added A-Rendering Drawing game state to the screen C-Code-Quality A section of code that is hard to understand or change C-Performance A change motivated by improving speed, memory usage or compile times labels Jan 5, 2022
@github-actions github-actions bot added the S-Needs-Triage This issue needs to be labelled label Jan 5, 2022
@inodentry inodentry removed the S-Needs-Triage This issue needs to be labelled label Jan 5, 2022
@@ -487,6 +487,15 @@ impl Color {
} => [hue, saturation, lightness, alpha],
}
}

pub fn as_linear_abgr_u32(self) -> u32 {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this going to cause a loss of fidelity in our colors due to discretization?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No change from before. We have already been doing this. It was just done in prepare_sprites. The color passed to the shader was already encoded like this.

I just moved that computation earlier to the Extract stage, so that the color can be stored the same way in the ExtractedSprite struct, too, making it smaller.

* Mat4::from_translation(
alignment_offset * scale_factor + text_glyph.position.extend(0.),
);
let mut text_transform = transform.clone();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this change break text rotation? The information no longer seems to exist.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No... it's cloning the original transform, which preserves everything.

Also, I ran the example (after the commit fixing it), and it works. ;)

@@ -487,6 +487,15 @@ impl Color {
} => [hue, saturation, lightness, alpha],
}
}

pub fn as_linear_abgr_u32(self) -> u32 {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this be a const fn?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only if as_linear_rgba_f32 and friends, are also made const fn. Not my call to make.

It should probably have an inline annotation, though. Added that.

@@ -337,7 +338,7 @@ pub fn prepare_sprites(
let mut current_batch_colored = false;
let mut last_z = 0.0;
for extracted_sprite in extracted_sprites.sprites.iter() {
let colored = extracted_sprite.color != Color::WHITE;
let colored = extracted_sprite.color != 0xFFFFFFFF;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Losing the nice name here is unfortunate. Ideally we could store at least this value as a named constant (although it is very recognizable).

Copy link
Contributor Author

@inodentry inodentry Jan 5, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could add a line just before:

const WHITE: u32 = 0xFFFFFFFF;

But I don't think that would add anything to help clarity 😆 the constant is pretty obvious as is.

@mockersf
Copy link
Member

mockersf commented Jan 5, 2022

Due to touching the sorting function in prepare_sprites, this PR will have a merge conflict with #3554. One of us will have to rebase, depending on who makes it in first. I used unstable_sort and partial_cmp here, as that PR does (we already know it makes an improvement).

On my laptop, if I remove the improvements from #3554, this PR makes many_sprites run the same fps as without it or slower by a tiny variation.

@inodentry
Copy link
Contributor Author

OK, on further investigation, my compositor was obscuring the results and I wasn't seeing the performance changes very well. I just tested it in Windows, and this paints a different picture.

This PR does indeed actually make things slower. I am very confused. I also tried just the first change with the color encoding, which is quite trivial, and it also reduced perf. The struct is smaller, the CPU computations should be the same. ... What/Why?

I guess I will close this PR.

@inodentry inodentry closed this Jan 5, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-Rendering Drawing game state to the screen C-Code-Quality A section of code that is hard to understand or change C-Performance A change motivated by improving speed, memory usage or compile times
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants