Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Merged by Bors] - Sprite Batching #3060

Closed
wants to merge 2 commits into from

Conversation

cart
Copy link
Member

@cart cart commented Nov 3, 2021

This implements the following:

  • Sprite Batching: Collects sprites in a vertex buffer to draw many sprites with a single draw call. Sprites are batched by their Handle<Image> within a specific z-level. When possible, sprites are opportunistically batched across z-levels (when no sprites with a different texture exist between two sprites with the same texture on different z levels). With these changes, I can now get ~130,000 sprites at 60fps on the bevymark_pipelined example.
  • Sprite Color Tints: The Sprite type now has a color field. Non-white color tints result in a specialized render pipeline that passes the color in as a vertex attribute. I chose to specialize this because passing vertex colors has a measurable price (without colors I get ~130,000 sprites on bevymark, with colors I get ~100,000 sprites). "Colored" sprites cannot be batched with "uncolored" sprites, but I think this is fine because the chance of a "colored" sprite needing to batch with other "colored" sprites is generally probably way higher than an "uncolored" sprite needing to batch with a "colored" sprite.
  • Sprite Flipping: Sprites can be flipped on their x or y axis using Sprite::flip_x and Sprite::flip_y. This is also true for TextureAtlasSprite.
  • Simpler BufferVec/UniformVec/DynamicUniformVec Clearing: improved the clearing interface by removing the need to know the size of the final buffer at the initial clear.

image

Note that this moves sprites away from entity-driven rendering and back to extracted lists. We could use entities here, but it necessitates that an intermediate list is allocated / populated to collect and sort extracted sprites. This redundant copy, combined with the normal overhead of spawning extracted sprite entities, brings bevymark down to ~80,000 sprites at 60fps. I think making sprites a bit more fixed (by default) is worth it. I view this as acceptable because batching makes normal entity-driven rendering pretty useless anyway (and we would want to batch most custom materials too). We can still support custom shaders with custom bindings, we'll just need to define a specific interface for it.

@cart cart added the A-Rendering Drawing game state to the screen label Nov 3, 2021
@cart cart added this to the Bevy 0.6 milestone Nov 3, 2021
@bilsen
Copy link
Contributor

bilsen commented Nov 3, 2021

Very nice!

Note that this moves sprites away from entity-driven rendering and back to extracted lists. We could use entities here, but it necessitates that an intermediate list is allocated / populated to collect and sort extracted sprites. This redundant copy, combined with the normal overhead of spawning extracted sprite entities, brings bevymark down to ~80,000 sprites at 60fps.

Honestly a bit of a bummer. Things like this is why I dont understand why entities are cleared every frame (the destiny presentation didnt make me any wiser). The need to sync entities in the app world with the render world seems like a relatively small price to pay (both performance and api wise) for allowing performant "idiomatic" ecs usage in the render app. I understand that this is unlikely to change for 0.6 however, and that I or someone will have to make an actual benchmark to compare these methods.

I view this as acceptable because batching makes normal entity-driven rendering pretty useless anyway

Could you explain this a bit more?

@cart
Copy link
Member Author

cart commented Nov 4, 2021

@bilsen

Honestly a bit of a bummer. Things like this is why I dont understand why entities are cleared every frame (the destiny presentation didnt make me any wiser). The need to sync entities in the app world with the render world seems like a relatively small price to pay (both performance and api wise) for allowing performant "idiomatic" ecs usage in the render app. I understand that this is unlikely to change for 0.6 however, and that I or someone will have to make an actual benchmark to compare these methods.

Could you explain [batching makes normal entity-driven rendering pretty useless anyway] a bit more?

I think its worth experimenting with using change detection everywhere for every component to populate retained data structures. But I predict that this will be both a pain, error prone, and possibly also expensive. I also don't think this directly relates to the batching issue at play here. Even if we retained this data across frames, we'd still need to sort + batch query results, which would still require an intermediate data structure. And batching inherently decouples drawing from individual source entities, so we still have the "drawing is no longer directly entity driven, but rather the product of entity aggregration" problem, even with retained data.

@cart
Copy link
Member Author

cart commented Nov 4, 2021

I think this is in a good spot to merge. Lots of good ideas and conversations here though. Feel free to keep those going.

@cart
Copy link
Member Author

cart commented Nov 4, 2021

bors r+

bors bot pushed a commit that referenced this pull request Nov 4, 2021
This implements the following:

* **Sprite Batching**: Collects sprites in a vertex buffer to draw many sprites with a single draw call. Sprites are batched by their `Handle<Image>` within a specific z-level. When possible, sprites are opportunistically batched _across_ z-levels (when no sprites with a different texture exist between two sprites with the same texture on different z levels). With these changes, I can now get ~130,000 sprites at 60fps on the `bevymark_pipelined` example.
* **Sprite Color Tints**: The `Sprite` type now has a `color` field. Non-white color tints result in a specialized render pipeline that passes the color in as a vertex attribute. I chose to specialize this because passing vertex colors has a measurable price (without colors I get ~130,000 sprites on bevymark, with colors I get ~100,000 sprites). "Colored" sprites cannot be batched with "uncolored" sprites, but I think this is fine because the chance of a "colored" sprite needing to batch with other "colored" sprites is generally probably way higher than an "uncolored" sprite needing to batch with a "colored" sprite.
* **Sprite Flipping**: Sprites can be flipped on their x or y axis using `Sprite::flip_x` and `Sprite::flip_y`. This is also true for `TextureAtlasSprite`.
* **Simpler BufferVec/UniformVec/DynamicUniformVec Clearing**:  improved the clearing interface by removing the need to know the size of the final buffer at the initial clear.

![image](https://user-images.githubusercontent.com/2694663/140001821-99be0d96-025d-489e-9bfa-ba19c1dc9548.png)


Note that this moves sprites away from entity-driven rendering and back to extracted lists. We _could_ use entities here, but it necessitates that an intermediate list is allocated / populated to collect and sort extracted sprites. This redundant copy, combined with the normal overhead of spawning extracted sprite entities, brings bevymark down to ~80,000 sprites at 60fps. I think making sprites a bit more fixed (by default) is worth it. I view this as acceptable because batching makes normal entity-driven rendering pretty useless anyway (and we would want to batch most custom materials too). We can still support custom shaders with custom bindings, we'll just need to define a specific interface for it.
@bors
Copy link
Contributor

bors bot commented Nov 4, 2021

@bors bors bot changed the title Sprite Batching [Merged by Bors] - Sprite Batching Nov 4, 2021
@bors bors bot closed this Nov 4, 2021
@bilsen
Copy link
Contributor

bilsen commented Nov 4, 2021

@cart

I think its worth experimenting with using change detection everywhere for every component to populate retained data structures. But I predict that this will be both a pain, error prone, and possibly also expensive

With the current built in change detection it might be more expensive, but if there was an option for push-based change detection instead I cannot fathom why it would be any slower than how things are currently. In most cases you could even do "in-place" parallel extraction using something like RenderParam<Query<&mut ExtractedMesh>>, only reserving commands for the rare occasions when entities are spawned/despawned.

I'm by no means a perf expert though, so might be something im overlooking.

Personally I think components not being retained is a pain, It complicates multi-frame dependent rendering, such as with TAA. Maybe its not that big of a deal in general though. As for bugs, I hardly think forgetting to spawn/despawn entities when their rendercomponents are added/removed will be a significant issue. Could probably even make a generic plugin that handles that automatically.

Even if we retained this data across frames, we'd still need to sort + batch query results, which would still require an intermediate data structure

Yea I guess thats true. I just dont want everything in the render world to end up in vectors. It sounded like atleast some of the overhead came from having to spawn the extracted sprites, which I think would be reduced if these entities were retained.

Anyways, just some thoughts.

@cart cart mentioned this pull request Nov 5, 2021
@alice-i-cecile alice-i-cecile mentioned this pull request Nov 5, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-Rendering Drawing game state to the screen
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants