-
-
Notifications
You must be signed in to change notification settings - Fork 3.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Merged by Bors] - Allow unbatched render phases to use unstable sorts #5049
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just some information in a comment but then lgtm.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reworded a bit and then it is good to go.
Co-authored-by: Robert Swain <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a nice improvement, one nit though.
On further thought, perhaps |
# Objective The descriptions included in the API docs of `entity` module, `Entity` struct, and `Component` trait have some issues: 1. the concept of entity is not clearly defined, 2. descriptions are a little bit out of place, 3. in a case the description leak too many details about the implementation, 4. some descriptions are not exhaustive, 5. there are not enough examples, 6. the content can be formatted in a much better way. ## Solution 1. ~~Stress the fact that entity is an abstract and elementary concept. Abstract because the concept of entity is not hardcoded into the library but emerges from the interaction of `Entity` with every other part of `bevy_ecs`, like components and world methods. Elementary because it is a fundamental concept that cannot be defined with other terms (like point in euclidean geometry, or time in classical physics).~~ We decided to omit the definition of entity in the API docs ([see why]). It is only described in its relationship with components. 2. Information has been moved to relevant places and links are used instead in the other places. 3. Implementation details about `Entity` have been reduced. 4. Descriptions have been made more exhaustive by stating how to obtain and use items. Entity operations are enriched with `World` methods. 5. Examples have been added or enriched. 6. Sections have been added to organize content. Entity operations are now laid out in a table. ### Todo list - [x] Break lines at sentence-level. ## For reviewers - ~~I added a TODO over `Component` docs, make sure to check it out and discuss it if necessary.~~ ([Resolved]) - You can easily check the rendered documentation by doing `cargo doc -p bevy_ecs --no-deps --open`. [see why]: bevyengine#4767 (comment) [Resolved]: bevyengine#4767 (comment)
…el (bevyengine#4663) # Objective Further speed up visibility checking by removing the main sources of contention for the system. ## Solution - ~~Make `ComputedVisibility` a resource wrapping a `FixedBitset`.~~ - ~~Remove `ComputedVisibility` as a component.~~ ~~This adds a one-bit overhead to every entity in the app world. For a game with 100,000 entities, this is 12.5KB of memory. This is still small enough to fit entirely in most L1 caches. Also removes the need for a per-Entity change detection tick. This reduces the memory footprint of ComputedVisibility 72x.~~ ~~The decreased memory usage and less fragmented memory locality should provide significant performance benefits.~~ ~~Clearing visible entities should be significantly faster than before:~~ - ~~Setting one `u32` to 0 clears 32 entities per cycle.~~ - ~~No archetype fragmentation to contend with.~~ - ~~Change detection is applied to the resource, so there is no per-Entity update tick requirement.~~ ~~The side benefit of this design is that it removes one more "computed component" from userspace. Though accessing the values within it are now less ergonomic.~~ This PR changes `crossbeam_channel` in `check_visibility` to use a `Local<ThreadLocal<Cell<Vec<Entity>>>` to mark down visible entities instead. Co-Authored-By: TheRawMeatball <[email protected]> Co-Authored-By: Aevyrie <[email protected]>
# Objective Closes bevyengine#1557. Partially addresses bevyengine#3362. Cleanup the public facing API for storage types. Most of these APIs are difficult to use safely when directly interfacing with these types, and is also currently impossible to interact with in normal ECS use as there is no `World::storages_mut`. The majority of these types should be easy enough to read, and perhaps mutate the contents, but never structurally altered without the same checks in the rest of bevy_ecs code. This both cleans up the public facing types and helps use unused code detection to remove a few of the APIs we're not using internally. ## Solution - Mark all APIs that take `&mut T` under `bevy_ecs::storage` as `pub(crate)` or `pub(super)` - Cleanup after it all. Entire type visibility changes: - `BlobVec` is `pub(super)`, only storage code should be directly interacting with it. - `SparseArray` is now `pub(crate)` for the entire type. It's an implementation detail for `Table` and `(Component)SparseSet`. - `TableMoveResult` is now `pub(crate) --- ## Changelog TODO ## Migration Guide Dear God, I hope not.
# Objective - Builds on top of bevyengine#4938 - Make clustered-forward PBR lighting/shadows functionality callable - See bevyengine#3969 for details ## Solution - Add `PbrInput` struct type containing a `StandardMaterial`, occlusion, world_position, world_normal, and frag_coord - Split functionality to calculate the unit view vector, and normal-mapped normal into `bevy_pbr::pbr_functions` - Split high-level shading flow into `pbr(in: PbrInput, N: vec3<f32>, V: vec3<f32>, is_orthographic: bool)` function in `bevy_pbr::pbr_functions` - Rework `pbr.wgsl` fragment stage entry point to make use of the new functions - This has been benchmarked on an M1 Max using `many_cubes -- sphere`. `main` had a median frame time of 15.88ms, this PR 15.99ms, which is a 0.69% frame time increase, which is within noise in my opinion. --- ## Changelog - Added: PBR shading code is now callable. Import `bevy_pbr::pbr_functions` and its dependencies, create a `PbrInput`, calculate the unit view and normal-mapped normal vectors and whether the projection is orthographic, and call `pbr()`!
…gine#4716) # Objective DioxusLabs and Bevy have taken over maintaining what was our abandoned ui layout dependency [stretch](https://github.com/vislyhq/stretch). Dioxus' fork has had a lot of work done on it by @alice-i-cecile, @Weibye , @jkelleyrtp, @mockersf, @HackerFoo, @TimJentzsch and a dozen other contributors and now is in much better shape than stretch was. The updated crate is called taffy and is available on github [here](https://github.com/DioxusLabs/taffy) ([taffy](https://crates.io/crates/taffy) on crates.io). The goal of this PR is to replace stretch v0.3.2 with taffy v0.1.0. ## Solution I changed the bevy_ui Cargo.toml to depend on taffy instead of stretch and fixed all the errors rustc complained about. --- ## Changelog Changed bevy_ui layout dependency from stretch to taffy (the maintained fork of stretch). fixes bevyengine#677 ## Migration Guide The public api of taffy is different from that of stretch so please advise me on what to do here @alice-i-cecile.
Not sure what's going on with that merge.
Went through with this. It's allows us to provide custom per-phase sorting behavior. Looking at #4291 and @superdump's test results, it seems like for unstable/unbatched phases, With the same benchmark |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the structure and flexibility but I'd like to either benchmark and choose the best radix sort crate based on criteria discussed in #4291, or I think if we want to do that separately, it makes sense to not introduce a new dependency if we are probably going to replace it almost straight away.
Co-authored-by: Robert Swain <[email protected]>
Tested the four aforementioned sorts on
With these results, I think we should stick with |
Yup, ok. I think voracious/rdst may be faster than radsort for sprites, but ok. I'm on board with taking that separately. |
bors r+ |
# Objective Partially addresses #4291. Speed up the sort phase for unbatched render phases. ## Solution Split out one of the optimizations in #4899 and allow implementors of `PhaseItem` to change what kind of sort is used when sorting the items in the phase. This currently includes Stable, Unstable, and Unsorted. Each of these corresponds to `Vec::sort_by_key`, `Vec::sort_unstable_by_key`, and no sorting at all. The default is `Unstable`. The last one can be used as a default if users introduce a preliminary depth prepass. ## Performance This will not impact the performance of any batched phases, as it is still using a stable sort. 2D's only phase is unchanged. All 3D phases are unbatched currently, and will benefit from this change. On `many_cubes`, where the primary phase is opaque, this change sees a speed up from 907.02us -> 477.62us, a 47.35% reduction. ![image](https://user-images.githubusercontent.com/3137680/174471253-22424874-30d5-4db5-b5b4-65fb2c612a9c.png) ## Future Work There were prior discussions to add support for faster radix sorts in #4291, which in theory should be a `O(n)` instead of a `O(nlog(n))` time. [`voracious`](https://crates.io/crates/voracious_radix_sort) has been proposed, but it seems to be optimize for use cases with more than 30,000 items, which may be atypical for most systems. Another optimization included in #4899 is to reduce the size of a few of the IDs commonly used in `PhaseItem` implementations to shrink the types to make swapping/sorting faster. Both `CachedPipelineId` and `DrawFunctionId` could be reduced to `u32` instead of `usize`. Ideally, this should automatically change to use stable sorts when `BatchedPhaseItem` is implemented on the same phase item type, but this requires specialization, which may not land in stable Rust for a short while. --- ## Changelog Added: `PhaseItem::sort` ## Migration Guide RenderPhases now default to a unstable sort (via `slice::sort_unstable_by_key`). This can typically improve sort phase performance, but may produce incorrect batching results when implementing `BatchedPhaseItem`. To revert to the older stable sort, manually implement `PhaseItem::sort` to implement a stable sort (i.e. via `slice::sort_by_key`). Co-authored-by: Federico Rinaldi <[email protected]> Co-authored-by: Robert Swain <[email protected]> Co-authored-by: colepoirier <[email protected]>
# Objective Partially addresses bevyengine#4291. Speed up the sort phase for unbatched render phases. ## Solution Split out one of the optimizations in bevyengine#4899 and allow implementors of `PhaseItem` to change what kind of sort is used when sorting the items in the phase. This currently includes Stable, Unstable, and Unsorted. Each of these corresponds to `Vec::sort_by_key`, `Vec::sort_unstable_by_key`, and no sorting at all. The default is `Unstable`. The last one can be used as a default if users introduce a preliminary depth prepass. ## Performance This will not impact the performance of any batched phases, as it is still using a stable sort. 2D's only phase is unchanged. All 3D phases are unbatched currently, and will benefit from this change. On `many_cubes`, where the primary phase is opaque, this change sees a speed up from 907.02us -> 477.62us, a 47.35% reduction. ![image](https://user-images.githubusercontent.com/3137680/174471253-22424874-30d5-4db5-b5b4-65fb2c612a9c.png) ## Future Work There were prior discussions to add support for faster radix sorts in bevyengine#4291, which in theory should be a `O(n)` instead of a `O(nlog(n))` time. [`voracious`](https://crates.io/crates/voracious_radix_sort) has been proposed, but it seems to be optimize for use cases with more than 30,000 items, which may be atypical for most systems. Another optimization included in bevyengine#4899 is to reduce the size of a few of the IDs commonly used in `PhaseItem` implementations to shrink the types to make swapping/sorting faster. Both `CachedPipelineId` and `DrawFunctionId` could be reduced to `u32` instead of `usize`. Ideally, this should automatically change to use stable sorts when `BatchedPhaseItem` is implemented on the same phase item type, but this requires specialization, which may not land in stable Rust for a short while. --- ## Changelog Added: `PhaseItem::sort` ## Migration Guide RenderPhases now default to a unstable sort (via `slice::sort_unstable_by_key`). This can typically improve sort phase performance, but may produce incorrect batching results when implementing `BatchedPhaseItem`. To revert to the older stable sort, manually implement `PhaseItem::sort` to implement a stable sort (i.e. via `slice::sort_by_key`). Co-authored-by: Federico Rinaldi <[email protected]> Co-authored-by: Robert Swain <[email protected]> Co-authored-by: colepoirier <[email protected]>
# Objective Partially addresses bevyengine#4291. Speed up the sort phase for unbatched render phases. ## Solution Split out one of the optimizations in bevyengine#4899 and allow implementors of `PhaseItem` to change what kind of sort is used when sorting the items in the phase. This currently includes Stable, Unstable, and Unsorted. Each of these corresponds to `Vec::sort_by_key`, `Vec::sort_unstable_by_key`, and no sorting at all. The default is `Unstable`. The last one can be used as a default if users introduce a preliminary depth prepass. ## Performance This will not impact the performance of any batched phases, as it is still using a stable sort. 2D's only phase is unchanged. All 3D phases are unbatched currently, and will benefit from this change. On `many_cubes`, where the primary phase is opaque, this change sees a speed up from 907.02us -> 477.62us, a 47.35% reduction. ![image](https://user-images.githubusercontent.com/3137680/174471253-22424874-30d5-4db5-b5b4-65fb2c612a9c.png) ## Future Work There were prior discussions to add support for faster radix sorts in bevyengine#4291, which in theory should be a `O(n)` instead of a `O(nlog(n))` time. [`voracious`](https://crates.io/crates/voracious_radix_sort) has been proposed, but it seems to be optimize for use cases with more than 30,000 items, which may be atypical for most systems. Another optimization included in bevyengine#4899 is to reduce the size of a few of the IDs commonly used in `PhaseItem` implementations to shrink the types to make swapping/sorting faster. Both `CachedPipelineId` and `DrawFunctionId` could be reduced to `u32` instead of `usize`. Ideally, this should automatically change to use stable sorts when `BatchedPhaseItem` is implemented on the same phase item type, but this requires specialization, which may not land in stable Rust for a short while. --- ## Changelog Added: `PhaseItem::sort` ## Migration Guide RenderPhases now default to a unstable sort (via `slice::sort_unstable_by_key`). This can typically improve sort phase performance, but may produce incorrect batching results when implementing `BatchedPhaseItem`. To revert to the older stable sort, manually implement `PhaseItem::sort` to implement a stable sort (i.e. via `slice::sort_by_key`). Co-authored-by: Federico Rinaldi <[email protected]> Co-authored-by: Robert Swain <[email protected]> Co-authored-by: colepoirier <[email protected]>
# Objective Partially addresses bevyengine#4291. Speed up the sort phase for unbatched render phases. ## Solution Split out one of the optimizations in bevyengine#4899 and allow implementors of `PhaseItem` to change what kind of sort is used when sorting the items in the phase. This currently includes Stable, Unstable, and Unsorted. Each of these corresponds to `Vec::sort_by_key`, `Vec::sort_unstable_by_key`, and no sorting at all. The default is `Unstable`. The last one can be used as a default if users introduce a preliminary depth prepass. ## Performance This will not impact the performance of any batched phases, as it is still using a stable sort. 2D's only phase is unchanged. All 3D phases are unbatched currently, and will benefit from this change. On `many_cubes`, where the primary phase is opaque, this change sees a speed up from 907.02us -> 477.62us, a 47.35% reduction. ![image](https://user-images.githubusercontent.com/3137680/174471253-22424874-30d5-4db5-b5b4-65fb2c612a9c.png) ## Future Work There were prior discussions to add support for faster radix sorts in bevyengine#4291, which in theory should be a `O(n)` instead of a `O(nlog(n))` time. [`voracious`](https://crates.io/crates/voracious_radix_sort) has been proposed, but it seems to be optimize for use cases with more than 30,000 items, which may be atypical for most systems. Another optimization included in bevyengine#4899 is to reduce the size of a few of the IDs commonly used in `PhaseItem` implementations to shrink the types to make swapping/sorting faster. Both `CachedPipelineId` and `DrawFunctionId` could be reduced to `u32` instead of `usize`. Ideally, this should automatically change to use stable sorts when `BatchedPhaseItem` is implemented on the same phase item type, but this requires specialization, which may not land in stable Rust for a short while. --- ## Changelog Added: `PhaseItem::sort` ## Migration Guide RenderPhases now default to a unstable sort (via `slice::sort_unstable_by_key`). This can typically improve sort phase performance, but may produce incorrect batching results when implementing `BatchedPhaseItem`. To revert to the older stable sort, manually implement `PhaseItem::sort` to implement a stable sort (i.e. via `slice::sort_by_key`). Co-authored-by: Federico Rinaldi <[email protected]> Co-authored-by: Robert Swain <[email protected]> Co-authored-by: colepoirier <[email protected]>
Objective
Partially addresses #4291.
Speed up the sort phase for unbatched render phases.
Solution
Split out one of the optimizations in #4899 and allow implementers of
PhaseItem
to change what kind of sort is used when sorting the items in the phase. Given a&mut [Self]
, the implementers must sort the slice in a way that is consistent with the downstream requirements. A default implementation that usesslice::sort_unstable_by_key
is provided.Performance
This will not impact the performance of any batched phases, as it is still using a stable sort. 2D's only phase is unchanged. All 3D phases are unbatched currently, and will benefit from this change.
On
many_cubes
, where the primary phase is opaque, this change sees a speed up from 907.02us -> 477.62us, a 47.35% reduction.Future Work
There were prior discussions to add support for faster radix sorts in #4291, which in theory should be a
O(n)
instead of aO(nlog(n))
time.voracious
has been proposed, but it seems to be optimize for use cases with more than 30,000 items, which may be atypical for most systems.Another optimization included in #4899 is to reduce the size of a few of the IDs commonly used in
PhaseItem
implementations to shrink the types to make swapping/sorting faster. BothCachedPipelineId
andDrawFunctionId
could be reduced tou32
instead ofusize
.Ideally, this should automatically change to use stable sorts when
BatchedPhaseItem
is implemented on the same phase item type, but this requires specialization, which may not land in stable Rust for a short while.Changelog
Added:
PhaseItem::sort
Migration Guide
RenderPhases now default to a unstable sort (via
slice::sort_unstable_by_key
). This can typically improve sort phase performance, but may produce incorrect batching results when implementingBatchedPhaseItem
. To revert to the older stable sort, manually implementPhaseItem::sort
to implement a stable sort (i.e. viaslice::sort_by_key
).