-
-
Notifications
You must be signed in to change notification settings - Fork 3.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use string interning to avoid boxing labels #7760
Conversation
Example |
Example |
1 similar comment
Example |
Hi, I have taken some ideas of this PR and #5715 and put them together and expanded them in #7762. With that, no allocations are required in many common use cases. One possible issue i see with this approach here is that if two different sets produce the same hash value for whatever reason, we cannot differentiate between them without also taking the string into account. And that string could in theory also be made equal for different sets, in which case this would also not be sufficient. |
Example |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After a quick scan, this definitely looks promising as a "best of both worlds" approach.
crates/bevy_ecs/src/schedule/set.rs
Outdated
/// Type-elided struct whose methods return the same values as its original [`SystemSet`]. | ||
#[derive(Clone, Copy, Eq, PartialEq, Hash)] | ||
|
||
pub struct SystemSetUntyped { | ||
id: SystemSetId, | ||
kind: SystemSetKind, | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the presence of SystemSetId
and SystemSetUntyped
is a little confusing -- could these types be merged? (Of course, doing this would probably mean you can't use define_label!
for SystemSet
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SystemSetId
is just the ID. To replace the boxing in schedule construction, we also need to remember the distinction between normal system sets, system type sets, and anonymous system sets.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, but is it necessary to have SystemSetId
as a separate public type? Can we just use SystemSetUntyped
for any user-facing APIs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, yeah, we can probably keep SystemSetId
internal.
T: Debug + DynHash + ?Sized, | ||
{ | ||
let key = UniqueValue::of(value); | ||
let mut map = INTERNER.get_or_init(default).write().unwrap(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like this to have a more descriptive error message instead of just using unwrap
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any suggestions? Is the standard message about the RwLock
being poisoned not descriptive enough?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIRC, .unwrap()
uses the Debug
implementation of the error type, so the error message will just be PoisonError { ... }
.
Maybe it would be better to bypass poisoning entirely by using .unwrap_or_else(PoisonError::into_inner)
? If a panic occurs while the lock is being held, I believe it would still leave the interner in a valid state.
/// subsequent calls with the same `value` will return the same reference. | ||
pub fn intern_debug_string<T>(value: &T) -> &'static str | ||
where | ||
T: Debug + DynHash + ?Sized, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible to use Hash
instead of DynHash
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure. We can't use Hash
if the label traits still need to be object-safe, but otherwise, yeah.
/// This is expected to have sufficient entropy to be different for each value. | ||
#[derive(Clone, Copy, Eq, PartialEq, Hash)] | ||
struct UniqueValue(TypeId, u64); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My number one concern with this approach is whether or not we can rely on this being a truly unique value. What would the error case look like if an unlucky user saw two of their labels. Can we estimate the probability of this occurring?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In theory, yes we can estimate it. I don't know the exact math though. It depends on the hash functions used.
I think if we can assume all hashes are equally likely, then we can approximate the number of different things n
we'd need to hash to have p
probability of a collision (where p <= 0.5
).
n ≈ sqrt(2 * 2^(bits in hash) * p)
So if a hash is 64 bits and all hashes are equally likely, according to the table you'd need to hash over 180 million things to have about a one-in-a-thousand chance of seeing a collision.
p = 0.001 ≈ 2^-10
n ≈ sqrt(2^1 * 2^64 * 2^-10) = sqrt(2^(1 + 64 - 9)) = sqrt(2^55) = 2^27.5 = 189,812,531
So it's extremely unlikely.
Rust recently extended TypeId
to 128-bits to shrink the already astronomically small odds of collisions because there were a couple collisions reported. I think we basically have to assume TypeId
is sound anyway (and that collision is impossible), So we can worry about value hash collisions, but I don't know if that concern can be justified with a mathematical argument.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#7762 might be better about this, idk.
If reviewing this, please also look at #7762. It's a very similar technique but also does more to avoid allocating at all in the common case where the label is a unit struct or fieldless enum. |
See also #5410, which improves the benchmarks relevant to labels. |
T: Debug + DynHash + ?Sized, | ||
{ | ||
let key = UniqueValue::of(value); | ||
let mut map = INTERNER.get_or_init(default).write().unwrap(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIRC, .unwrap()
uses the Debug
implementation of the error type, so the error message will just be PoisonError { ... }
.
Maybe it would be better to bypass poisoning entirely by using .unwrap_or_else(PoisonError::into_inner)
? If a panic occurs while the lock is being held, I believe it would still leave the interner in a valid state.
type Interner = RwLock<HashMap<UniqueValue, &'static str>>; | ||
static INTERNER: OnceLock<Interner> = OnceLock::new(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As far as I can tell, you only ever call .write()
on this, so it would be better to just use a mutex here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think trading the RwLock
for a Mutex
could end up causing some weird behavior in hypothetical situations where systems or sub-apps in different threads try to construct or run schedules at the same time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It isn't currently being used like a RwLock
, though. Since you only call .write()
, this is basically just a mutex with more overhead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What I mean is, doesn't that prevent different threads from reading the interned strings at the same time? How much more overhead does a RwLock
have?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A RwLock
only allows multiple threads to read it at once if you use .read()
. Since you're only using .write()
, it behaves like a mutex and only one thread can access the interner at once.
In order to allow multiple threads to read it at once you'd need to refactor this to only call .write()
when a string is not already interned.
if let Some(str) = INTERNER.read().unwrap().get(key) {
*str
} else {
let str: &'static str = // format and leak the value
INTERNER.write().unwrap().insert(key, str);
str
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I see. It went over my head what you meant by "not using it like a RwLock
".
Closing in favor of #7762 |
# Objective First of all, this PR took heavy inspiration from #7760 and #5715. It intends to also fix #5569, but with a slightly different approach. This also fixes #9335 by reexporting `DynEq`. ## Solution The advantage of this API is that we can intern a value without allocating for zero-sized-types and for enum variants that have no fields. This PR does this automatically in the `SystemSet` and `ScheduleLabel` derive macros for unit structs and fieldless enum variants. So this should cover many internal and external use cases of `SystemSet` and `ScheduleLabel`. In these optimal use cases, no memory will be allocated. - The interning returns a `Interned<dyn SystemSet>`, which is just a wrapper around a `&'static dyn SystemSet`. - `Hash` and `Eq` are implemented in terms of the pointer value of the reference, similar to my first approach of anonymous system sets in #7676. - Therefore, `Interned<T>` does not implement `Borrow<T>`, only `Deref`. - The debug output of `Interned<T>` is the same as the interned value. Edit: - `AppLabel` is now also interned and the old `derive_label`/`define_label` macros were replaced with the new interning implementation. - Anonymous set ids are reused for different `Schedule`s, reducing the amount of leaked memory. ### Pros - `InternedSystemSet` and `InternedScheduleLabel` behave very similar to the current `BoxedSystemSet` and `BoxedScheduleLabel`, but can be copied without an allocation. - Many use cases don't allocate at all. - Very fast lookups and comparisons when using `InternedSystemSet` and `InternedScheduleLabel`. - The `intern` module might be usable in other areas. - `Interned{ScheduleLabel, SystemSet, AppLabel}` does implement `{ScheduleLabel, SystemSet, AppLabel}`, increasing ergonomics. ### Cons - Implementors of `SystemSet` and `ScheduleLabel` still need to implement `Hash` and `Eq` (and `Clone`) for it to work. ## Changelog ### Added - Added `intern` module to `bevy_utils`. - Added reexports of `DynEq` to `bevy_ecs` and `bevy_app`. ### Changed - Replaced `BoxedSystemSet` and `BoxedScheduleLabel` with `InternedSystemSet` and `InternedScheduleLabel`. - Replaced `impl AsRef<dyn ScheduleLabel>` with `impl ScheduleLabel`. - Replaced `AppLabelId` with `InternedAppLabel`. - Changed `AppLabel` to use `Debug` for error messages. - Changed `AppLabel` to use interning. - Changed `define_label`/`derive_label` to use interning. - Replaced `define_boxed_label`/`derive_boxed_label` with `define_label`/`derive_label`. - Changed anonymous set ids to be only unique inside a schedule, not globally. - Made interned label types implement their label trait. ### Removed - Removed `define_boxed_label` and `derive_boxed_label`. ## Migration guide - Replace `BoxedScheduleLabel` and `Box<dyn ScheduleLabel>` with `InternedScheduleLabel` or `Interned<dyn ScheduleLabel>`. - Replace `BoxedSystemSet` and `Box<dyn SystemSet>` with `InternedSystemSet` or `Interned<dyn SystemSet>`. - Replace `AppLabelId` with `InternedAppLabel` or `Interned<dyn AppLabel>`. - Types manually implementing `ScheduleLabel`, `AppLabel` or `SystemSet` need to implement: - `dyn_hash` directly instead of implementing `DynHash` - `as_dyn_eq` - Pass labels to `World::try_schedule_scope`, `World::schedule_scope`, `World::try_run_schedule`. `World::run_schedule`, `Schedules::remove`, `Schedules::remove_entry`, `Schedules::contains`, `Schedules::get` and `Schedules::get_mut` by value instead of by reference. --------- Co-authored-by: Joseph <[email protected]> Co-authored-by: Carter Anderson <[email protected]>
# Objective First of all, this PR took heavy inspiration from bevyengine#7760 and bevyengine#5715. It intends to also fix bevyengine#5569, but with a slightly different approach. This also fixes bevyengine#9335 by reexporting `DynEq`. ## Solution The advantage of this API is that we can intern a value without allocating for zero-sized-types and for enum variants that have no fields. This PR does this automatically in the `SystemSet` and `ScheduleLabel` derive macros for unit structs and fieldless enum variants. So this should cover many internal and external use cases of `SystemSet` and `ScheduleLabel`. In these optimal use cases, no memory will be allocated. - The interning returns a `Interned<dyn SystemSet>`, which is just a wrapper around a `&'static dyn SystemSet`. - `Hash` and `Eq` are implemented in terms of the pointer value of the reference, similar to my first approach of anonymous system sets in bevyengine#7676. - Therefore, `Interned<T>` does not implement `Borrow<T>`, only `Deref`. - The debug output of `Interned<T>` is the same as the interned value. Edit: - `AppLabel` is now also interned and the old `derive_label`/`define_label` macros were replaced with the new interning implementation. - Anonymous set ids are reused for different `Schedule`s, reducing the amount of leaked memory. ### Pros - `InternedSystemSet` and `InternedScheduleLabel` behave very similar to the current `BoxedSystemSet` and `BoxedScheduleLabel`, but can be copied without an allocation. - Many use cases don't allocate at all. - Very fast lookups and comparisons when using `InternedSystemSet` and `InternedScheduleLabel`. - The `intern` module might be usable in other areas. - `Interned{ScheduleLabel, SystemSet, AppLabel}` does implement `{ScheduleLabel, SystemSet, AppLabel}`, increasing ergonomics. ### Cons - Implementors of `SystemSet` and `ScheduleLabel` still need to implement `Hash` and `Eq` (and `Clone`) for it to work. ## Changelog ### Added - Added `intern` module to `bevy_utils`. - Added reexports of `DynEq` to `bevy_ecs` and `bevy_app`. ### Changed - Replaced `BoxedSystemSet` and `BoxedScheduleLabel` with `InternedSystemSet` and `InternedScheduleLabel`. - Replaced `impl AsRef<dyn ScheduleLabel>` with `impl ScheduleLabel`. - Replaced `AppLabelId` with `InternedAppLabel`. - Changed `AppLabel` to use `Debug` for error messages. - Changed `AppLabel` to use interning. - Changed `define_label`/`derive_label` to use interning. - Replaced `define_boxed_label`/`derive_boxed_label` with `define_label`/`derive_label`. - Changed anonymous set ids to be only unique inside a schedule, not globally. - Made interned label types implement their label trait. ### Removed - Removed `define_boxed_label` and `derive_boxed_label`. ## Migration guide - Replace `BoxedScheduleLabel` and `Box<dyn ScheduleLabel>` with `InternedScheduleLabel` or `Interned<dyn ScheduleLabel>`. - Replace `BoxedSystemSet` and `Box<dyn SystemSet>` with `InternedSystemSet` or `Interned<dyn SystemSet>`. - Replace `AppLabelId` with `InternedAppLabel` or `Interned<dyn AppLabel>`. - Types manually implementing `ScheduleLabel`, `AppLabel` or `SystemSet` need to implement: - `dyn_hash` directly instead of implementing `DynHash` - `as_dyn_eq` - Pass labels to `World::try_schedule_scope`, `World::schedule_scope`, `World::try_run_schedule`. `World::run_schedule`, `Schedules::remove`, `Schedules::remove_entry`, `Schedules::contains`, `Schedules::get` and `Schedules::get_mut` by value instead of by reference. --------- Co-authored-by: Joseph <[email protected]> Co-authored-by: Carter Anderson <[email protected]>
# Objective First of all, this PR took heavy inspiration from bevyengine#7760 and bevyengine#5715. It intends to also fix bevyengine#5569, but with a slightly different approach. This also fixes bevyengine#9335 by reexporting `DynEq`. ## Solution The advantage of this API is that we can intern a value without allocating for zero-sized-types and for enum variants that have no fields. This PR does this automatically in the `SystemSet` and `ScheduleLabel` derive macros for unit structs and fieldless enum variants. So this should cover many internal and external use cases of `SystemSet` and `ScheduleLabel`. In these optimal use cases, no memory will be allocated. - The interning returns a `Interned<dyn SystemSet>`, which is just a wrapper around a `&'static dyn SystemSet`. - `Hash` and `Eq` are implemented in terms of the pointer value of the reference, similar to my first approach of anonymous system sets in bevyengine#7676. - Therefore, `Interned<T>` does not implement `Borrow<T>`, only `Deref`. - The debug output of `Interned<T>` is the same as the interned value. Edit: - `AppLabel` is now also interned and the old `derive_label`/`define_label` macros were replaced with the new interning implementation. - Anonymous set ids are reused for different `Schedule`s, reducing the amount of leaked memory. ### Pros - `InternedSystemSet` and `InternedScheduleLabel` behave very similar to the current `BoxedSystemSet` and `BoxedScheduleLabel`, but can be copied without an allocation. - Many use cases don't allocate at all. - Very fast lookups and comparisons when using `InternedSystemSet` and `InternedScheduleLabel`. - The `intern` module might be usable in other areas. - `Interned{ScheduleLabel, SystemSet, AppLabel}` does implement `{ScheduleLabel, SystemSet, AppLabel}`, increasing ergonomics. ### Cons - Implementors of `SystemSet` and `ScheduleLabel` still need to implement `Hash` and `Eq` (and `Clone`) for it to work. ## Changelog ### Added - Added `intern` module to `bevy_utils`. - Added reexports of `DynEq` to `bevy_ecs` and `bevy_app`. ### Changed - Replaced `BoxedSystemSet` and `BoxedScheduleLabel` with `InternedSystemSet` and `InternedScheduleLabel`. - Replaced `impl AsRef<dyn ScheduleLabel>` with `impl ScheduleLabel`. - Replaced `AppLabelId` with `InternedAppLabel`. - Changed `AppLabel` to use `Debug` for error messages. - Changed `AppLabel` to use interning. - Changed `define_label`/`derive_label` to use interning. - Replaced `define_boxed_label`/`derive_boxed_label` with `define_label`/`derive_label`. - Changed anonymous set ids to be only unique inside a schedule, not globally. - Made interned label types implement their label trait. ### Removed - Removed `define_boxed_label` and `derive_boxed_label`. ## Migration guide - Replace `BoxedScheduleLabel` and `Box<dyn ScheduleLabel>` with `InternedScheduleLabel` or `Interned<dyn ScheduleLabel>`. - Replace `BoxedSystemSet` and `Box<dyn SystemSet>` with `InternedSystemSet` or `Interned<dyn SystemSet>`. - Replace `AppLabelId` with `InternedAppLabel` or `Interned<dyn AppLabel>`. - Types manually implementing `ScheduleLabel`, `AppLabel` or `SystemSet` need to implement: - `dyn_hash` directly instead of implementing `DynHash` - `as_dyn_eq` - Pass labels to `World::try_schedule_scope`, `World::schedule_scope`, `World::try_run_schedule`. `World::run_schedule`, `Schedules::remove`, `Schedules::remove_entry`, `Schedules::contains`, `Schedules::get` and `Schedules::get_mut` by value instead of by reference. --------- Co-authored-by: Joseph <[email protected]> Co-authored-by: Carter Anderson <[email protected]>
Objective
Box<dyn MyLabelTrait>
labels aren't too great. Schedules could build much faster if we didn't allocate from the heap for every instance of every label. (Probably. I need to benchmark. TODO.)I believe an improved label type would have these properties:
Copy
.Enum::VariantA
andEnum::VariantB
should be different labels).Debug
string as the original label.In the past, we switched away from
Box<dyn MyLabelTrait>
but had to revert in 0.10 to haveOnEnter(state)
andOnExit(state)
be distinct labels (3), which the label implementation at the time (#4957) did not support.This is a simpler, "one-size-fits-all" approach than #5715. I think #7762 might be the preferable alternative though. It's very similar but has some internal differences, including special-casing in its macros for unit structs and fieldless enums.
Solution
Debug
string of every label just once (on first use).struct MyLabelTraitId(&'static str)
just wraps the reference to the interned string.Debug
forMyLabelTraitId
to print the string.(TypeId, hash)
tuple for uniqueness instead ofTypeId
so that different values become different labels.