-
-
Notifications
You must be signed in to change notification settings - Fork 3.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Animatable trait for interpolation and blending #4482
Conversation
Co-authored-by: Kirillov Kirill <[email protected]>
Co-authored-by: François <[email protected]>
Use slerp for Quat interpolation
if input.additive { | ||
translation += input.weight * Vec3A::from(input.value.translation); | ||
scale += input.weight * Vec3A::from(input.value.scale); | ||
rotation = rotation.slerp(input.value.rotation, input.weight); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rotation = rotation.slerp(input.value.rotation, input.weight); | |
// Important: additively animating quaternions means we first have to partially apply the input rotation. | |
// Quaternion multiplication order is important here. Additionally, do not use nlerp here. These rotations | |
// can be very different, so we must use slerp explicitly. | |
// Reference: | |
// "Additive blend" section in <https://animcoding.com/post/animation-tech-intro-part-3-blending> | |
// "Here’s how to apply an additive pose on top of the regular pose. It differs from the regular blend. | |
// If the weight is one, then we can multiply transforms. If the weight is non-one, then we need to first blend | |
// between identity and the additive transform, then apply this blended transform on the transform in the pose." | |
rotation = rotation * Quat::IDENTITY.slerp(input.value.rotation, input.weight); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the critical fix right? We should add a comment describing why this is done, ideally with a reference.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes this is the critical fix. Everything else in my review is 'fluff'. added a comment
input.weight, | ||
); | ||
scale = Vec3A::interpolate(&scale, &Vec3A::from(input.value.scale), input.weight); | ||
rotation = Quat::interpolate(&rotation, &input.value.rotation, input.weight); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: can be more explicit about the kind of interpolation
rotation = rotation.slerp(&input.value.rotation, input.weight);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nlerp is wanted here for commutativity on the blend stack
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does that mean? What's supposed to be commutative? What's the blend stack? Blending operations are not commutative in general, you need to apply them in order. And even if they are, what do we gain by reordering?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the blend stack is the sequence of animations being blended. commutativity makes reasoning about the operations simpler in the general case, as you dont need to have logic to ensure they are always ordered the same way for the result to be consistent. consider two animations A and B, which can each start and stop at any moment. if you start A, then B, you will have A applied first then B applied on top. if you start B then A, they will be on the stack the other way around and will look different. This gets more complicated if you can cancel and restart, or pause animations. You start needing an animation priority system to determine which get applied first, and that can be confusing. Besides all that, its slower for not much gain quality-wise, as far as i understand. Slerp is not needed here, refer to the document linked in the comments above interpolate
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if you start A, then B, you will have A applied first then B applied on top.
Only if B is additive, because otherwise you should stop A first, you can't apply two non-additive animations to the same target. Also pause etc. is about animation transitions and creating the stack, not about blending (how to consume the stack). And if B is additive then I argue that the operations are not commutative. It's not a question of quality, it's a question of producing the correct result.
fn interpolate(a: &Self, b: &Self, t: f32) -> Self { | ||
Self { | ||
translation: Vec3::interpolate(&a.translation, &b.translation, t), | ||
rotation: Quat::interpolate(&a.rotation, &b.rotation, t), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: can be more explicit about the kind of interpolation
this definitely needs to explicitly be slerp
rotation: Quat::interpolate(&a.rotation, &b.rotation, t), | |
rotation: a.rotation.slerp(&b.rotation, t), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because interpolate should be nlerp by default and transforms can be very different rotations and thus need slerp.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah you mean that for rotations the interpolation should be fixed to always be slerp because others will not provide decent results in general, right? I agree that we should probably limit the number of different interpolations we support, explicitly. See my other long comment about the design.
} | ||
|
||
impl Animatable for Quat { | ||
/// Performs an nlerp, because it's cheaper and easier to combine with other animations, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These docs are dissonant. The code is not doing an nlerp
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be doing an nlerp
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Your other comment says it should be doing a slerp() no? Because nlerp() for very different rotations might be way off.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've gone over the usages of interpolate and slerp again after reading more. In general we want to use nlerp in most places, except for additive blending and transform interpolation. The rest we prefer nlerp.
I don't have much to add to the review other than what @rodolphito said, looks good otherwise. |
pub trait Animatable: Reflect + Sized + Send + Sync + 'static { | ||
/// Interpolates between `a` and `b` with a interpolation factor of `time`. | ||
/// | ||
/// The `time` parameter here may not be clamped to the range `[0.0, 1.0]`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does this comment mean? How is an implementor supposed to understand interpolating with t=1.5
(which sounds like an extrapolation rather)? Are we sure this is even possible in the general case? Because this seems to strongly imply lerp
here / a linear transform.
There's a bigger problem with switching to nlerp i've discovered: Glam's nlerp is not commutative, nor can it easily be made commutative. This is because it is normalizing after interpolating, which makes sense for blending two quaternions, but it also means that its no longer a truly linear operation. To commutatively blend an arbitrary number of quaternions, one would need to defer the normalization til the very end. This is problematic however because the intermediary values are not true rotation quaternions because they are not unit length. We'd want to encode this in the type system somehow. I think it can be done with a PR to glam, but it might be contentious as it adds complexity and would need some extra implementations. There would need to be a For this reason I am opting to leave slerp in for the moment, because this would need further discussion and upstream work in glam. |
I've realized there actually is a simpler way forward that can be implemented now without changes to glam. |
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See the long comment over the abstraction shape. We can discuss on Discord, but this was easier I think to make a comment here with code example.
use bevy_utils::FloatOrd; | ||
|
||
/// An individual input for [`Animatable::blend`]. | ||
pub struct BlendInput<T> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've thought a bit more about this abstraction. I think it's useful to look at what our goals are, especially in terms of how things will be used, and how this is trying to achieve those goals.
As I recall, the base idea of BlendInput<T>
is to "preprocess" all the various blend sources into a single common data structure per type (one for f32
, one for Vec4
, etc.) and then run a tight loop on blending all sources for that data type, which is then easier because you iterate only on BlendInput<T>
and not a collection of heterogenous sources.
Now, this is clearly a step in the right direction, in the sense that we're thinking about the data usage pattern to optimize it and leverage e.g. SIMD. Where I think this abstraction falls short is in two ways:
-
It iterates over the blend sources in the inner loop, as seen in the signature of
Animatable::blend()
which takes a set of sources to blend together. How many different sources do we expect will blend that way? Even for an advanced scenario with say 4 base animations blending together (walk, run, jump, and crouch), and let's say 4 more additive ones on top (look left/right, up/down, and whatever else), that makes 8 sources to blend into a final vertex value. Compare that with how many vertices we need to blend for a single 3D character in a modern game, which is in the order of thousands. And those will be iterated over in an outer loop. We've got the loops reversed from a performance point of view. -
The data structure is not optimal for packing and memory accesses. First the packing: depending on
T
, and let's take anyVecN
as example, the compiler will most likely put theVecN
first, aligned on 4 bytes (or 16 bytes for SIMD), followed byweight
(4B size/align), followed byadditive
(1B). This means we have 3 bytes of padding for every 12B (f32
) to 24B (Vec4
) of total size, which is a lot. The access pattern, which is also to load the value and weight into separate (SIMD) registers to perform somelerp()
(which is the most common case; let's ignore rotations for a minute) is also not helped here because we can't load more than 1 source at once (SIMD register load only accepts a continuous block of memory). If, on the other hand, we were transforming this "array of struct" into a "struct of array", we could load more data sources at once, and blend multiple of them via SIMD. Better even, we can even pack alllerp()
sources together (that is, including packingf32
withVec3
andVec4
) since the SIMD instructions to lerp are the same and operate component-wise. This means a very tight loop on a long array of data without any branching etc. which is the absolute optimal for SIMD.
So the way I'd see a better abtraction for the blending step is:
- explicitly support lerp / nlerp / slerp with separate optimized codepaths, because they require different SIMD instructions;
- possibly add a fallback where we invoke something like the current
Animatable::interpolate()
, but only for "exotic" interpolations, and I'm not even sure we want that (I don't have an example); - pack all inputs and do
lerp(lhs[], rhs[], weights[])
instead oflerp(lhs,rhs,weight)[]
.
pub enum Interpolation<T> {
Linear,
NormalizedLinear,
SphericalLinear,
// See how this immediately become messy due to T, maybe we can skip this entirely
Custom(Box<dyn Fn(&T, &T, f32) -> T),
}
impl Interpolation<f32> {
// Blend two arrays of values with given weights
fn blend(&self, src: &mut [f32], rhs: &[f32], weights: &[f32], additive: bool) {
match *self {
Interpolation::<f32>::Linear => {
// Branch _before_ we start the hot loop
if additive {
// This loop can get partially unrolled, use SIMD, etc.
for i in 0..src.len() {
src[i] += rhs[i] * weights[i];
}
} else {
// Same here
for i in 0..src.len() {
src[i] = (1.0 - weights[i]) * src[i] + rhs[i] * weights[i];
}
}
}
// etc.
}
}
}
// Now, we need to know how to pack e.g. Vec3 and f32 and Vec4 together:
pub trait Animatable: [...] {
type Unit;
fn prepare(&self) -> &[Unit];
}
impl Animatable for Vec3 {
type Unit = f32;
fn prepare(&self) -> &[f32] { self.as_ref() }
}
// etc. for other types...
fn animation_system(anims: &[...]) {
// Read and prepare values from component (here, Transform)
let mut src = vec![];
for tr in &transforms {
// Assuming here the case where both translation and scale use Interpolation::Linear
// to keep the example short:
src.extend_from_slice(tr.translation.prepare());
src.extend_from_slice(tr.scale.prepare());
}
// Blend all anims into src directly
let mut rhs = vec![];
for anim in &anims {
// Grab one animation. Ideally we can even pre-process this in the anim format,
// and completely remove this block.
rhs.clear();
for tr in &anim.transforms {
rhs.extend_from_slice(tr.translation.prepare());
rhs.extend_from_slice(tr.scale.prepare());
}
// Do the actual blend for every transform at once
Interpolation::<f32>::blend(&mut src, &rhs, &anim.weights, false);
}
// Store back final result
let mut offset = 0;
for tr in &mut transforms {
tr.translation.copy_from_slice(&src[offset..offset+3]);
offset += 3;
tr.scale.copy_from_slice(&src[offset..offset+3]);
offset += 3;
}
}
fn interpolate(a: &Self, b: &Self, t: f32) -> Self { | ||
Self { | ||
translation: Vec3::interpolate(&a.translation, &b.translation, t), | ||
rotation: Quat::interpolate(&a.rotation, &b.rotation, t), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah you mean that for rotations the interpolation should be fixed to always be slerp because others will not provide decent results in general, right? I agree that we should probably limit the number of different interpolations we support, explicitly. See my other long comment about the design.
} | ||
|
||
impl Animatable for Quat { | ||
/// Performs an nlerp, because it's cheaper and easier to combine with other animations, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Your other comment says it should be doing a slerp() no? Because nlerp() for very different rotations might be way off.
@rodolphito I've been told that what this PR does actually matches what Unity does. In Unity override layers cancel out the results of previous additive layers. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Discussed a lot on Discord. There's relative consensus with @pcwalton and @james7132 that this is good enough for now, so let's move forward.
Some resources:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm content with this for now. Like djee said, there's been a lot of discussion and we have broad consensus. Ongoing improvements will be much easier to do in follow-ups :)
For posterity, the discussion that led to this being merged can be found here: https://discord.com/channels/691052431525675048/774027865020039209/1202358388466384956 |
# Objective Allow animation of types other than translation, scale, and rotation on `Transforms`. ## Solution Add a base trait for all values that can be animated by the animation system. This provides the basic operations for sampling and blending animation values for more than just translation, rotation, and scale. This implements part of bevyengine/rfcs#51, but is missing the implementations for `Range<T>` and `Color`. This also does not fully integrate with the existing `AnimationPlayer` yet, just setting up the trait. --------- Co-authored-by: Kirillov Kirill <[email protected]> Co-authored-by: François <[email protected]> Co-authored-by: irate <[email protected]> Co-authored-by: Alice Cecile <[email protected]> Co-authored-by: Alice Cecile <[email protected]>
Objective
Allow animation of types other than translation, scale, and rotation on
Transforms
.Solution
Add a base trait for all values that can be animated by the animation system. This provides the basic operations for sampling and blending animation values for more than just translation, rotation, and scale.
This implements part of bevyengine/rfcs#51, but is missing the implementations for
Range<T>
andColor
. This also does not fully integrate with the existingAnimationPlayer
yet, just setting up the trait.