-
Notifications
You must be signed in to change notification settings - Fork 143
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Let's add images #293
Let's add images #293
Conversation
Atlas offset and image size were originally stored in the ptcl but are not tile dependent. Moving these to info saves 8 bytes per image tile.
@@ -97,6 +101,11 @@ fn cardioid_and_friends(sb: &mut SceneBuilder, _: &mut SceneParams) { | |||
} | |||
|
|||
fn animated_text(sb: &mut SceneBuilder, params: &mut SceneParams) { | |||
let piet_logo = params | |||
.images | |||
.from_bytes(PIET_LOGO_IMAGE.as_ptr() as usize, PIET_LOGO_IMAGE) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From a strict provenance perspective, this should use pointer::addr (whereas as usize
exposes the address, AIUI). That however is unstable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is just my expedient hacky solution to support image loading for examples. Being an example though, it probably does deserve a “don’t do this in real code!” comment.
Since the address is never converted back into a pointer, I don’t think provenance is an issue. Size likely would be on CHERI architectures though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, to be clear, there's no soundness issue here.
This is just an observation that the semantics of pointer to integer casts are messy. The way I think this works is that the use of as usize
"tells" the compiler that you might do a corresponding as *const _
, even though there isn't. .addr
doesn't have that "side effect", but is unstable.
This is not at all important, I just recognised this as an opportunity to brush up my understanding of strict aliasing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair enough! Pointer to int casts are always worth calling out in code review just to have some discussion logged for future reference.
use vello::peniko::*; | ||
use vello::*; | ||
|
||
const PIET_LOGO_IMAGE: &[u8] = include_bytes!("../../assets/piet-logo.png"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not so sure about using the Piet logo here. Ideally, we'd have a Vello logo, or a Linebender logo.
I don't have a better suggestion for what we can use.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Linebender or vello logos would be great! This choice allowed me to sidestep licensing concerns and, as an additional benefit, has transparency which exposed a filtering bug.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the licensing situation for the druid logo from the screenshot you posted on xulip chat? I'm guessing Druid is a bit more relevant than Piet.
I'm supportive of a weekend art project for a vello logo :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume we're free to use the druid logo as well but I thought piet might be slightly more closely related to vello given the lineage. For now, the priority is just to have some example that displays an image.
I can provide colorful... commentary on a logo project :) I don't think my artistic talents are up to the task!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall looks good. One thing that got me thinking is whether the ImageCache
should be part of the encoding module or if it should be a separate utility (perhaps part of the scene API). It sounds generally useful and how the atlas is packed is an intrinsic part of the encoding.
When/if we move the encoding module into its own crate, should ImageCache
be provided by that or should we try to provide a leaner API? I'm not strongly opinionated either way but I am thinking about how to keep the native (C/C++) bridging for clients simple and the library lightweight.
use vello::peniko::*; | ||
use vello::*; | ||
|
||
const PIET_LOGO_IMAGE: &[u8] = include_bytes!("../../assets/piet-logo.png"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the licensing situation for the druid logo from the screenshot you posted on xulip chat? I'm guessing Druid is a bit more relevant than Piet.
I'm supportive of a weekend art project for a vello logo :)
@@ -377,6 +384,13 @@ fn main( | |||
write_grad(CMD_RAD_GRAD, index, info_offset); | |||
} | |||
} | |||
// DRAWTAG_FILL_IMAGE | |||
case 0x248u: { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's unfortunate that naga still doesn't seem to support non-literal const-expressions as a case selector (which is supported by the spec).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had to change those values several times while playing with the encoding and missing one of these actually led to a TDR. So yeah, this is supremely annoying.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the naga side this seems to tracked by gfx-rs/naga#2106. I'm putting a mention here and I'll ping the naga issue as well as this would be a huge quality of life improvement (I've wanted this in other projects as well).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the link. This will hopefully make the cut for the next release.
// DRAWTAG_FILL_IMAGE | ||
case 0x248u: { | ||
info[di] = bitcast<u32>(linewidth); | ||
let inv_det = 1.0 / (matrx.x * matrx.w - matrx.y * matrx.z); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is perfectly fine as is but if we were to use the built-in matrix types this could possible use the determinant
intrinsic. Maybe worth exploring in the future (I also don't know if there are any potential compatibility issues with using the intrinsics over hand-rolling calculation yourself).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be great to do a pass through the code to make use of appropriate matrix types/ops where appropriate. For this PR, I'd like to avoid the extra churn though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For sure, I didn't mean for that to happen in this CL. I'm happy to do a fixup pass later for matrices seeing as I've brought it up a couple times already and you seem to like the idea :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sgtm!
Doing the final packing requires resolving the atlas layout in some way. This currently sits in the encoding module, but I can easily see breaking that one up into both |
* Add comment about naughty ptr2int cast * Change Option to anyhow::Result in test scene ImageCache * Replace magic constant with pixel stride from selected format
OK, let's discuss this on Monday. I don't think we need to block this PR on this issue but I think let's also plan to talk about what the FFI bridge should look like. Once we pull encoding into its own separate crate we can measure the code size and see if it's worth modularizing. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would say Raph should probably take a look as well and Daniel should approve following his comments but LGTM from my point of view.
I left two more comments around some unused imports. I tested the with_winit/headless examples on M1 Max and with_winit on Android and didn't run into any issues.
As you mentioned, there is a slight performance decrease that I noticed on Pixel 6 (framerate for the scene drops from ~80fps to ~70fps). I didn't see a performance drop on M1 Max.
It might be a good idea to add a test scene with many images just so we get an idea about performance when there are many atlas lookups.
// DRAWTAG_FILL_IMAGE | ||
case 0x248u: { | ||
info[di] = bitcast<u32>(linewidth); | ||
let inv_det = 1.0 / (matrx.x * matrx.w - matrx.y * matrx.z); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For sure, I didn't mean for that to happen in this CL. I'm happy to do a fixup pass later for matrices seeing as I've brought it up a couple times already and you seem to like the idea :)
examples/scenes/src/test_scenes.rs
Outdated
use crate::{ExampleScene, SceneConfig, SceneParams, SceneSet}; | ||
use vello::kurbo::{Affine, BezPath, Ellipse, PathEl, Point, Rect}; | ||
use vello::kurbo::{Affine, BezPath, Ellipse, PathEl, Point, Rect, Vec2}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the Vec2
import is unused?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed! There are a few more issues with unused symbols. They've been lingering a while so I'll do another pass to fix those soon (and also make the codebase clippy clean)
Looking forward to it! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall this looks good to me. I am not attached to scaling the image in a rasterization pipeline, though my gut feeling is that it would be higher performance in many cases. Ultimately that's a matter for very careful performance analysis, and the goal for this cycle is to get something that works. This approach also avoids the problems of really huge atlas entries when scaling way up.
Thanks, and excited to see this!
if all(atlas_uv < atlas_extents) { | ||
let uv_quad = vec4(max(floor(atlas_uv), image.atlas_offset), min(ceil(atlas_uv), atlas_extents)); | ||
let uv_frac = fract(atlas_uv); | ||
let a = premul_alpha(textureLoad(image_atlas, vec2<i32>(uv_quad.xy), 0)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm curious why the by-hand bilinear interpolation here rather than using hardware. Going over the WGSL intrinsics, I think textureSampleLevel
is the most appropriate version - unlike textureSample
it doesn't rely on derivatives, and I think it's likely to be more performant than textureSampleGrad
(which based on past piet-gpu experimentation was slow).
That could perhaps be followup work to this patch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I hadn’t intended to do filtering at all in this PR so it was sort of tacked on as an afterthought. I’ll work on hardware sampling as a followup.
pub index: u32, | ||
/// Packed image offset. | ||
pub offset: u32, | ||
/// Packed atlas coordinates. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would [u16; 2] be a better choice here? It feels more semantic, but perhaps just using u32 leaves less slop around stuff like endianness.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, it was simply a desire to keep the shape of the data the same for both CPU and GPU. Also, when we do add descriptor indexing, the xy field (maybe renamed location) can be used as the array index in that mode.
shader/fine.wgsl
Outdated
let my_xy = vec2(xy.x + f32(i), xy.y); | ||
let atlas_uv = image.matrx.xy * my_xy.x + image.matrx.zw * my_xy.y + image.xlat + image.atlas_offset; | ||
// This currently clips to the image bounds. TODO: extend modes | ||
if all(atlas_uv < atlas_extents) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My gut feeling is that performance might be improved considerably if this is also predicated on area[i] != 0.0
, though that will almost certainly depend on the hardware. These loads are already conditional, and I think it can reduce texture sample bandwidth for the tile wastage case. Perhaps it's worth doing careful experiments and landing this with the simplest case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We had discussed this and I agree that since we’re already branching, might as well add the additional check. Will do before merging.
I've been experimenting with SVG parser and realized indeed it would be better to place |
Thanks for pointing this out. For higher level integrations like SVG and Lottie that make use of the scene API, the (resolve time) image cache is essentially an internal implementation detail and users shouldn’t have to worry about it. |
This patch introduces initial support for rendering images.
The general process is as follows:
Queue::write_texture
to blit all images to their associated locations.draw_leaf
computes an inverse transform, same as radial gradients (7 words of draw info).coarse
writes the image parameters into the ptcl (4 words).fine
reads the image parameters, converts coords from screen to atlas space, clips, loads 4 pixels and performs bilinear filtering.I haven't done any performance measurement, but the fine stage hurts a bit. This can be simplified considerably for images where the upper 2x2 matrix is identity and the translation is integral.
The alternative is to follow the original plan in #176 which calls for rasterizing pre-transformed images into the atlas. This makes fine cheaper at the cost of significantly more complexity at the setup stage and less efficient atlas allocation.
On the todo list: more robustness, retained caching, extend modes, ...