-
Notifications
You must be signed in to change notification settings - Fork 193
Conversation
819e485
to
b0623af
Compare
src/back/spv/writer.rs
Outdated
@@ -439,6 +441,44 @@ impl Writer { | |||
results: Vec::new(), | |||
}; | |||
|
|||
for (handle, expression) in ir_function.expressions.iter() { | |||
match expression { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's match *expression
instead, like in all of the other places
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done!
src/back/spv/writer.rs
Outdated
@@ -439,6 +441,44 @@ impl Writer { | |||
results: Vec::new(), | |||
}; | |||
|
|||
for (handle, expression) in ir_function.expressions.iter() { | |||
match expression { | |||
&crate::Expression::Compose { ty, .. } => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As discussed on the Matrix, this code is overly generic.
The problem we need to solve is about dynamic indexing of arrays that are given by values (i.e. expressions or constants).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I have narrowed it down at this point. Thank you for identifying this so quickly!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm afraid this needs another major iteration. I expressed some ideas below.
src/back/spv/writer.rs
Outdated
function.local_const_arrays.insert(handle, id); | ||
} | ||
} | ||
crate::Expression::Constant(h) => match ir_module.constants[h].inner { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We probably want to check that h
isn't already registered. I don't think there is a rule in the IR that prevents multiple Expression::Constant
pointing to the same thing.
src/back/spv/writer.rs
Outdated
block | ||
.body | ||
.push(Instruction::store(var_id, composite_id, None)); | ||
var_id |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we can return var_id
here. The type of the returned SPIR-V is still expected to be a value, not a pointer. It seems to me that we should always return composite_id
from this block?
@@ -1375,7 +1450,20 @@ impl Writer { | |||
} | |||
} | |||
crate::Expression::GlobalVariable(handle) => self.global_variables[handle.index()].id, | |||
crate::Expression::Constant(handle) => self.constant_ids[handle.index()], | |||
crate::Expression::Constant(handle) => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in the near future, when we know how each constant is used by each function - #656 - we'll be able to do this at the beginning of the function only. Maybe we can leave a small TODO pointing to this issue?
src/back/spv/writer.rs
Outdated
block | ||
.body | ||
.push(Instruction::store(var_id, composite_id, None)); | ||
var_id |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
similarly to the other case, I don't think we can return var_id
here, since it has a different type
src/back/spv/writer.rs
Outdated
let id = self.id_gen.next(); | ||
let var_id = match ir_function.expressions[base] { | ||
crate::Expression::Constant(..) => self.cached[base], | ||
crate::Expression::Compose { .. } => self.cached[base], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately, this is incomplete: there other ways to get an array-typed expression:
- a
Expression::FunctionArgument
that happens to be an array - a
Expression::Load
on a pointer to an array - an
Expression::Access*
on a structure (or other composite) that contains an array
There needs to be a more general approach here...
Here is a rough idea. Let's have a generic mechanism here for promoting expressions to variables. If an expression is promoted, then the Emit
of it becomes an OpStore
, and any use of it becomes an OpLoad
, other than the Access
/AccessIndex
uses, which do OpLoad
after issuing OpAccessChain
.
We could build a FastHashMap<Handle<crate::Expression>, Word>
for storing the variable ids, similar to what the PR does.
I've been making some headway on this, and noticed something: The WGSL front-end always generates an |
That's a great question! I believe we want to generate |
I have several test cases working now, including const arrays of arrays, but a problem is that I haven't found an opportunity to produce a single multi-step With non-const vars, With non-const variables, the spv back-end caches an id of It's an interesting problem! It took me a while to figure out that coupling between the front- and back-ends with the |
Thank you for digging into this! It's a complex topic, and I'm happy that you aren't giving up easily 👍
Are you concerned about having to load the whole
Could you clarify what is
Did you consider a path where we just promote any expression (containing a fixed-size array) into a variable?
I'm not sure what extra help or information the frontends could provide in this case. |
Yes, it's an efficiency concern. The sour taste is because the current approach loads all of I'm sorry that I wasn't clear enough with my last message. The issue with building an The challenge with generating a multi-level For the That works out because the wgsl front-end produces an It is that To be clear, there is nothing wrong with how things are currently handled; it nicely sidesteps the challenge of the writer visiting expressions inside-out: that is, the writer sees the innermost Having the WGSL front-end produce a One approach would be if we had a way to determine that the expressions |
Right, thank you for explaining this is great detail!
Please not that this isn't just WGSL frontend. The same goes for SPV-in.
Is there much to be saved here though, with a sufficiently smart compiler? My understanding is that we'd still have to load If that's the case, let's not worry about this right now. We can keep an issue about possible overhead of constant arrays, when nested into other structures by value, and that's it. |
@acowley any chance we can resume this and land? |
Loading an element from a const array requires that memory be allocated for that array. If a function makes use of a local or global const array, we allocate a local variable into which we store the array. Array accesses then load from that variable.
First: apologies that this is a larger PR than you probably want. I think this now hits all of the complaints I had before about spurious allocations, but it needs more testing. It can do things like generate an |
@acowley I am indeed very concerned about the complexity of this PR. More than 400 LOC of pure code to handle (or work around) the cases of constant arrays indexed dynamically. I'm sure we can approach this much simpler. Let me try to explain. We just need two things:
For the performance considerations, the important thing to keep in mind that value arrays never appear out of nowhere. They need to be either constructed, or loaded from somewhere (e.g. a uniform buffer). In both cases, there is a copy of the data involved. Therefore, promoting it to a variable, and doing an extra In practice, Vulkan drivers all run the local variable optimization pass, where they figure out which variables are not needed. So all this "promotion to a variable" will likely be resolved and eliminated by the driver, anyway. We should put more focus on clean and simple code as opposed to trying to design an elaborate minimum-overhead solution for this. |
Okay. I'd have preferred to simplify the traversal logic which is why I moved the PR status back to ETA: I took out the optimization which is why the old snapshots now fail. |
I moved the Access-optimized version elsewhere as something closer to where I wanted this code to go. The bulk of the lines related to the optimization is in a couple traversal helpers on Naga IR that don't have anything to do with Access or SPIR-V. I understand that we're really not on the same page regarding this feature, though, so I'm fine with closing this PR with no hard feelings to let you get on with it the way you'd like to see it done! |
I didn't realize this. Will review this PR and try to figure out what is related and what not. |
I think it’s still more than you’re looking for. The diff I linked deals with finding roots of access chains, producing I support however you want to iterate on const array support, but, as a user, I’d be a bit disappointed if using That the generated SPIR-V is noisier without some optimization is another minor consideration, and I understand your point that a driver is highly likely to optimize away the noise. I appreciate that the alternative of producing a |
I'm going to try implementing it the way I described (unless you want to try this out?) and see how it fares in terms of code complexity and produced code quality. We can also run it through tools like Radeon GPU Analyser, and perhaps it will show the actual machine code after optimization. |
Here's a simplified commit. It leaves intermediate |
@acowley thank you! So do I understand correctly that the logic of traversing the access chain would only help for arrays that are nested? I took a stab at a naive implementation in #723, and I'm tempted to land it as a quick and dirty solution (doesn't require any extra state, well localized). It comes with a simple test, which I suspect is producing similar code to what this PR does (since the array is non-nested). |
Yes, I checked against a few test cases, and for single array accesses the generated spv is essentially the same modulo the friendly name generation that I cribbed from elsewhere in the writer. Nested accesses of course look different in the spv, but I didn’t try to look at any disassembly from the driver. |
Fixes #635
Loading an element from a const array requires that memory be allocated for that array. If a function makes use of a local or global const array, we allocate a local variable into which we store the array. Array accesses then load from that variable.