-
Notifications
You must be signed in to change notification settings - Fork 718
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[spirv] Fix bug of CTBuffer DX memory layout with matrix #3672
Conversation
✅ Build DirectXShaderCompiler 1.0.4760 completed (commit f9560a9200 by @jaebaek) |
This is terrifying 😨 |
❌ Build DirectXShaderCompiler 1.0.4786 failed (commit 7247ed3234 by @jaebaek) |
❌ Build DirectXShaderCompiler 1.0.4792 failed (commit e2ff96cd17 by @jaebaek) |
I want to update the unit tests but it is almost done. The clone variable has the same type and memory layout with the one the current DXC generates for CTBuffer. I create a module initialization function to copy the CTBuffer with the correct memory layout to the clone variable and add Mostly this copy will be optimized out by spirv-opt. |
❌ Build DirectXShaderCompiler 1.0.1 failed (commit eaeb02ad29 by @jaebaek) |
❌ Build DirectXShaderCompiler 1.0.2 failed (commit 8406085935 by @jaebaek) |
❌ Build DirectXShaderCompiler 1.0.4 failed (commit 2970cadb3d by @jaebaek) |
❌ Build DirectXShaderCompiler 1.0.5 failed (commit 1165681c55 by @jaebaek) |
❌ Build DirectXShaderCompiler 1.0.7 failed (commit 993eabf668 by @jaebaek) |
❌ Build DirectXShaderCompiler 1.0.9 failed (commit 1011c00644 by @jaebaek) |
✅ Build DirectXShaderCompiler 1.0.10 completed (commit 1011c00644 by @jaebaek) |
This will fix #3463 |
✅ Build DirectXShaderCompiler 1.0.11 completed (commit 0f23667e57 by @jaebaek) |
✅ Build DirectXShaderCompiler 1.0.15 completed (commit 46f1c5eefc by @jaebaek) |
✅ Build DirectXShaderCompiler 1.0.17 completed (commit e138a15065 by @jaebaek) |
✅ Build DirectXShaderCompiler 1.0.18 completed (commit 7a86a0adc6 by @jaebaek) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for your code review.
✅ Build DirectXShaderCompiler 1.0.29 completed (commit ff073e97ea by @jaebaek) |
✅ Build DirectXShaderCompiler 1.0.59 completed (commit 0faab2c3ee by @jaebaek) |
✅ Build DirectXShaderCompiler 1.0.60 completed (commit 8ff663d06d by @jaebaek) |
When the requested constant buffer layout is DX, some type lowering can become complex (Nx1 matrices for ex). To simplify the lowering, the backend "clones" those variables (See microsoft#3672) - on one end, we expose the correct layout - but then, we copy this to a local variable to have a different layout compatible with the operations we usually do on such types. This means the type can sometimes be an HybridStructType, which is not completely lowered, or a normal lowerer SPIR-V type. Both should be allowed, but some codepaths had to be updated to reflect this. Fixes microsoft#6511 Signed-off-by: Nathan Gauër <[email protected]>
When a CTBuffer contains a matrix and we use FXC memory layout for it i.e.,
-fvk-use-dx-layout
, the memory layout of the generated struct is differentfrom what FXC generates. FXC memory layout rule for matrices or array of
matrices are:
floatMxN
meansN
float vectors and each vector hasM
elements.floatMxN[K]
means an array offloatMxN
withK
elements.The non-trivial case is
float1xN
which is a matrix withN
vectors and each vector has 1 element.Its size should be
16 * (N - 1) + 4
based on the FXC memory layout rule.For example, the size of
float1x2
must be 20 in bytes, which means we want to put the first float value offloat1x2
at the offset 0 in bytes and the second float value at the offset 16 in bytes.It means we must not generate it as a SPIR-V vector type because setting it as a SPIR-V vector results in
putting the first at the offset 0 in bytes and the second at the offset 4 in bytes.
In addition, we cannot set it as a SPIR-V matrix type because SPIR-V does not allow a matrix with a single
row and a vector with a single element.
The only available option is to set it as a SPIR-V array with
ArrayStride 16
.Since we currently consider
float1xN
as anOpTypeVector
and generate all SPIR-V code based on the assumption.Changing the type of
float1xN
toOpTypeArray
needs huge engineering costs to handle all the cases.For example, in many places e.g., addition, subtraction, multiplication, we use
OpVectorShuffle
forfloat1xN
because we consider it asOpTypeArray
.Our solution is to create two variables for CTBuffer including
type1xN
with FXC memory layout:OpTypeArray
fortype1xN
OpTypeVector
fortype1xN
as the current DXC does.The Original variable is in charge of getting CTBuffer data from CPU.
We create a module initialization function to copy the Original variable to the Clone variable.
We insert
OpFunctionCall
for the module initialization function into all entry points.We use the Clone variable for the CTBuffer in all places.