Assertion error from linear layouts #4727
I am running into an assertion error in the codegen for `local_load`, which is coming from the linear layouts code. Here is a minified reproducer. When lowering to LLVM IR, it fails with the following error.

cc @Jokeren @jlebar
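If it helps with triage: one way to see the TritonGPU IR that reaches the failing lowering is Triton's `MLIR_ENABLE_DUMP` debug switch. The sketch below is an assumption about the usual debugging workflow, not something taken from the original report.

```python
# Sketch (assumes Triton's MLIR_ENABLE_DUMP debug variable is available in this build):
# dumping the IR before every MLIR pass means the last TritonGPU module printed before
# the assertion shows the offending local_load. Must be set before the kernel compiles.
import os

os.environ["MLIR_ENABLE_DUMP"] = "1"
# ...then run the reproducer; the per-pass IR dumps go to stderr.
```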
Comments

Just to confirm, the TritonGPU IR is generated from valid Triton Python code?
It came from the lowering of a new operator I'm adding, but I'll see if I can reproduce it with an existing operator.
This produces the same error on the current master branch:

```python
import torch
import triton
import triton.language as tl


@triton.jit
def test_fn(out_ptr, a_ptr, workspace, M, N, M_BLOCK: tl.constexpr, N_BLOCK: tl.constexpr):
    # Build a TMA descriptor in the workspace buffer describing a [4, N_BLOCK] tile of a_ptr.
    desc_ptr = workspace
    tl.extra.cuda.experimental_device_tensormap_create2d(
        desc_ptr=desc_ptr,
        global_address=a_ptr,
        load_size=[4, N_BLOCK],
        global_size=[M, N],
        element_ty=a_ptr.dtype.element_ty,
    )
    tl.extra.cuda.experimental_tensormap_fenceproxy_acquire(desc_ptr)
    # Load the [4, N_BLOCK] tile through the descriptor and write it to out_ptr row-major.
    gather = tl._experimental_descriptor_load(desc_ptr, [0, 0], [4, N_BLOCK], a_ptr.dtype.element_ty)
    tl.store(out_ptr + tl.arange(0, 4)[:, None] * N_BLOCK + tl.arange(0, N_BLOCK)[None, :], gather)


out = torch.empty((4, 128), dtype=torch.float32, device="cuda")
inp = torch.arange(4 * 128, dtype=torch.float32, device="cuda").reshape(4, 128)
workspace = torch.empty(128, dtype=torch.uint8, device="cuda")
test_fn[(1,)](out, inp, workspace, 4, 128, 4, 128)
```
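For reference, the reproducer is just a round trip: the descriptor load reads the full 4x128 tile of `inp` and the store writes it into `out` unchanged, so once the lowering works the two tensors should match. A minimal check along those lines (my assumption about the expected result, not taken from the thread):

```python
# Assumes the reproducer above compiled and ran: the kernel copies inp -> out
# through the TMA descriptor, so the output should equal the input exactly.
torch.testing.assert_close(out, inp)
```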
I'll take a look today.
Reopening this as it seems the TMA hardware does support swizzling with only 4 rows of data. I get this result if it's helpful:
I think the problem is on this line.