[StableHLO] Reduce op failure #848

Closed
mmanzoorTT opened this issue Sep 30, 2024 · 5 comments

@mmanzoorTT
Contributor

StableHLO reduce op is failing due to issues with the constant op.

A sample test case can be found in the tt-xla PR.

@nsmithtt
Contributor

nsmithtt commented Oct 9, 2024

I'm not sure what the issue is from the description, or which component to send it to. Can you add more context?

@mmanzoorTT
Contributor Author

jnp.max(array) is lowered to

    %0 = stablehlo.constant dense<0xFF800000> : tensor<f32>
    %1 = stablehlo.reduce(%arg0 init: %0) applies stablehlo.maximum across dimensions = [0, 1] : (tensor<128x784xf32>, tensor<f32>) -> tensor<f32>

Similarly, jnp.sum(array) is lowered to

    %cst = stablehlo.constant dense<0.000000e+00> : tensor<f32>
    %0 = stablehlo.reduce(%arg0 init: %cst) applies stablehlo.add across dimensions = [0, 1] : (tensor<2x1xf32>, tensor<f32>) -> tensor<f32>

stablehlo.reduce requires a scalar constant for initialization, and scalar constant support is still in progress, so stablehlo.reduce is blocked. We have a StableHLO-to-TTIR conversion for stablehlo.reduce, but end-to-end tests (using JAX) fail due to issues with the scalar constant.
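For reference, here is a minimal sketch of how these lowerings can be reproduced (assuming a recent JAX release; the array shape is only illustrative) by dumping the StableHLO that jnp.max and jnp.sum lower to:

```
import jax
import jax.numpy as jnp

x = jnp.zeros((128, 784), dtype=jnp.float32)

# jnp.max lowers to stablehlo.reduce with stablehlo.maximum and a -inf init constant.
print(jax.jit(jnp.max).lower(x).as_text())

# jnp.sum lowers to stablehlo.reduce with stablehlo.add and a 0.0 init constant.
print(jax.jit(jnp.sum).lower(x).as_text())
```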

@mmanzoorTT
Contributor Author

TTNN has ttnn.max and ttnn.sum ops, which correspond to the stablehlo.reduce op. The TTNN versions do not require an initialization constant, and it gets removed as dead code, so the scalar constant is not an issue here.
The stablehlo.reduce conversion has a few bugs, which are fixed in #899.

@uazizTT
Contributor

uazizTT commented Oct 11, 2024

This seems to be a duplicate of #805

mrakitaTT added a commit that referenced this issue Dec 20, 2024
This PR adds TTNN workarounds for these Metal issues:
- tenstorrent/tt-metal#13361 - By decomposing
`reduce(keepDim=false)` into `reduce(keepDim=true) + reshape`
- tenstorrent/tt-metal#16118 - By annulling
dimensions argument when all dims are being reduced

As part of this work I've also:
- Enabled conversion of `stablehlo.reduce` op with multiple reduce
dimensions
- Added reduce ops verifiers in TTIR
- Added a separate function in TTNNWorkarounds to run rewrite patterns
for decomposition and layout workarounds
- Added lots of unit tests for reduce ops to cover conversions and
verifiers
- Added lots of silicon tests for reduce ops

Opened issue #1624 on
myself to revert these workarounds once Metal issues are fixed.

Closes #805, #848

After implementing these workarounds and running tests, I've encountered
[another Metal
issue](tenstorrent/tt-metal#16104), this time
in the `reshape` op. I've debugged it and have a local fix; I will send a
PR to fix it in the Metal repo (confirmed with the reshape op owners). I've
opened issue #1640 for myself to enable the reduce ops
silicon tests after this fix is uplifted.

Another issue I've encountered while working on this: when the workaround
pass decompositions change the shapes of the ops' tensors, their layouts
need to change too, but the layout pass runs before the workaround pass.
I've managed to solve it by reusing the layout of the input tensor, but I'm
not sure that is a good solution; we may need to repeat some of the layout
logic after the workaround decompositions. FYI @sdjordjevicTT

Here is the example TTNN IR before the workarounds:
```
%3 = "ttnn.sum"(%2) <{dim_arg = [0: i32, 1 : i32, 2: i32], keep_dim = false}> : (tensor<128x32x4xf32, #ttnn_layout2>) -> tensor<1xf32, #ttnn_layout2>
```

and after the workarounds:
```
%3 = "ttnn.sum"(%2) <{keep_dim = true}> : (tensor<128x32x4xf32, #ttnn_layout2>) -> tensor<1x1x1xf32, #ttnn_layout2>
%4 = "ttnn.reshape"(%3) <{shape = [1 : i32]}> : (tensor<1x1x1xf32, #ttnn_layout2>) -> tensor<1xf32, #ttnn_layout3>
```
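For intuition, a small JAX sketch (shapes taken from the IR above, purely illustrative) of the `reduce(keepDim=false)` into `reduce(keepDim=true) + reshape` equivalence that the workaround relies on:

```
import jax.numpy as jnp

x = jnp.ones((128, 32, 4), dtype=jnp.float32)

# Reducing with keep_dim = false over dims [0, 1, 2] ...
a = jnp.sum(x, axis=(0, 1, 2), keepdims=False)

# ... equals reducing with keep_dim = true and then reshaping to the
# keep_dim = false output shape.
b = jnp.sum(x, axis=(0, 1, 2), keepdims=True).reshape(a.shape)

assert jnp.array_equal(a, b)
```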
@mrakitaTT
Contributor

Closing as a duplicate of #805
