Revert "Propagate reshapes through generics with reduction… #18968
Revert "Propagate reshapes through generics with reduction… #18968
Conversation
Commits: caece27 to ac014c1
Update: this should still be reverted, but @Groverkss is also going to work on making the spec more robust for attention.
…ree-org#18857)" This reverts commit 78481a6. Signed-off-by: Ian Wood <[email protected]>
…g#18968) This reverts commit 8d3faf8. Signed-off-by: Ian Wood <[email protected]>
…#18968) …(iree-org#18857)" This regresses sdxl int8 perf by increasing the dimensionality of `attention` ops, which messes with the attention spec. Revert this for now and reland once `CollapseDimensionsPass` can handle attention. This reverts commit 78481a6. Signed-off-by: Ian Wood <[email protected]> Signed-off-by: Giacomo Serafini <[email protected]>
Reland after fixing sdxl int8 regressions via #19012. Running CI revealed further performance regressions that have pending patches: #19325 and #19326. This reverts commit 8d3faf8. --------- Signed-off-by: Ian Wood <[email protected]>
Oh oh, this causes quite a big regression in llama_8b_fp16.
At this commit (a5c3879)
Before this commit (80cbf6b)
I'd suspect that it's causing reduction/matmul ops to get expanded, but they can't get collapsed because they are dynamic (see the sketch after this exchange). Do you have the traces and/or dispatch counts for the runs?
I can give you a full repro in the morning.
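To make the suspicion above concrete, here is a minimal, purely illustrative sketch; it is not taken from this PR, and the value names, shapes, and the specific op are hypothetical. It shows the kind of reduction that reshape propagation leaves in an expanded (3-D) form with fully dynamic sizes, which `CollapseDimensionsPass` then cannot collapse back to 2-D.

```mlir
// Hypothetical IR for illustration only. After reshape propagation the
// reduction operates on a 3-D operand. With static extents the two leading
// dimensions could be collapsed back into one, but with `?` extents the
// collapse cannot be proven safe, so the higher-dimensional op survives
// into the dispatch.
#map_in  = affine_map<(d0, d1, d2) -> (d0, d1, d2)>
#map_out = affine_map<(d0, d1, d2) -> (d0, d1)>

%sum = linalg.generic
    {indexing_maps = [#map_in, #map_out],
     iterator_types = ["parallel", "parallel", "reduction"]}
    ins(%expanded : tensor<?x?x?xf32>)
    outs(%acc : tensor<?x?xf32>) {
^bb0(%x: f32, %a: f32):
  %r = arith.addf %x, %a : f32
  linalg.yield %r : f32
} -> tensor<?x?xf32>
```

If the extents were static, the expand/collapse reshapes around the op could be folded and the reduction would stay 2-D, which would explain why the regression shows up on the dynamically shaped llama_8b_fp16 run.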
…19647) This reverts commit a5c3879. Seems like there is a regression due to dynamic shapes: #18968 (comment). Signed-off-by: MaheshRavishankar <[email protected]>
…(#18857)"
This regresses sdxl int8 perf by increasing the dimensionality of
attention
ops which messes with the attention spec. Revert this for now and reland onceCollapseDimensionsPass
can handle attention.This reverts commit 78481a6.