try_to_fold_vector_reduce<Call> has incorrect behavior #6883

Closed
rootjalex opened this issue Jul 25, 2022 · 0 comments · Fixed by #6896

In CodeGen_LLVM, we call try_to_fold_vector_reduce<Call> on saturating_add and saturating_sub calls without passing any information about whether the accumulation is an addition or a subtraction:

https://github.com/halide/Halide/blob/11a049c3967a277173e288ffd802f08ce1a1b78e/src/CodeGen_LLVM.cpp#L2835-#L2839

This seems like incorrect behavior. I noticed it while restructuring CodeGen_X86 into separate optimization and code generation passes: as the code is written, it appears that the accumulating saturating dot product instructions would trigger on both of these patterns:

saturating_sub(wild_i32x, VectorReduce(SaturatingAdd, factor=4, widening_mul(wild_i16x, wild_i16x)))
saturating_add(wild_i32x, VectorReduce(SaturatingAdd, factor=4, widening_mul(wild_i16x, wild_i16x)))
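To illustrate why the two patterns cannot share a single accumulating instruction, here is a minimal scalar sketch in plain C++. It does not use Halide or the actual CodeGen_LLVM code; the helper names (`saturate_i32`, `reduce4_saturating`, `accumulate_add`, `accumulate_sub`) are hypothetical and only model the semantics of the two expressions above for one output lane.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <limits>

// Clamp a 64-bit intermediate into the int32_t range (saturate instead of wrapping).
int32_t saturate_i32(int64_t v) {
    const int64_t lo = std::numeric_limits<int32_t>::min();
    const int64_t hi = std::numeric_limits<int32_t>::max();
    return static_cast<int32_t>(v < lo ? lo : (v > hi ? hi : v));
}

// Scalar model of VectorReduce(SaturatingAdd, factor=4, widening_mul(a, b))
// for a single output lane: four i16*i16 products, summed with saturation.
int32_t reduce4_saturating(const std::array<int16_t, 4> &a,
                           const std::array<int16_t, 4> &b) {
    int32_t sum = 0;
    for (int i = 0; i < 4; i++) {
        // The widening multiply cannot overflow int32_t for int16_t inputs.
        const int32_t product = int32_t(a[i]) * int32_t(b[i]);
        sum = saturate_i32(int64_t(sum) + int64_t(product));
    }
    return sum;
}

// Models saturating_add(acc, VectorReduce(...)).
int32_t accumulate_add(int32_t acc, const std::array<int16_t, 4> &a,
                       const std::array<int16_t, 4> &b) {
    return saturate_i32(int64_t(acc) + int64_t(reduce4_saturating(a, b)));
}

// Models saturating_sub(acc, VectorReduce(...)).
int32_t accumulate_sub(int32_t acc, const std::array<int16_t, 4> &a,
                       const std::array<int16_t, 4> &b) {
    return saturate_i32(int64_t(acc) - int64_t(reduce4_saturating(a, b)));
}

// Small self-check: the dot product here is 5 + 12 + 21 + 32 = 70.
void check_accumulation_differs() {
    const std::array<int16_t, 4> a = {1, 2, 3, 4};
    const std::array<int16_t, 4> b = {5, 6, 7, 8};
    assert(accumulate_add(100, a, b) == 170);
    assert(accumulate_sub(100, a, b) == 30);
}
```

In this model the two patterns give 170 and 30 for the same inputs, so a fold that ignores whether the outer call is saturating_add or saturating_sub would produce the wrong result for one of them.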