Trivial code change causes vectorization failure #30933
Comments
Since they both inline equally well, this decision will come down to the legality/cost model in LLVM. Easiest way to figure that out is to dump out the LLVM IR of the one that doesn't optimize and run it through opt with |
Are you suggesting that the difference is a failure of LLVM to optimize the IR? To me (who admittedly doesn't understand much of this), it appears the difference is from Julia generating explicitly vectorized IR for |
correct, Julia does not have a vectorizer. The output of |
Hmm.. but isn't the following vectorized to run over 4 values at a time? (Taken from the output of
Nothing like that appears in |
Yes, as I said, |
Ah, sorry, I thought you meant "the output of code_llvm will then be sent to LLVM to be optimized". Thanks for clearing up my confusion.
Ah, sorry, I see how that could be misread.
To see the IR that Julia produces you can use |
Running with
Fast version:
I wanted to try passing the IR to a more recent version of LLVM opt, but I couldn't figure out a way to get Julia to output complete IR.
Take a look at https://docs.julialang.org/en/v1/devdocs/llvm/index.html#Debugging-LLVM-transformations-in-isolation-1. Since this is getting fairly off-topic, feel free to ping me on Slack if you get stuck.
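For readers following along, here is a minimal sketch of one way to dump a complete, unoptimized IR module to a file. It assumes a recent Julia 1.x in which code_llvm accepts the raw, dump_module and optimize keywords. The function sumsq and the output file name are hypothetical stand-ins, not the functions from this issue.

```julia
using InteractiveUtils  # provides code_llvm outside the REPL

# Hypothetical stand-in for the function under investigation
# (not the function from the original report).
function sumsq(x::Vector{Float64})
    s = 0.0
    @inbounds for f in x
        s += f * f
    end
    return s
end

# Capture the IR as text. dump_module=true emits the whole module
# rather than just the function body, optimize=false skips the LLVM
# optimization passes, and raw=true keeps metadata and debug info.
io = IOBuffer()
code_llvm(io, sumsq, Tuple{Vector{Float64}};
          raw=true, dump_module=true, optimize=false)
ir = String(take!(io))

# Write the module to a temporary directory so it can be handed
# to external LLVM tools.
path = joinpath(mktempdir(), "sumsq.ll")
write(path, ir)
```

The resulting .ll file can then be passed to LLVM's opt tool, ideally one built from the same LLVM version that Julia uses, since textual IR is not guaranteed to be readable across LLVM releases.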
Thanks to @vchuravy's help, I think I've tracked down the issue. Using the unoptimized IR, I was able to reproduce the vectorization failure with LLVM 6:
If I remove the |
The first function below fails to vectorize, but if I make a trivial change (expand s = f*f) as shown in the second function, then Julia produces nicely vectorized LLVM IR. I don't see any reason why they both shouldn't vectorize, so I think it might be a bug with the Julia optimizer.