-
Notifications
You must be signed in to change notification settings - Fork 3.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SVE] Support scalable vectors in LoopVectorizer #16782
Conversation
cc the usual suspects @lhutton1 @leandron @Anndrey24 @tqchen @Lunderberg |
The implementation looks reasonable, though I have one main question for it: What is the behavior of the updated pass for a target that doesn't support SVE? Prior SVE-commits enabled the functionality, but didn't produce SVE in any of the default lowering passes. From this line, versions of LLVM before 11.0 do not support SVE, nor from my brief reading of the CUDA codegen here does CUDA. Since |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work, there are plenty of non-trivial changes here! Thanks for tidying up the tests as well :) The comments are largely just nitpicks 😅
Thank you for your feedback @Lunderberg, much appreciated!
When it comes to targets that don't support SVE, I'd expect these targets to not trigger the creation of scalable vectors. In the current plan the creation of scalable vectors has to be intentional, i.e. it comes from splitting an axis by a
so I'll modify the patch such that it checks for a target and fails as before if the extent is not an int.
In principle I'm not against making vectorizing for scalable vectors functionality more explicitly target specific, but it is not obvious to me what that would mean in terms of code? |
Trying to reason loudly on this,
See, so the schedules are already target dependent but the underlaying execution of such schedule is not. |
That's a good point. This only applies for cases that would currently produce an error, where a I'm convinced, and agree that this change doesn't introduce any errors that would not already have been errors from an upstream pass. We may still want to have validation at this step, to avoid making errors be more complicated downstream, but that can be added later as needed. |
This patch add support for turning loops marked for vectorizing into scalable vectors if the extent of the loop is a vscale dependent expression in a correct form. The testing for both scalable and fixed length vectors in test_tir_transform.py has been extended and most of the tests have been converted to TVMScript based testing against expected output. Co-authored-by: Luke Hutton <[email protected]> Co-authored-by: Neil Hickey <[email protected]>
e2fd300
to
477b89a
Compare
Thanks for all the reviews and discussion! The latest version now includes changes as a response to @lhutton1 review. I also looked into using the target info in the LoopVectorizer and discovered several issues, which I think will need fixes in separate patches:
So a bit of a can of worms there. I agree with @Lunderberg that in its current form the pass would not introduce errors that would have not been errors in other places of the stack, so I suggest we deal with the target detection issues separately and solidify |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the updates @ekalda, LGTM!
Thanks @ekalda @Lunderberg @cbalint13! Let's investigate adding the target dependent error message in a follow-up |
This patch add support for turning loops marked for vectorizing into scalable vectors if the extent of the loop is a vscale dependent expression in a correct form.
The testing for both scalable and fixed length vectors in test_tir_transform.py has been extended and most of the tests have been converted to TVMScript based testing against expected output.