-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support for multi-dim vectorization #4873
Conversation
…such that they are only applied to the Expr with same number of lanes.
* fuzz_simpify generates nested ramps/broadcasts * changes is_same_lane_num to is_same_type * some rules are disabled for now
* added is_scalar * fixed ramp lane count * enabled back all rules by guarding some which use 'lanes' with is_scalar
Where do we stand on this -- are we still pending other PRs? |
I already merged in within-vector reductions PR, so it should be good as soon as tests failing on build bots are fixed. I was going to take a look at it today. |
Cool -- no rush, just checking :-) |
src/VectorizeLoops.cpp
Outdated
@@ -191,7 +191,7 @@ struct InterleavedRamp { | |||
int lanes, repetitions; | |||
}; | |||
|
|||
bool is_interleaved_ramp(const Expr &e, const Scope<Expr> &scope, InterleavedRamp *result) { | |||
bool is_interleaved_ramp(const Expr &e, const std::map<string, Expr> &widened_vars, InterleavedRamp *result) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why was this changed from a Scope to a map?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the case when there are multiple vectorized loops, it's possible that the same let variable may be vectorized differently depending on the loop level it's used and the exact form may not be known when we first see it (for example, if it's used inside of the inner vectorized loop), so map is used to keep track of all possible variant which were used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm confused because a Scope is a map where you can have multiple values per key, and the most recently added value for the key hides the older ones. So a Scope should be strictly more general than a map.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Huh, I guess I misunderstood the way it works; based on name and some usages I saw, I assumed that it's more like named stack and push/pop should go in last-in-first-out order, but in this case push and pop may happen in different functions (and maybe in different orders with other lets), so I thought it won't work correctly. However, if it's map of stacks instead, I guess this shouldn't be an issue. I can change map to scope and track widened names for Let/LetStmt separately, would this be preferable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason to use a Scope instead is if the same name might show up multiple times in a way that a Scope would just handle correctly. Currently vectorization just happens on code with unique var names, but maybe it would be a useful to someone to be able to run it on stmts that don't obey that constraint. But if you're pushing/popping in non-stack order, then maybe that wouldn't work transparently. I need to figure out why you're pushing/popping these in non-stack order though to make sure I understand this code. I added a question about it elsewhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the explanation, I agree this is a good reason to use Scope! The reason I did it this way is because widened value of Let may change in the case when there is a vectorized loop inside of the Let body and this Let is used in there, so instead of mutating it in advance, it's mutated (sort of) lazily only when needed inside of the visit(const Variable *op) (see get_mutated_var_from_scope() function) to make sure that we've seen all of the vectorized loops. Maybe, as an alternative, it would be better to update all variables in scope every time we encounter a new vectorized loop, replace mutated variables in the scope, mutate loop body, and on return from the mutate(body), create widened variables if needed. That might be quite a bit of a change, but sounds like it's worth doing. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've updated it to use Scope again using approach described above, and I think the code is better now (there are fewer changes for sure), thanks for the the suggestions!
Periodic ping to see where this PR stands |
I believe it's stalled on me continuing my review. Sorry for the delay, I'm currently overloaded with things I have told people I will do. |
No worries, just making sure there isn't stuff that needed something trivial to unblock :-) |
All tests are green again. |
lgtm apart from my two minor comments. |
Thank you for reviewing! |
This PR adds support for multi-dimensional vectorization as described in #4616.
There are two main pieces here:
Notes/questions:
fuzz_simplify is failing for the type UInt(1) when vector width is more than 2 even before the changes, so I've disabled it for now. The error message is"Integer constant 2 will be implicitly coerced to type uint1, which changes its value to (uint1)0.
. I am going to look into it next, but figured I ask in case this is known or anyone has any suggestions.Otherwise, I think, it should be ready for review. Thanks!