-
Notifications
You must be signed in to change notification settings - Fork 3.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[TIR][Analysis][Arith] Implement basic data-flow analysis #13130
Conversation
Thanks for contributing to TVM! Please refer to the contributing guidelines https://tvm.apache.org/docs/contribute/ for useful information and tips. Please request code reviews from Reviewers by @-ing them in a comment.
Generated by tvm-bot |
8dd661f
to
f367b99
Compare
thanks @Lunderberg Buffer tracking analysis is something that is helpful, however logically at a higher level than the integer simplification. As a result, perhaps it is better to have things live in TIR/analysis and passes. |
Hey @tqchen, and thank you! That would make sense from an organization perspective, and I'll move the |
f04d4f9
to
74eb399
Compare
bcf0507
to
fe1b114
Compare
4bce27e
to
5e723ca
Compare
@tvm-bot rerun |
5e723ca
to
8af309b
Compare
Rebased onto main now that all prerequisite PRs are in, and marking as ready for review. No changes relative to the version that passed CI, so (fingers crossed) hopefully smooth sailing on CI. |
A couple of notes to reviewers. These were things that I mentioned directly to @csullivan , but wanted to record them alongside the PR.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general this look pretty good. Thanks @Lunderberg!
enum class AccessType { | ||
Read, | ||
Write, | ||
Assume, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does Assume
mean?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is an assumption expressed as part of tir::builtin::assume()
(T.assume()
in TVMScript). So if a function contains if i<5: T.assume(buf[i] == 0)
, that would be expressed as with AccessType::Assume
, with a predicate of i<5
and a known value of 0
.
I've added documentation for each option, since it isn't immediate clear from the enum otherwise.
* Used to construct boolean expressions indicating whether the loop | ||
* iteration that performs this touch has been reached. | ||
*/ | ||
std::vector<std::pair<Var, PrimExpr>> loop_var_expressions; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the primexpr in each pair the extent of the loop var?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not quite. The PrimExpr is an expression for the loop_var
in terms of the buffer axis variables. This lets the buffer touches track knowns across loop iterations (e.g. Making a predicate axis0 <= i
to indicate that a value is known for this and previous loop iterations, provided that it isn't subsequently overwritten.)
for i in T.serial(16):
# Introduces a known value, A[axis0] == i, for all axis0 <= i.
A[i] = i
# May be simplified to "x = i+1", since substituting {axis0: i}
# results in i <= i, which is true.
x = A[i] + 1
# May not be simplified to "x = i+1", since substituting
# {axis0: i+1} results in i+1 <= i, which is false.
x = A[i+1] + 1
I've added to the docstring for loop_var_expressions
to indicate what the PrimExpr
is.
PrimExpr BeforeLoopIteration() const; | ||
PrimExpr AtLoopIteration() const; | ||
PrimExpr AfterLoopIteration() const; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Document?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you, and added.
*/ | ||
bool IsSubsetOf(const BufferTouch& other, arith::Analyzer* analyzer) const; | ||
|
||
/* \brief Checks if this touch affects distinct indicates from another |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/* \brief Checks if this touch affects distinct indicates from another | |
/* \brief Checks if this touch affects distinct indices from another |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you, and updated as suggested.
*/ | ||
bool IsDistinctFrom(const BufferTouch& other, arith::Analyzer* analyzer) const; | ||
|
||
/* \brief Checks if this touch affects distinct indicates from another |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/* \brief Checks if this touch affects distinct indicates from another | |
/* \brief Checks if this touch affects distinct indices from another |
unless indicates means something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nope, just a typo that was copy/pasted erroneously. Fixed here as well as suggested.
void ApplyTouches(const Map<Buffer, Array<Var>>& axis_var_lookup, | ||
const std::vector<BufferTouch>& touch_points, arith::Analyzer* analyzer); | ||
|
||
void BackpropUnusedIndices(const Map<Buffer, Array<Var>>& axis_var_lookup, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Document?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you, and added.
std::vector<BufferTouch> constraints; | ||
}; | ||
|
||
class ControlFlowGraph { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think having some basic documentation on what a control graph is would be helpful here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, and added in process of adding documentation here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And added.
} else if (side_effect == tir::CallEffectKind::kReadState) { | ||
buffer_exprs.push_back(expr); | ||
} else { | ||
LOG(FATAL) << "Assumption must be pure or read-only"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you print out the assumption and what its side effect kind is here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, and print statement added. I've also added a inline std::ostream& operator<<(std::ostream& os, CallEffectKind side_effect)
alongside the definition of CallEffectKind
, as it may be useful in other print statements.
// This assumption is an inequality a data-dependent | ||
// conditional. Not an error for this to occur, but also not |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// This assumption is an inequality a data-dependent | |
// conditional. Not an error for this to occur, but also not | |
// This assumption is an inequality on a data-dependent | |
// conditional. Not an error for this to occur, but also not |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you, and updated.
@@ -1761,6 +1761,37 @@ PrimExpr RewriteSimplifier::Impl::VisitExpr_(const AndNode* op) { | |||
|
|||
TVM_TRY_REWRITE(x == c1 && x != c2, x == c1 && c1 != c2); | |||
TVM_TRY_REWRITE(x != c2 && x == c1, x == c1 && c1 != c2); | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add tests for these. Tests using negative numbers too would be great.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good call, and updated. There were a couple of the rewrite rules that had incorrect behavior for negative numerators, so those are now fixed.
cc @wrongtest-intellif who have some relevant expertises on the affected code |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The documentation looks great. Thank you @Lunderberg!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Many thanks for the data flow analysis and propagation @Lunderberg! Despite the advanced topic, the implementation is very well documented and accessible 🎉 .
I'm about half way through and will send one more set of comments after lunch, but wanted to send these now incase it helps asynchrony.
if (num_previous_visits >= max_revisits) { | ||
return BufferState(); | ||
} | ||
ICHECK_LE(block.predecessors.size(), 2) << "Each block should have at most two predecessors"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why at most two? Is this for limiting loop dependency analysis? (1) the initial predecessor from loop start, and (2) from a previous loop?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: If my above comment ⬆️ was a correct interpretation, it might be worth documenting somewhere if there is a limitation imposed by the assumption that, or if not, why it is complete.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's correct, it is there to make the later implementation easier, as it only requires merging known values from two or fewer predecessors. There are cases which could be expressed as having three predecessors, but which the collection represents with more control flow blocks than would otherwise be required. For example,
for i in range(n):
if A[i] == 0:
A[i] = 1
else:
A[i] = 2
The after-loop block could be represented as having three predecessors, (1) the before-loop block when n==0
, (2) the then-case during the last iteration, and (3) the else-case during the last iteration. However, the collector instead represents this with a after-conditional block whose predecessors are (2) and (3), with the after-loop having the after-conditional block and before-loop block as predecessors.
This is primarily an internal check, to validate that nothing in the collection step violated this later assumption. I've added a comment to clarify.
a = key_false_; | ||
b = GetKey(simplified); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this a bug fix? Do we have a test to reproduce if so?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not so much as bug as a performance improvement and consistency with some steps in the data-flow portions.
The performance improvement comes from this calling scope, which iterates across pairs of expressions, where b
is always later in the loop. By placing the simplified expression in b
instead of a
, this allows later calls to TrySimplifyOr
to further simplify using the simplified expression as an input. As a result, repeated passes across all pairs are avoided.
The consistency comes from the ControlFlowBlock
's use of ExprDeepEqual
to check for convergence. Because the rewrite simplifier doesn't produce a canonical form (e.g. The expressions i > 0 && j > 0
and j>0 && i>0
are equivalent, but neither will be simplified to the other), there were a few cases where I wanted to preserve a specific order when simplifying.
* When used as a constraint (e.g. in BufferState), should use | ||
* Assume. | ||
*/ | ||
AccessType touch_type{AccessType::Assume}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps it would be better to have an unknown access type than to default to Assume
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The default type here doesn't have an impact, as both of the constructors explicitly set a value for the touch_type
. I mainly included a default to avoid any possibility of having uninitialized values. I'd want to avoid adding another touch type, since that would be an additional state that would need to be checked for when in-use, even though it shouldn't ever occur.
This class interface was partly a compromise while cleaning up the dev branch, as this one class serves two purposes. During collection, it is a 1:1 mapping of buffer touch points in TIR. During propagation, it tracks the known values in each control flow block. There was too much shared functionality for that to be a reasonable usage, but also not enough distinction to warrant pulling out the shared functionality into a common base class.
std::vector<BufferTouch> constraints; | ||
}; | ||
|
||
/*! \brief Represents the flow of control through a `tir::Stmt` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fantastic documentation for building a mental model of the ControlFlowGraph structure and intention behind its design. Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you! (And thanks to @tkonolige for suggesting an overview portion of the documentation!)
// The predicate can identify which predecessor block applies | ||
// (e.g. i==0 for the first loop iteration, i>0 for remaining | ||
// loop iterations). Therefore, we can use all buffer | ||
// constraints, conditional on having come from the | ||
// predecessor that provides it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Update the comments to match the backprop case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Given that the nice construction you've made with the control flow graph results in steps 1-3 being nearly identical between the forward and backward case, it would be nice to see the approaches share a common implementation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comments updated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had tried to pull them into a common implementation, but it ended up needing to either have a few too many if/then cases for my comfort, or to need an entirely new level of abstraction to abstract over the direction of propagation.
post_a.Union(post_b, &analyzer); | ||
return post_a; | ||
} else { | ||
// We don't know which predecessor applies. Therefore, the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: comments apply to successors not predecessors
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated here as well.
} | ||
} | ||
|
||
bool ControlFlowGraph::IsOverwrittenWithoutEffect(const tir::BufferStore& store, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this being used somewhere? I couldn't find it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It isn't right now, but will be used by #13217 for identifying no-op stores. It probably should have been in that PR instead of this one, but splitting things up into logically distinct PRs had been getting to be non-trivial.
local_analyzer.Bind(free_params); | ||
local_analyzer.rewrite_simplify.SetEnabledExtensions( | ||
RewriteSimplifier::kTransitivelyProveInequalities); | ||
analyzer = &local_analyzer; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Setting the analyzer pointer to a stack reference looks like it could result in undefined behavior after this function returns. Perhaps just use the local_analyzer in the below simplifications?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good call, and updated to only use local_analyzer
. It wouldn't be undefined behavior, because nothing in the parent's scope is being overwritten, but it does entirely ignore the analyzer
argument. This was a relic of the development, before the control-flow graph/block held enough context to set up the iterator ranges, and is no longer needed.
Thank you for the catch!
// if i < 3: T.assume(buf[i] == value) | ||
// T.assume(i>=3 or buf[i] == value) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🎉
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you! (I also prepended these lines with Option 1:
and Option 2:
, to make it clear to a reader that these relate to the comment itself, and are not a weird form of embedded TVMScript in C++.)
if (!as_equal_node) { | ||
// This assumption is an inequality on a data-dependent | ||
// conditional. Not an error for this to occur, but also not | ||
// something that is currently supported. | ||
return; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this doc hoisted somewhere more visible to user calling code so that it's clear that only assumptions on equality -- and explicitly not inequalities and not-equal -- are currently supported?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the data-dependent inequality occurs in T.assume()
, it raises an error because the primary purpose of T.assume()
is to provide information for this analysis. The comment applies specifically to scoped contexts that have a data-dependent conditional (e.g. if buf[i] > 10: ...
), which are valid TIR but not something that are used.
I've added a description of this limitation to the docstring of ControlFlowGraph
, that only fully-constrained values are tracked, so that the limitation is more visible. However, since it is entirely legal TIR to write these data-dependent inequalities, I don't think it should raise an error when encountering them.
Current CI failures are due to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the great additions @Lunderberg, LGTM.
An optional utility to track known buffer values through a TIR PrimFunc, allowing simplifications based on known values. * Updated documentation following review comments * Unit tests for rewrites, including negative numerators for div/mod * Fix linting error * Added brief description on what a control graph is * Updates based on review comments * Updated T.assume(expr) to T.evaluate(T.assume(expr))
An optional utility to track known buffer values through a TIR PrimFunc, allowing simplifications based on known values.
This PR is currently marked as a draft, as it contains several independent utilities, each with their own PR. Each commit in this branch is marked with the corresponding PR.