Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[TIR][Analysis][Arith] Implement basic data-flow analysis #13130

Merged
merged 8 commits into from
Nov 16, 2022

Conversation

Lunderberg
Copy link
Contributor

An optional utility to track known buffer values through a TIR PrimFunc, allowing simplifications based on known values.

This PR is currently marked as a draft, as it contains several independent utilities, each with their own PR. Each commit in this branch is marked with the corresponding PR.

@tvm-bot
Copy link
Collaborator

tvm-bot commented Oct 19, 2022

Thanks for contributing to TVM! Please refer to the contributing guidelines https://tvm.apache.org/docs/contribute/ for useful information and tips. Please request code reviews from Reviewers by @-ing them in a comment.

Generated by tvm-bot

@areusch areusch added needs-triage PRs or issues that need to be investigated by maintainers to find the right assignees to address it and removed needs-triage PRs or issues that need to be investigated by maintainers to find the right assignees to address it labels Oct 19, 2022
@Lunderberg Lunderberg force-pushed the dataflow_analysis branch 2 times, most recently from 8dd661f to f367b99 Compare October 19, 2022 15:47
@tqchen
Copy link
Member

tqchen commented Oct 19, 2022

thanks @Lunderberg

Buffer tracking analysis is something that is helpful, however logically at a higher level than the integer simplification. As a result, perhaps it is better to have things live in TIR/analysis and passes.

@Lunderberg
Copy link
Contributor Author

Hey @tqchen, and thank you! That would make sense from an organization perspective, and I'll move the src/arith/control_flow_graph.* to src/tir/analysis/control_flow_graph.*.

@Lunderberg
Copy link
Contributor Author

Rebasing to reflect which PRs are still included in this dev branch. Only #13023 and #13024 are needed before this PR can be marked as ready.

@Lunderberg Lunderberg force-pushed the dataflow_analysis branch 3 times, most recently from bcf0507 to fe1b114 Compare October 31, 2022 15:06
@Lunderberg Lunderberg force-pushed the dataflow_analysis branch 2 times, most recently from 4bce27e to 5e723ca Compare October 31, 2022 21:16
@Lunderberg
Copy link
Contributor Author

@tvm-bot rerun

@Lunderberg
Copy link
Contributor Author

Rebased onto main now that all prerequisite PRs are in, and marking as ready for review. No changes relative to the version that passed CI, so (fingers crossed) hopefully smooth sailing on CI.

@Lunderberg Lunderberg marked this pull request as ready for review November 4, 2022 18:05
@Lunderberg Lunderberg requested a review from csullivan November 4, 2022 18:05
@Lunderberg
Copy link
Contributor Author

A couple of notes to reviewers. These were things that I mentioned directly to @csullivan , but wanted to record them alongside the PR.

Copy link
Contributor

@tkonolige tkonolige left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general this look pretty good. Thanks @Lunderberg!

enum class AccessType {
Read,
Write,
Assume,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does Assume mean?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an assumption expressed as part of tir::builtin::assume() (T.assume() in TVMScript). So if a function contains if i<5: T.assume(buf[i] == 0), that would be expressed as with AccessType::Assume, with a predicate of i<5 and a known value of 0.

I've added documentation for each option, since it isn't immediate clear from the enum otherwise.

* Used to construct boolean expressions indicating whether the loop
* iteration that performs this touch has been reached.
*/
std::vector<std::pair<Var, PrimExpr>> loop_var_expressions;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the primexpr in each pair the extent of the loop var?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not quite. The PrimExpr is an expression for the loop_var in terms of the buffer axis variables. This lets the buffer touches track knowns across loop iterations (e.g. Making a predicate axis0 <= i to indicate that a value is known for this and previous loop iterations, provided that it isn't subsequently overwritten.)

for i in T.serial(16):
    # Introduces a known value, A[axis0] == i, for all axis0 <= i.
    A[i] = i

    # May be simplified to "x = i+1", since substituting {axis0: i}
    # results in i <= i, which is true.
    x = A[i] + 1

    # May not be simplified to "x = i+1", since substituting
    # {axis0: i+1} results in i+1 <= i, which is false.
    x = A[i+1] + 1

I've added to the docstring for loop_var_expressions to indicate what the PrimExpr is.

Comment on lines 97 to 99
PrimExpr BeforeLoopIteration() const;
PrimExpr AtLoopIteration() const;
PrimExpr AfterLoopIteration() const;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Document?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, and added.

*/
bool IsSubsetOf(const BufferTouch& other, arith::Analyzer* analyzer) const;

/* \brief Checks if this touch affects distinct indicates from another
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/* \brief Checks if this touch affects distinct indicates from another
/* \brief Checks if this touch affects distinct indices from another

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, and updated as suggested.

*/
bool IsDistinctFrom(const BufferTouch& other, arith::Analyzer* analyzer) const;

/* \brief Checks if this touch affects distinct indicates from another
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/* \brief Checks if this touch affects distinct indicates from another
/* \brief Checks if this touch affects distinct indices from another

unless indicates means something?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope, just a typo that was copy/pasted erroneously. Fixed here as well as suggested.

void ApplyTouches(const Map<Buffer, Array<Var>>& axis_var_lookup,
const std::vector<BufferTouch>& touch_points, arith::Analyzer* analyzer);

void BackpropUnusedIndices(const Map<Buffer, Array<Var>>& axis_var_lookup,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Document?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, and added.

std::vector<BufferTouch> constraints;
};

class ControlFlowGraph {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think having some basic documentation on what a control graph is would be helpful here.

Copy link
Contributor Author

@Lunderberg Lunderberg Nov 10, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, and added in process of adding documentation here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And added.

} else if (side_effect == tir::CallEffectKind::kReadState) {
buffer_exprs.push_back(expr);
} else {
LOG(FATAL) << "Assumption must be pure or read-only";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you print out the assumption and what its side effect kind is here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, and print statement added. I've also added a inline std::ostream& operator<<(std::ostream& os, CallEffectKind side_effect) alongside the definition of CallEffectKind, as it may be useful in other print statements.

Comment on lines 284 to 285
// This assumption is an inequality a data-dependent
// conditional. Not an error for this to occur, but also not
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// This assumption is an inequality a data-dependent
// conditional. Not an error for this to occur, but also not
// This assumption is an inequality on a data-dependent
// conditional. Not an error for this to occur, but also not

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, and updated.

@@ -1761,6 +1761,37 @@ PrimExpr RewriteSimplifier::Impl::VisitExpr_(const AndNode* op) {

TVM_TRY_REWRITE(x == c1 && x != c2, x == c1 && c1 != c2);
TVM_TRY_REWRITE(x != c2 && x == c1, x == c1 && c1 != c2);

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add tests for these. Tests using negative numbers too would be great.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call, and updated. There were a couple of the rewrite rules that had incorrect behavior for negative numerators, so those are now fixed.

@tqchen
Copy link
Member

tqchen commented Nov 10, 2022

cc @wrongtest-intellif who have some relevant expertises on the affected code

Copy link
Contributor

@tkonolige tkonolige left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The documentation looks great. Thank you @Lunderberg!

Copy link
Contributor

@csullivan csullivan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Many thanks for the data flow analysis and propagation @Lunderberg! Despite the advanced topic, the implementation is very well documented and accessible 🎉 .

I'm about half way through and will send one more set of comments after lunch, but wanted to send these now incase it helps asynchrony.

src/tir/analysis/control_flow_graph.cc Outdated Show resolved Hide resolved
if (num_previous_visits >= max_revisits) {
return BufferState();
}
ICHECK_LE(block.predecessors.size(), 2) << "Each block should have at most two predecessors";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why at most two? Is this for limiting loop dependency analysis? (1) the initial predecessor from loop start, and (2) from a previous loop?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: If my above comment ⬆️ was a correct interpretation, it might be worth documenting somewhere if there is a limitation imposed by the assumption that, or if not, why it is complete.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's correct, it is there to make the later implementation easier, as it only requires merging known values from two or fewer predecessors. There are cases which could be expressed as having three predecessors, but which the collection represents with more control flow blocks than would otherwise be required. For example,

for i in range(n):
    if A[i] == 0:
        A[i] = 1
    else:
        A[i] = 2

The after-loop block could be represented as having three predecessors, (1) the before-loop block when n==0, (2) the then-case during the last iteration, and (3) the else-case during the last iteration. However, the collector instead represents this with a after-conditional block whose predecessors are (2) and (3), with the after-loop having the after-conditional block and before-loop block as predecessors.

This is primarily an internal check, to validate that nothing in the collection step violated this later assumption. I've added a comment to clarify.

Comment on lines +257 to +258
a = key_false_;
b = GetKey(simplified);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this a bug fix? Do we have a test to reproduce if so?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not so much as bug as a performance improvement and consistency with some steps in the data-flow portions.

The performance improvement comes from this calling scope, which iterates across pairs of expressions, where b is always later in the loop. By placing the simplified expression in b instead of a, this allows later calls to TrySimplifyOr to further simplify using the simplified expression as an input. As a result, repeated passes across all pairs are avoided.

The consistency comes from the ControlFlowBlock's use of ExprDeepEqual to check for convergence. Because the rewrite simplifier doesn't produce a canonical form (e.g. The expressions i > 0 && j > 0 and j>0 && i>0 are equivalent, but neither will be simplified to the other), there were a few cases where I wanted to preserve a specific order when simplifying.

src/arith/conjunctive_normal_form.cc Show resolved Hide resolved
* When used as a constraint (e.g. in BufferState), should use
* Assume.
*/
AccessType touch_type{AccessType::Assume};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps it would be better to have an unknown access type than to default to Assume?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The default type here doesn't have an impact, as both of the constructors explicitly set a value for the touch_type. I mainly included a default to avoid any possibility of having uninitialized values. I'd want to avoid adding another touch type, since that would be an additional state that would need to be checked for when in-use, even though it shouldn't ever occur.

This class interface was partly a compromise while cleaning up the dev branch, as this one class serves two purposes. During collection, it is a 1:1 mapping of buffer touch points in TIR. During propagation, it tracks the known values in each control flow block. There was too much shared functionality for that to be a reasonable usage, but also not enough distinction to warrant pulling out the shared functionality into a common base class.

std::vector<BufferTouch> constraints;
};

/*! \brief Represents the flow of control through a `tir::Stmt`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fantastic documentation for building a mental model of the ControlFlowGraph structure and intention behind its design. Thanks!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you! (And thanks to @tkonolige for suggesting an overview portion of the documentation!)

src/tir/analysis/control_flow_graph.h Show resolved Hide resolved
src/tir/analysis/control_flow_graph.h Outdated Show resolved Hide resolved
src/tir/analysis/control_flow_graph.h Outdated Show resolved Hide resolved
Comment on lines 1538 to 1542
// The predicate can identify which predecessor block applies
// (e.g. i==0 for the first loop iteration, i>0 for remaining
// loop iterations). Therefore, we can use all buffer
// constraints, conditional on having come from the
// predecessor that provides it.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Update the comments to match the backprop case.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Given that the nice construction you've made with the control flow graph results in steps 1-3 being nearly identical between the forward and backward case, it would be nice to see the approaches share a common implementation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comments updated.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had tried to pull them into a common implementation, but it ended up needing to either have a few too many if/then cases for my comfort, or to need an entirely new level of abstraction to abstract over the direction of propagation.

post_a.Union(post_b, &analyzer);
return post_a;
} else {
// We don't know which predecessor applies. Therefore, the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: comments apply to successors not predecessors

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated here as well.

}
}

bool ControlFlowGraph::IsOverwrittenWithoutEffect(const tir::BufferStore& store,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this being used somewhere? I couldn't find it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It isn't right now, but will be used by #13217 for identifying no-op stores. It probably should have been in that PR instead of this one, but splitting things up into logically distinct PRs had been getting to be non-trivial.

local_analyzer.Bind(free_params);
local_analyzer.rewrite_simplify.SetEnabledExtensions(
RewriteSimplifier::kTransitivelyProveInequalities);
analyzer = &local_analyzer;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Setting the analyzer pointer to a stack reference looks like it could result in undefined behavior after this function returns. Perhaps just use the local_analyzer in the below simplifications?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call, and updated to only use local_analyzer. It wouldn't be undefined behavior, because nothing in the parent's scope is being overwritten, but it does entirely ignore the analyzer argument. This was a relic of the development, before the control-flow graph/block held enough context to set up the iterator ranges, and is no longer needed.

Thank you for the catch!

Comment on lines 261 to 262
// if i < 3: T.assume(buf[i] == value)
// T.assume(i>=3 or buf[i] == value)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎉

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you! (I also prepended these lines with Option 1: and Option 2:, to make it clear to a reader that these relate to the comment itself, and are not a weird form of embedded TVMScript in C++.)

Comment on lines +284 to +289
if (!as_equal_node) {
// This assumption is an inequality on a data-dependent
// conditional. Not an error for this to occur, but also not
// something that is currently supported.
return;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this doc hoisted somewhere more visible to user calling code so that it's clear that only assumptions on equality -- and explicitly not inequalities and not-equal -- are currently supported?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the data-dependent inequality occurs in T.assume(), it raises an error because the primary purpose of T.assume() is to provide information for this analysis. The comment applies specifically to scoped contexts that have a data-dependent conditional (e.g. if buf[i] > 10: ...), which are valid TIR but not something that are used.

I've added a description of this limitation to the docstring of ControlFlowGraph, that only fully-constrained values are tracked, so that the limitation is more visible. However, since it is entirely legal TIR to write these data-dependent inequalities, I don't think it should raise an error when encountering them.

@Lunderberg
Copy link
Contributor Author

Current CI failures are due to T.assume(expr) no longer being parsed by TVMScript on main. It looks like this is related to #12496. After that PR, any PrimExpr that occurs in the context of a statement is silently ignored, including T.assume. For now, updating the T.assume instances in this PR to instead use T.evaluate(T.assume(...)), and putting together a separate PR to avoid the silent dropping of expressions.

Copy link
Contributor

@csullivan csullivan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the great additions @Lunderberg, LGTM.

@csullivan csullivan changed the title [TIR][Arith] Implement basic data-flow analysis [TIR][Analysis][Arith] Implement basic data-flow analysis Nov 16, 2022
@csullivan csullivan merged commit a80cdc2 into apache:main Nov 16, 2022
@Lunderberg Lunderberg deleted the dataflow_analysis branch November 16, 2022 22:13
xinetzone pushed a commit to daobook/tvm that referenced this pull request Nov 25, 2022
An optional utility to track known buffer values through a TIR PrimFunc, allowing simplifications based on known values.

* Updated documentation following review comments

* Unit tests for rewrites, including negative numerators for div/mod

* Fix linting error

* Added brief description on what a control graph is

* Updates based on review comments

* Updated T.assume(expr) to T.evaluate(T.assume(expr))
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants