ExprMutator refactor & Normalizer #32

YuchenJin · 2021-11-04T18:15:54Z

Thanks for contributing to TVM! Please refer to guideline https://tvm.apache.org/docs/contribute/ for useful information and tips. After the pull request is submitted, please request code reviews from Reviewers by @ them in the pull request thread.

include/tvm/relax/utils.h

altanh

quick round of review 👀

include/tvm/ir/expr.h

include/tvm/relax/block_builder.h

altanh · 2021-11-04T19:27:01Z

include/tvm/relax/expr_functor.h

@@ -179,17 +179,10 @@ class ExprMutator : public ExprFunctor<Expr(const Expr&)> {
 public:
  ExprMutator() {
    name_table_ = std::make_shared<NameTable>();
+    // TODO (@yuchen, @altanh): should block builder expose name_table_ instead?


adding my offline thought here: I think to maximize the benefit of using NameTable (which is to prevent confusing name clashes for easier debugging), we would ideally have a single NameTable that persists across the entire compilation flow. One option is to embed it in the IRModule, or alternatively the PassContext.

src/relax/ir/expr_functor.cc

src/relax/transform/to_anf.cc

src/relax/transform/to_non_dataflow.cc

include/tvm/relax/block_builder.h

src/relax/ir/expr_functor.cc

include/tvm/relax/block_builder.h

include/tvm/relax/utils.h

src/relax/ir/block_builder.cc

tqchen · 2021-11-04T23:50:17Z

src/relax/ir/block_builder.cc

@@ -231,8 +454,10 @@ bool BlockBuilderNode::CanProveShapeEqual(const Expr& lhs, const Expr& rhs) {

 // TODO(@altanh, @yuchen): emit expr in ssa form
 Expr BlockBuilderNode::Normalize(const Expr& expr) {
-  if (expr.as<CallNode>()) {


Have a fast path for normalize, directly check if it is var, or call with var arguments

going to add a TODO instead since this isn't urgent

src/relax/ir/expr_functor.cc

src/relax/transform/call_dps_rewrite.cc

tqchen · 2021-11-08T16:31:03Z

include/tvm/relax/expr_functor.h

  template <typename T>
  T WithShapeAndType(T expr, Optional<ObjectRef> shape, Type type) {
    // make sure to not "forget" DataflowVars
-    if (!std::is_same<T, DataflowVar>::value && static_cast<ObjectRef>(expr).as<DataflowVarNode>()) {
+    if (!std::is_same<T, DataflowVar>::value &&
+        static_cast<ObjectRef>(expr).as<DataflowVarNode>()) {


while this is indeed more efficient, let us just use runtime polymorphism in this case and define the implementation in cc file.
Var WithShapeAndType(Var expr, Optional shape, Type type) {

the reason I did this was because we'll need a function like this for filling in the shapes and types of arbitrary exprs during forward inference (will add in subsequent PR). Is there a problem with this approach?

I see, I believe the usages of the two can be slightly different.

In this case Var WithShapeAndType(Var var, Optional<ObjectRef> shape, Type type), we will recreate the Var of the same type as Var if the shape and type does not equals to the result.

In the forward inference case, we can directly mutate the fields in the expr by leveraging idempotent property(and shape is undefined), but will not recreate a new Var if the fields mismatches.

So it might be good to split them into two functions

hmm from my understanding we would want the same behavior for general exprs. For example if we are running shape inference on add(x, x) and the shape of x changes, then we want to update the shape_ field of the add call node (which we can do by directly mutating if it's undefined, or creating a copy with the shape field mutated, which this function does).

Because of the immutable property, we cannot mutate the x.shape_ the original call add. In the mutator, if shape of x changes, then we the mutator must return a new value xx, and in VisitExpr of add, it detects that args[0] changes and there is a need to create a new Call node, this new call node do not have shape_ field and the shape needed to be populated during normalization

that makes sense to me, I was thinking in the context where the new Call node is made using CopyOnWrite()->args = {xx, xx} style (which would carry over the old shape). It's unfortunate if every node needs to be manually reconstructed I feel like

I see, in that case it is still helpful to copy on write and reset the shape/type fields as well, then pass to the function. Since otherwise the extra comparison will generate another Call node(other from the COW one). Alternatively, we can just pass in a flag saying please ignore and force override the fields. Either case, the semantics of the function is different enough that it deserves a separate function.

it's definitely a trade off between having users write

call.CopyOnWrite()->args = ... call->shape_ = NullOpt; call->checked_type_ = Type(); return call;

or trusting that the flag will not be used incorrectly and cause non-idempotent mutation, or just having an extra copy. In the interest of getting the PR merged I'll just make this Var like you said

In this case, instead write return Call(old_op->op, new_args, old_op->span) is not as bad

certainly, let us add this as a discussion item in our meeting agenda

include/tvm/relax/expr_functor.h

src/relax/ir/block_builder.cc

tqchen · 2021-11-09T22:13:47Z

include/tvm/relax/expr.h

@@ -440,6 +449,7 @@ class ExternFunc : public Expr {
 public:
  TVM_DLL ExternFunc(String global_symbol, Span span = Span());
  TVM_DEFINE_OBJECT_REF_METHODS(ExternFunc, Expr, ExternFuncNode);
+  TVM_DEFINE_OBJECT_REF_COW_METHOD(ExternFuncNode);


note for future iterations, perhaps it is worth to remove COW because the shape changing pattern issue

tqchen

did a careful read, some final nits and we are good to go

include/tvm/relax/expr_functor.h

include/tvm/relax/utils.h

src/relax/ir/block_builder.cc

src/relax/ir/expr_functor.cc

* fixes * revert checked_type visitor and fix relax usage * ExprNormalizer * fix that annoying bug and get tests passing * Memoization fix for the ExprMutator; separate VisitVarDef from use. * rebase. * rebase. * address part of comments. * address more comments * address more comments and add doc * address more comments * fix potential mutation bug * always assign normalized shape if can * address comments Co-authored-by: Altan Haan <[email protected]>

* [Pass] Fix concurrent modification in RemoveUnusedVars When running RemoveUnusedVars (i.e. remove_all_unused), in some cases the map users will raise Concurrent modification error. This commit fixed it by changing the logic to "iterate the map first and update it later". * change the algorithm by store keys first * Add an ICHECK before getting map value * change the information of the ICHECK

(tlc-pack#57) * [FIX][Pass] concurrent modification in RemoveUnusedVars (tlc-pack#32) * [Pass] Fix concurrent modification in RemoveUnusedVars When running RemoveUnusedVars (i.e. remove_all_unused), in some cases the map users will raise Concurrent modification error. This commit fixed it by changing the logic to "iterate the map first and update it later". * change the algorithm by store keys first * Add an ICHECK before getting map value * change the information of the ICHECK * fix find include path (tlc-pack#42) Co-authored-by: Hongyi Jin <[email protected]> * [BUG] ExternFunc is not considered in attach_global_symbol.cc (tlc-pack#48) * [Cherry-Pick] Minor fix for TaskScheduler and VerifyGPUCode (tlc-pack#54) * [Fix] Task scheduler error prompt upon build/run failure (#13601) * [Fix] Use proper target in VerifyGPUCode (#13548) Previously, the VerifyGPUCode post-processor uses hardcoded target `Target("cuda")` for applying pass LowerIntrin. This is a bit problematic since the actual target can be other GPU target (e.g., Metal). Therefore, this PR changes the hardcoded target to be the actual target. Co-authored-by: Chaosfan <[email protected]> Co-authored-by: Hongyi Jin <[email protected]> Co-authored-by: Hongyi Jin <[email protected]> Co-authored-by: Bohan Hou <[email protected]>