-
Notifications
You must be signed in to change notification settings - Fork 58
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
quick round of review 👀
include/tvm/relax/expr_functor.h
Outdated
@@ -179,17 +179,10 @@ class ExprMutator : public ExprFunctor<Expr(const Expr&)> { | |||
public: | |||
ExprMutator() { | |||
name_table_ = std::make_shared<NameTable>(); | |||
// TODO (@yuchen, @altanh): should block builder expose name_table_ instead? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
adding my offline thought here: I think to maximize the benefit of using NameTable
(which is to prevent confusing name clashes for easier debugging), we would ideally have a single NameTable
that persists across the entire compilation flow. One option is to embed it in the IRModule, or alternatively the PassContext.
@@ -231,8 +454,10 @@ bool BlockBuilderNode::CanProveShapeEqual(const Expr& lhs, const Expr& rhs) { | |||
|
|||
// TODO(@altanh, @yuchen): emit expr in ssa form | |||
Expr BlockBuilderNode::Normalize(const Expr& expr) { | |||
if (expr.as<CallNode>()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have a fast path for normalize, directly check if it is var, or call with var arguments
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
going to add a TODO instead since this isn't urgent
include/tvm/relax/expr_functor.h
Outdated
template <typename T> | ||
T WithShapeAndType(T expr, Optional<ObjectRef> shape, Type type) { | ||
// make sure to not "forget" DataflowVars | ||
if (!std::is_same<T, DataflowVar>::value && static_cast<ObjectRef>(expr).as<DataflowVarNode>()) { | ||
if (!std::is_same<T, DataflowVar>::value && | ||
static_cast<ObjectRef>(expr).as<DataflowVarNode>()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
while this is indeed more efficient, let us just use runtime polymorphism in this case and define the implementation in cc file.
Var WithShapeAndType(Var expr, Optional shape, Type type) {
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the reason I did this was because we'll need a function like this for filling in the shapes and types of arbitrary exprs during forward inference (will add in subsequent PR). Is there a problem with this approach?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, I believe the usages of the two can be slightly different.
- In this case
Var WithShapeAndType(Var var, Optional<ObjectRef> shape, Type type)
, we will recreate the Var of the same type as Var if the shape and type does not equals to the result. - In the forward inference case, we can directly mutate the fields in the expr by leveraging idempotent property(and shape is undefined), but will not recreate a new Var if the fields mismatches.
So it might be good to split them into two functions
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm from my understanding we would want the same behavior for general exprs. For example if we are running shape inference on add(x, x)
and the shape of x
changes, then we want to update the shape_
field of the add
call node (which we can do by directly mutating if it's undefined, or creating a copy with the shape field mutated, which this function does).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because of the immutable property, we cannot mutate the x.shape_
the original call add
. In the mutator, if shape of x
changes, then we the mutator must return a new value xx
, and in VisitExpr of add
, it detects that args[0]
changes and there is a need to create a new Call node, this new call node do not have shape_
field and the shape needed to be populated during normalization
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that makes sense to me, I was thinking in the context where the new Call node is made using CopyOnWrite()->args = {xx, xx}
style (which would carry over the old shape). It's unfortunate if every node needs to be manually reconstructed I feel like
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, in that case it is still helpful to copy on write and reset the shape/type fields as well, then pass to the function. Since otherwise the extra comparison will generate another Call node(other from the COW one). Alternatively, we can just pass in a flag saying please ignore and force override the fields. Either case, the semantics of the function is different enough that it deserves a separate function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's definitely a trade off between having users write
call.CopyOnWrite()->args = ...
call->shape_ = NullOpt;
call->checked_type_ = Type();
return call;
or trusting that the flag will not be used incorrectly and cause non-idempotent mutation, or just having an extra copy. In the interest of getting the PR merged I'll just make this Var like you said
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this case, instead write return Call(old_op->op, new_args, old_op->span)
is not as bad
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
certainly, let us add this as a discussion item in our meeting agenda
@@ -440,6 +449,7 @@ class ExternFunc : public Expr { | |||
public: | |||
TVM_DLL ExternFunc(String global_symbol, Span span = Span()); | |||
TVM_DEFINE_OBJECT_REF_METHODS(ExternFunc, Expr, ExternFuncNode); | |||
TVM_DEFINE_OBJECT_REF_COW_METHOD(ExternFuncNode); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
note for future iterations, perhaps it is worth to remove COW because the shape changing pattern issue
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
did a careful read, some final nits and we are good to go
* fixes * revert checked_type visitor and fix relax usage * ExprNormalizer * fix that annoying bug and get tests passing * Memoization fix for the ExprMutator; separate VisitVarDef from use. * rebase. * rebase. * address part of comments. * address more comments * address more comments and add doc * address more comments * fix potential mutation bug * always assign normalized shape if can * address comments Co-authored-by: Altan Haan <[email protected]>
* fixes * revert checked_type visitor and fix relax usage * ExprNormalizer * fix that annoying bug and get tests passing * Memoization fix for the ExprMutator; separate VisitVarDef from use. * rebase. * rebase. * address part of comments. * address more comments * address more comments and add doc * address more comments * fix potential mutation bug * always assign normalized shape if can * address comments Co-authored-by: Altan Haan <[email protected]>
* fixes * revert checked_type visitor and fix relax usage * ExprNormalizer * fix that annoying bug and get tests passing * Memoization fix for the ExprMutator; separate VisitVarDef from use. * rebase. * rebase. * address part of comments. * address more comments * address more comments and add doc * address more comments * fix potential mutation bug * always assign normalized shape if can * address comments Co-authored-by: Altan Haan <[email protected]>
* fixes * revert checked_type visitor and fix relax usage * ExprNormalizer * fix that annoying bug and get tests passing * Memoization fix for the ExprMutator; separate VisitVarDef from use. * rebase. * rebase. * address part of comments. * address more comments * address more comments and add doc * address more comments * fix potential mutation bug * always assign normalized shape if can * address comments Co-authored-by: Altan Haan <[email protected]>
* fixes * revert checked_type visitor and fix relax usage * ExprNormalizer * fix that annoying bug and get tests passing * Memoization fix for the ExprMutator; separate VisitVarDef from use. * rebase. * rebase. * address part of comments. * address more comments * address more comments and add doc * address more comments * fix potential mutation bug * always assign normalized shape if can * address comments Co-authored-by: Altan Haan <[email protected]>
* [Pass] Fix concurrent modification in RemoveUnusedVars When running RemoveUnusedVars (i.e. remove_all_unused), in some cases the map users will raise Concurrent modification error. This commit fixed it by changing the logic to "iterate the map first and update it later". * change the algorithm by store keys first * Add an ICHECK before getting map value * change the information of the ICHECK
* [Pass] Fix concurrent modification in RemoveUnusedVars When running RemoveUnusedVars (i.e. remove_all_unused), in some cases the map users will raise Concurrent modification error. This commit fixed it by changing the logic to "iterate the map first and update it later". * change the algorithm by store keys first * Add an ICHECK before getting map value * change the information of the ICHECK
* [Pass] Fix concurrent modification in RemoveUnusedVars When running RemoveUnusedVars (i.e. remove_all_unused), in some cases the map users will raise Concurrent modification error. This commit fixed it by changing the logic to "iterate the map first and update it later". * change the algorithm by store keys first * Add an ICHECK before getting map value * change the information of the ICHECK
* [Pass] Fix concurrent modification in RemoveUnusedVars When running RemoveUnusedVars (i.e. remove_all_unused), in some cases the map users will raise Concurrent modification error. This commit fixed it by changing the logic to "iterate the map first and update it later". * change the algorithm by store keys first * Add an ICHECK before getting map value * change the information of the ICHECK
(tlc-pack#57) * [FIX][Pass] concurrent modification in RemoveUnusedVars (tlc-pack#32) * [Pass] Fix concurrent modification in RemoveUnusedVars When running RemoveUnusedVars (i.e. remove_all_unused), in some cases the map users will raise Concurrent modification error. This commit fixed it by changing the logic to "iterate the map first and update it later". * change the algorithm by store keys first * Add an ICHECK before getting map value * change the information of the ICHECK * fix find include path (tlc-pack#42) Co-authored-by: Hongyi Jin <[email protected]> * [BUG] ExternFunc is not considered in attach_global_symbol.cc (tlc-pack#48) * [Cherry-Pick] Minor fix for TaskScheduler and VerifyGPUCode (tlc-pack#54) * [Fix] Task scheduler error prompt upon build/run failure (#13601) * [Fix] Use proper target in VerifyGPUCode (#13548) Previously, the VerifyGPUCode post-processor uses hardcoded target `Target("cuda")` for applying pass LowerIntrin. This is a bit problematic since the actual target can be other GPU target (e.g., Metal). Therefore, this PR changes the hardcoded target to be the actual target. Co-authored-by: Chaosfan <[email protected]> Co-authored-by: Hongyi Jin <[email protected]> Co-authored-by: Hongyi Jin <[email protected]> Co-authored-by: Bohan Hou <[email protected]>
(tlc-pack#57) * [FIX][Pass] concurrent modification in RemoveUnusedVars (tlc-pack#32) * [Pass] Fix concurrent modification in RemoveUnusedVars When running RemoveUnusedVars (i.e. remove_all_unused), in some cases the map users will raise Concurrent modification error. This commit fixed it by changing the logic to "iterate the map first and update it later". * change the algorithm by store keys first * Add an ICHECK before getting map value * change the information of the ICHECK * fix find include path (tlc-pack#42) Co-authored-by: Hongyi Jin <[email protected]> * [BUG] ExternFunc is not considered in attach_global_symbol.cc (tlc-pack#48) * [Cherry-Pick] Minor fix for TaskScheduler and VerifyGPUCode (tlc-pack#54) * [Fix] Task scheduler error prompt upon build/run failure (#13601) * [Fix] Use proper target in VerifyGPUCode (#13548) Previously, the VerifyGPUCode post-processor uses hardcoded target `Target("cuda")` for applying pass LowerIntrin. This is a bit problematic since the actual target can be other GPU target (e.g., Metal). Therefore, this PR changes the hardcoded target to be the actual target. Co-authored-by: Chaosfan <[email protected]> Co-authored-by: Hongyi Jin <[email protected]> Co-authored-by: Hongyi Jin <[email protected]> Co-authored-by: Bohan Hou <[email protected]>
(tlc-pack#57) * [FIX][Pass] concurrent modification in RemoveUnusedVars (tlc-pack#32) * [Pass] Fix concurrent modification in RemoveUnusedVars When running RemoveUnusedVars (i.e. remove_all_unused), in some cases the map users will raise Concurrent modification error. This commit fixed it by changing the logic to "iterate the map first and update it later". * change the algorithm by store keys first * Add an ICHECK before getting map value * change the information of the ICHECK * fix find include path (tlc-pack#42) Co-authored-by: Hongyi Jin <[email protected]> * [BUG] ExternFunc is not considered in attach_global_symbol.cc (tlc-pack#48) * [Cherry-Pick] Minor fix for TaskScheduler and VerifyGPUCode (tlc-pack#54) * [Fix] Task scheduler error prompt upon build/run failure (#13601) * [Fix] Use proper target in VerifyGPUCode (#13548) Previously, the VerifyGPUCode post-processor uses hardcoded target `Target("cuda")` for applying pass LowerIntrin. This is a bit problematic since the actual target can be other GPU target (e.g., Metal). Therefore, this PR changes the hardcoded target to be the actual target. Co-authored-by: Chaosfan <[email protected]> Co-authored-by: Hongyi Jin <[email protected]> Co-authored-by: Hongyi Jin <[email protected]> Co-authored-by: Bohan Hou <[email protected]>
(tlc-pack#57) * [FIX][Pass] concurrent modification in RemoveUnusedVars (tlc-pack#32) * [Pass] Fix concurrent modification in RemoveUnusedVars When running RemoveUnusedVars (i.e. remove_all_unused), in some cases the map users will raise Concurrent modification error. This commit fixed it by changing the logic to "iterate the map first and update it later". * change the algorithm by store keys first * Add an ICHECK before getting map value * change the information of the ICHECK * fix find include path (tlc-pack#42) Co-authored-by: Hongyi Jin <[email protected]> * [BUG] ExternFunc is not considered in attach_global_symbol.cc (tlc-pack#48) * [Cherry-Pick] Minor fix for TaskScheduler and VerifyGPUCode (tlc-pack#54) * [Fix] Task scheduler error prompt upon build/run failure (#13601) * [Fix] Use proper target in VerifyGPUCode (#13548) Previously, the VerifyGPUCode post-processor uses hardcoded target `Target("cuda")` for applying pass LowerIntrin. This is a bit problematic since the actual target can be other GPU target (e.g., Metal). Therefore, this PR changes the hardcoded target to be the actual target. Co-authored-by: Chaosfan <[email protected]> Co-authored-by: Hongyi Jin <[email protected]> Co-authored-by: Hongyi Jin <[email protected]> Co-authored-by: Bohan Hou <[email protected]>
* fixes * revert checked_type visitor and fix relax usage * ExprNormalizer * fix that annoying bug and get tests passing * Memoization fix for the ExprMutator; separate VisitVarDef from use. * rebase. * rebase. * address part of comments. * address more comments * address more comments and add doc * address more comments * fix potential mutation bug * always assign normalized shape if can * address comments Co-authored-by: Altan Haan <[email protected]>
* fixes * revert checked_type visitor and fix relax usage * ExprNormalizer * fix that annoying bug and get tests passing * Memoization fix for the ExprMutator; separate VisitVarDef from use. * rebase. * rebase. * address part of comments. * address more comments * address more comments and add doc * address more comments * fix potential mutation bug * always assign normalized shape if can * address comments Co-authored-by: Altan Haan <[email protected]>
(tlc-pack#57) * [FIX][Pass] concurrent modification in RemoveUnusedVars (tlc-pack#32) * [Pass] Fix concurrent modification in RemoveUnusedVars When running RemoveUnusedVars (i.e. remove_all_unused), in some cases the map users will raise Concurrent modification error. This commit fixed it by changing the logic to "iterate the map first and update it later". * change the algorithm by store keys first * Add an ICHECK before getting map value * change the information of the ICHECK * fix find include path (tlc-pack#42) Co-authored-by: Hongyi Jin <[email protected]> * [BUG] ExternFunc is not considered in attach_global_symbol.cc (tlc-pack#48) * [Cherry-Pick] Minor fix for TaskScheduler and VerifyGPUCode (tlc-pack#54) * [Fix] Task scheduler error prompt upon build/run failure (#13601) * [Fix] Use proper target in VerifyGPUCode (#13548) Previously, the VerifyGPUCode post-processor uses hardcoded target `Target("cuda")` for applying pass LowerIntrin. This is a bit problematic since the actual target can be other GPU target (e.g., Metal). Therefore, this PR changes the hardcoded target to be the actual target. Co-authored-by: Chaosfan <[email protected]> Co-authored-by: Hongyi Jin <[email protected]> Co-authored-by: Hongyi Jin <[email protected]> Co-authored-by: Bohan Hou <[email protected]>
* fixes * revert checked_type visitor and fix relax usage * ExprNormalizer * fix that annoying bug and get tests passing * Memoization fix for the ExprMutator; separate VisitVarDef from use. * rebase. * rebase. * address part of comments. * address more comments * address more comments and add doc * address more comments * fix potential mutation bug * always assign normalized shape if can * address comments Co-authored-by: Altan Haan <[email protected]>
* fixes * revert checked_type visitor and fix relax usage * ExprNormalizer * fix that annoying bug and get tests passing * Memoization fix for the ExprMutator; separate VisitVarDef from use. * rebase. * rebase. * address part of comments. * address more comments * address more comments and add doc * address more comments * fix potential mutation bug * always assign normalized shape if can * address comments Co-authored-by: Altan Haan <[email protected]>
* fixes * revert checked_type visitor and fix relax usage * ExprNormalizer * fix that annoying bug and get tests passing * Memoization fix for the ExprMutator; separate VisitVarDef from use. * rebase. * rebase. * address part of comments. * address more comments * address more comments and add doc * address more comments * fix potential mutation bug * always assign normalized shape if can * address comments Co-authored-by: Altan Haan <[email protected]>
(tlc-pack#57) * [FIX][Pass] concurrent modification in RemoveUnusedVars (tlc-pack#32) * [Pass] Fix concurrent modification in RemoveUnusedVars When running RemoveUnusedVars (i.e. remove_all_unused), in some cases the map users will raise Concurrent modification error. This commit fixed it by changing the logic to "iterate the map first and update it later". * change the algorithm by store keys first * Add an ICHECK before getting map value * change the information of the ICHECK * fix find include path (tlc-pack#42) Co-authored-by: Hongyi Jin <[email protected]> * [BUG] ExternFunc is not considered in attach_global_symbol.cc (tlc-pack#48) * [Cherry-Pick] Minor fix for TaskScheduler and VerifyGPUCode (tlc-pack#54) * [Fix] Task scheduler error prompt upon build/run failure (#13601) * [Fix] Use proper target in VerifyGPUCode (#13548) Previously, the VerifyGPUCode post-processor uses hardcoded target `Target("cuda")` for applying pass LowerIntrin. This is a bit problematic since the actual target can be other GPU target (e.g., Metal). Therefore, this PR changes the hardcoded target to be the actual target. Co-authored-by: Chaosfan <[email protected]> Co-authored-by: Hongyi Jin <[email protected]> Co-authored-by: Hongyi Jin <[email protected]> Co-authored-by: Bohan Hou <[email protected]>
(tlc-pack#57) * [FIX][Pass] concurrent modification in RemoveUnusedVars (tlc-pack#32) * [Pass] Fix concurrent modification in RemoveUnusedVars When running RemoveUnusedVars (i.e. remove_all_unused), in some cases the map users will raise Concurrent modification error. This commit fixed it by changing the logic to "iterate the map first and update it later". * change the algorithm by store keys first * Add an ICHECK before getting map value * change the information of the ICHECK * fix find include path (tlc-pack#42) Co-authored-by: Hongyi Jin <[email protected]> * [BUG] ExternFunc is not considered in attach_global_symbol.cc (tlc-pack#48) * [Cherry-Pick] Minor fix for TaskScheduler and VerifyGPUCode (tlc-pack#54) * [Fix] Task scheduler error prompt upon build/run failure (#13601) * [Fix] Use proper target in VerifyGPUCode (#13548) Previously, the VerifyGPUCode post-processor uses hardcoded target `Target("cuda")` for applying pass LowerIntrin. This is a bit problematic since the actual target can be other GPU target (e.g., Metal). Therefore, this PR changes the hardcoded target to be the actual target. Co-authored-by: Chaosfan <[email protected]> Co-authored-by: Hongyi Jin <[email protected]> Co-authored-by: Hongyi Jin <[email protected]> Co-authored-by: Bohan Hou <[email protected]>
* fixes * revert checked_type visitor and fix relax usage * ExprNormalizer * fix that annoying bug and get tests passing * Memoization fix for the ExprMutator; separate VisitVarDef from use. * rebase. * rebase. * address part of comments. * address more comments * address more comments and add doc * address more comments * fix potential mutation bug * always assign normalized shape if can * address comments Co-authored-by: Altan Haan <[email protected]>
* fixes * revert checked_type visitor and fix relax usage * ExprNormalizer * fix that annoying bug and get tests passing * Memoization fix for the ExprMutator; separate VisitVarDef from use. * rebase. * rebase. * address part of comments. * address more comments * address more comments and add doc * address more comments * fix potential mutation bug * always assign normalized shape if can * address comments Co-authored-by: Altan Haan <[email protected]>
(tlc-pack#57) * [FIX][Pass] concurrent modification in RemoveUnusedVars (tlc-pack#32) * [Pass] Fix concurrent modification in RemoveUnusedVars When running RemoveUnusedVars (i.e. remove_all_unused), in some cases the map users will raise Concurrent modification error. This commit fixed it by changing the logic to "iterate the map first and update it later". * change the algorithm by store keys first * Add an ICHECK before getting map value * change the information of the ICHECK * fix find include path (tlc-pack#42) Co-authored-by: Hongyi Jin <[email protected]> * [BUG] ExternFunc is not considered in attach_global_symbol.cc (tlc-pack#48) * [Cherry-Pick] Minor fix for TaskScheduler and VerifyGPUCode (tlc-pack#54) * [Fix] Task scheduler error prompt upon build/run failure (#13601) * [Fix] Use proper target in VerifyGPUCode (#13548) Previously, the VerifyGPUCode post-processor uses hardcoded target `Target("cuda")` for applying pass LowerIntrin. This is a bit problematic since the actual target can be other GPU target (e.g., Metal). Therefore, this PR changes the hardcoded target to be the actual target. Co-authored-by: Chaosfan <[email protected]> Co-authored-by: Hongyi Jin <[email protected]> Co-authored-by: Hongyi Jin <[email protected]> Co-authored-by: Bohan Hou <[email protected]>
* fixes * revert checked_type visitor and fix relax usage * ExprNormalizer * fix that annoying bug and get tests passing * Memoization fix for the ExprMutator; separate VisitVarDef from use. * rebase. * rebase. * address part of comments. * address more comments * address more comments and add doc * address more comments * fix potential mutation bug * always assign normalized shape if can * address comments Co-authored-by: Altan Haan <[email protected]>
* fixes * revert checked_type visitor and fix relax usage * ExprNormalizer * fix that annoying bug and get tests passing * Memoization fix for the ExprMutator; separate VisitVarDef from use. * rebase. * rebase. * address part of comments. * address more comments * address more comments and add doc * address more comments * fix potential mutation bug * always assign normalized shape if can * address comments Co-authored-by: Altan Haan <[email protected]>
* fixes * revert checked_type visitor and fix relax usage * ExprNormalizer * fix that annoying bug and get tests passing * Memoization fix for the ExprMutator; separate VisitVarDef from use. * rebase. * rebase. * address part of comments. * address more comments * address more comments and add doc * address more comments * fix potential mutation bug * always assign normalized shape if can * address comments Co-authored-by: Altan Haan <[email protected]>
* fixes * revert checked_type visitor and fix relax usage * ExprNormalizer * fix that annoying bug and get tests passing * Memoization fix for the ExprMutator; separate VisitVarDef from use. * rebase. * rebase. * address part of comments. * address more comments * address more comments and add doc * address more comments * fix potential mutation bug * always assign normalized shape if can * address comments Co-authored-by: Altan Haan <[email protected]>
* fixes * revert checked_type visitor and fix relax usage * ExprNormalizer * fix that annoying bug and get tests passing * Memoization fix for the ExprMutator; separate VisitVarDef from use. * rebase. * rebase. * address part of comments. * address more comments * address more comments and add doc * address more comments * fix potential mutation bug * always assign normalized shape if can * address comments Co-authored-by: Altan Haan <[email protected]>
* fixes * revert checked_type visitor and fix relax usage * ExprNormalizer * fix that annoying bug and get tests passing * Memoization fix for the ExprMutator; separate VisitVarDef from use. * rebase. * rebase. * address part of comments. * address more comments * address more comments and add doc * address more comments * fix potential mutation bug * always assign normalized shape if can * address comments Co-authored-by: Altan Haan <[email protected]>
(tlc-pack#57) * [FIX][Pass] concurrent modification in RemoveUnusedVars (tlc-pack#32) * [Pass] Fix concurrent modification in RemoveUnusedVars When running RemoveUnusedVars (i.e. remove_all_unused), in some cases the map users will raise Concurrent modification error. This commit fixed it by changing the logic to "iterate the map first and update it later". * change the algorithm by store keys first * Add an ICHECK before getting map value * change the information of the ICHECK * fix find include path (tlc-pack#42) Co-authored-by: Hongyi Jin <[email protected]> * [BUG] ExternFunc is not considered in attach_global_symbol.cc (tlc-pack#48) * [Cherry-Pick] Minor fix for TaskScheduler and VerifyGPUCode (tlc-pack#54) * [Fix] Task scheduler error prompt upon build/run failure (#13601) * [Fix] Use proper target in VerifyGPUCode (#13548) Previously, the VerifyGPUCode post-processor uses hardcoded target `Target("cuda")` for applying pass LowerIntrin. This is a bit problematic since the actual target can be other GPU target (e.g., Metal). Therefore, this PR changes the hardcoded target to be the actual target. Co-authored-by: Chaosfan <[email protected]> Co-authored-by: Hongyi Jin <[email protected]> Co-authored-by: Hongyi Jin <[email protected]> Co-authored-by: Bohan Hou <[email protected]>
(tlc-pack#57) * [FIX][Pass] concurrent modification in RemoveUnusedVars (tlc-pack#32) * [Pass] Fix concurrent modification in RemoveUnusedVars When running RemoveUnusedVars (i.e. remove_all_unused), in some cases the map users will raise Concurrent modification error. This commit fixed it by changing the logic to "iterate the map first and update it later". * change the algorithm by store keys first * Add an ICHECK before getting map value * change the information of the ICHECK * fix find include path (tlc-pack#42) Co-authored-by: Hongyi Jin <[email protected]> * [BUG] ExternFunc is not considered in attach_global_symbol.cc (tlc-pack#48) * [Cherry-Pick] Minor fix for TaskScheduler and VerifyGPUCode (tlc-pack#54) * [Fix] Task scheduler error prompt upon build/run failure (#13601) * [Fix] Use proper target in VerifyGPUCode (#13548) Previously, the VerifyGPUCode post-processor uses hardcoded target `Target("cuda")` for applying pass LowerIntrin. This is a bit problematic since the actual target can be other GPU target (e.g., Metal). Therefore, this PR changes the hardcoded target to be the actual target. Co-authored-by: Chaosfan <[email protected]> Co-authored-by: Hongyi Jin <[email protected]> Co-authored-by: Hongyi Jin <[email protected]> Co-authored-by: Bohan Hou <[email protected]>
Thanks for contributing to TVM! Please refer to guideline https://tvm.apache.org/docs/contribute/ for useful information and tips. After the pull request is submitted, please request code reviews from Reviewers by @ them in the pull request thread.