From ef07206ebb1aab8d1fa56bae8c0f97e7c6de3df0 Mon Sep 17 00:00:00 2001
From: Min Chen
Date: Mon, 13 Feb 2023 06:32:39 +0000
Subject: [PATCH 1/4] Improve representation of IntSet.

---
 rfcs/0099-improve-IntSet-representation.md | 67 ++++++++++++++++++++++
 1 file changed, 67 insertions(+)
 create mode 100644 rfcs/0099-improve-IntSet-representation.md

diff --git a/rfcs/0099-improve-IntSet-representation.md b/rfcs/0099-improve-IntSet-representation.md
new file mode 100644
index 00000000..e8568b7d
--- /dev/null
+++ b/rfcs/0099-improve-IntSet-representation.md
@@ -0,0 +1,67 @@
+- Feature Name: Improve Representation of IntSet
+- Start Date: 2023-02-13
+- RFC PR: https://github.com/apache/tvm-rfcs/pull/99/
+- GitHub Issue: https://github.com/apache/tvm/pull/0000
+
+# Summary
+It would be great if TVMScript can grow into a generic programming language in machine learning domain. To reach that, it seems some powerful analysis tools are needed. Integer set is pivotal in IR analyzing, but IntSet in TVM only represents ranges. This RFC is to seek an improvement for it so that we can perform the IR analysis more precisely. We found the Presburger Set in MLIR library could be leveraged for this purpose.
+# Motivation
+Current dependence analysis is carried out roughly and inconveniently. Due to the absence of necessary basic infrastructure, it's hard to consider complex if conditions in IR, and it's even more challenging to do element-wise dependence analysis, both for inside or between TIR block analysis.
+## Inner block analysis
+One goal of TVMScript is to provide an easy & flexible way to construct computation workload for both TVM compiler developers and machine learning algorithm developers. TVMScript requires users to annotate some extra information when programming, such as `T.block`/`T.remap`/`T.init`..., which seem not very intuitive if developers do not have deep compiler knowledge.
If IR analysis can help users annotate this kind of information automatically, it would be a tremendous programming option for many cases. All the above analysis requires a better data dependency analysis, which needs a more sophisticated integer set utility. If we can analyze the element-wise dependency between different loop instances, it should be easy to detect the spatial/reduce iteration axis automatically. Take the following loop statement for example, how to easily analyze whether axis `i` is spatial or not, without element-wise dependency analysis?
+```
+for i, j, k, m, n in grid(37, 23, 40, 57, 60):
+    if 3*m + 7*n < 58 and 45*k + 77*j >= 34:
+        B[i*3324 + j*23 + k*103 + m*279] = A[i, j, k, m, n]
+```
+If we can detect the reduce axis, `T.init` pattern detection should not be a problem, then. Of course, this auto-detection should only work as an option because TVM developers may sometimes still want to handcraft some complex blocks, such as block nesting.
+## Interblock analysis
+Most TIR primitives and passes need to analyze the data dependency between producer & consumer blocks. Without sufficient utility, it's hard to consider complex `if` conditions. Analysis without if-conditions leads to a rough dependency result and causes redundant data transfer & computation. The redundant workload could be negligible for CPU/GPU, but it could be painful for NPU, for which extra data is devastating both for DMA and computation. An if-condition-aware integer set should solve this problem. And even if there is no `if` condition in blocks, `T.Read`/`T.Write` is needed when constructing the workload, which also can be easily inferred from IR stmt if a better IntSet exists.
Here is an example:
+```
+@T.prim_func
+def if_func(a: T.handle, c: T.handle) -> None:
+    A = T.match_buffer(a, (60,), "float32")
+    C = T.match_buffer(c, (20, 20), "float32")
+    with T.block():
+        B = T.alloc_buffer((60,), dtype="float32")
+        for i in T.grid(60,):
+            with T.block():
+                B[i] = A[i]
+        for i, j in T.grid(20, 20):
+            with T.block():
+                if i + j < 20 and i - j <= 0:
+                    C[i, j] = B[i + 2*j]
+                else:
+                    C[i, j] = 0.0
+```
+How to determine the maximum range of `B` and do compact buffer range for it? Shape of `B` only needs (39,) and currently CompactBufferAllocation does nothing on this. If we can determine the range automatically, the `T.Read`/`T.Write` annotations could be saved.
+# Guide-level explanation
+The proposal is to reconstruct IntSet class. The key point is to support inequation constraints in IntSet to consider if-conditions. The inequation constraints are exprs on multiple Vars. The existing member functions of IntSet will be kept but reimplemented, so it doesn't impact any previous analyzing works. But inside the new IntSet, inequations are used to represent integer sets. An additional IntSet constructor function will be added to construct from inequations:
+```
+IntSet FromConstraints(Array inequations)
+```
+In order to manage the analysis, we need to separate the vars in all the inequations into at least two kinds, the iteration(or domain) vars and other vars, say local(target) vars. The new IntSet keeps the relationship between iteration vars and local vars, from iteration to local vars or vice versa. Some other utility functions are needed to transform the relationship, including:
+```
+IntSet reverse()
+```
+ Reverse the relationship from local vars to iteration vars. So we can further analyze dependency based on read/write sets.
+```
+IntSet apply_iteration()/apply_local()
+```
+ Merge two relationships targeting the iteration vars or local vars. Then we can propagate the relationship between multiple sets.
+```
+IntSet solve_bounds(PrimExpr expr)
+```
+ Prove engine to solve the maximum/minimum optimization problem based on the inequations in IntSet, such as simplex solver. Input parameter expr is the target expression of optimization problem.
+
+Other existing API, like intersect/union etc, will be reimplemented based on updated data structure.
+# Reference-level explanation
+You may have already noticed that this is just what other modern integer set library provides, like ISL. So an economical way to achieve this is to leverage existing public wheels. ISL is mostly used, but it seems not modular enough or open enough, so it could be difficult to integrate deeply. Presburger Set located in MLIR is modular designed and open developed. So building the new IntSet starting from it would be a good choice.
+
+No need to introduce MLIR as a source code submodule. Installing LLVM prebuilt package installs the necessary libs of Presburger Set, so MLIR can be integrated into TVM just like LLVM codegen uses LLVM libs, and it can be switched on/off on demand. The new-added util function needs to check whether MLIR is installed when called and falls back to the interval set when MLIR is not found.
+# Drawbacks
+The IntSet serves as the basic infrastructure of IR analysis, and a wide range of lowering passes/primitives depends on it. Switching it off may limit the analysis if the new-added utils are widely used in the future. In that case, the analysis will be downgraded to using the former interval set.
+# Alternatives
+The other way is to handcraft a copy of code similar to Presburger Set in MLIR, which minimizes the software dependence of TVM project, but it needs considerable effort and seems like reinventing the wheel, if there is no extra new idea to implement.
+
+Reimplementing IntSet could impact broadly, and risk exists even if there is an on/off switch and off by default. Another way is to implement a different class to leave IntSet alone.
But analysis code should determine which one to use, and this could be difficult to decide sometimes.
\ No newline at end of file
From a0c86f4fd3f0560fccd556de94b782d125520cc8 Mon Sep 17 00:00:00 2001
From: Min Chen
Date: Tue, 14 Feb 2023 07:50:14 +0000
Subject: [PATCH 2/4] Implement another class PresburgerSet.

---
 rfcs/0099-improve-IntSet-representation.md | 22 +++++++++++-----------
 1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/rfcs/0099-improve-IntSet-representation.md b/rfcs/0099-improve-IntSet-representation.md
index e8568b7d..ffe2b78d 100644
--- a/rfcs/0099-improve-IntSet-representation.md
+++ b/rfcs/0099-improve-IntSet-representation.md
@@ -36,32 +36,32 @@ def if_func(a: T.handle, c: T.handle) -> None:
 ```
 How to determine the maximum range of `B` and do compact buffer range for it? Shape of `B` only needs (39,) and currently CompactBufferAllocation does nothing on this. If we can determine the range automatically, the `T.Read`/`T.Write` annotations could be saved.
 # Guide-level explanation
-The proposal is to reconstruct IntSet class. The key point is to support inequation constraints in IntSet to consider if-conditions. The inequation constraints are exprs on multiple Vars. The existing member functions of IntSet will be kept but reimplemented, so it doesn't impact any previous analyzing works. But inside the new IntSet, inequations are used to represent integer sets. An additional IntSet constructor function will be added to construct from inequations:
+The proposal is to implement a `PresburgerSet` class. The key point is to support inequation constraints to consider if-conditions, so the inequation on multiple Vars will mainly be used to express the sets. The basic set manipulation functions and constructor functions in `IntSet` class will be reimplemented in `PresburgerSet`.
An additional constructor function will be added to construct from inequations:
 ```
-IntSet FromConstraints(Array inequations)
+PresburgerSet FromConstraints(Array inequations)
 ```
-In order to manage the analysis, we need to separate the vars in all the inequations into at least two kinds, the iteration(or domain) vars and other vars, say local(target) vars. The new IntSet keeps the relationship between iteration vars and local vars, from iteration to local vars or vice versa. Some other utility functions are needed to transform the relationship, including:
+In order to manage the analysis, we need to separate the vars in all the inequations into at least two kinds, the iteration(or domain) vars and other vars, say local(target) vars. `PresburgerSet` keeps the relationship between iteration vars and local vars, from iteration to local vars or vice versa. Some other utility functions are needed to transform the relationship, including:
 ```
-IntSet reverse()
+PresburgerSet reverse()
 ```
  Reverse the relationship from local vars to iteration vars. So we can further analyze dependency based on read/write sets.
 ```
-IntSet apply_iteration()/apply_local()
+PresburgerSet apply_iteration()/apply_local()
 ```
  Merge two relationships targeting the iteration vars or local vars. Then we can propagate the relationship between multiple sets.
 ```
-IntSet solve_bounds(PrimExpr expr)
+PresburgerSet solve_bounds(PrimExpr expr)
 ```
- Prove engine to solve the maximum/minimum optimization problem based on the inequations in IntSet, such as simplex solver. Input parameter expr is the target expression of optimization problem.
+ Prove engine to solve the maximum/minimum optimization problem based on the inequations in `PresburgerSet`, such as simplex solver. Input parameter expr is the target expression of optimization problem.
 
 Other existing API, like intersect/union etc, will be reimplemented based on updated data structure.
 # Reference-level explanation
-You may have already noticed that this is just what other modern integer set library provides, like ISL. So an economical way to achieve this is to leverage existing public wheels. ISL is mostly used, but it seems not modular enough or open enough, so it could be difficult to integrate deeply. Presburger Set located in MLIR is modular designed and open developed. So building the new IntSet starting from it would be a good choice.
+You may have already noticed that this is just what other modern integer set library provides, like ISL. So an economical way to achieve this is to leverage existing public wheels. ISL is mostly used, but it seems not modular enough or open enough, so it could be difficult to integrate deeply. Presburger Set located in MLIR is modular designed and open developed. So building from it would be a good choice.
 
 No need to introduce MLIR as a source code submodule. Installing LLVM prebuilt package installs the necessary libs of Presburger Set, so MLIR can be integrated into TVM just like LLVM codegen uses LLVM libs, and it can be switched on/off on demand. The new-added util function needs to check whether MLIR is installed when called and falls back to the interval set when MLIR is not found.
 # Drawbacks
-The IntSet serves as the basic infrastructure of IR analysis, and a wide range of lowering passes/primitives depends on it. Switching it off may limit the analysis if the new-added utils are widely used in the future. In that case, the analysis will be downgraded to using the former interval set.
+The `PresburgerSet` serves as the basic infrastructure of IR analysis, and a wide range of lowering passes/primitives may need it. Part of its functionality is duplicated with `IntSet`, so people should make a decision which one to use according to the analysis task.
 # Alternatives
 The other way is to handcraft a copy of code similar to Presburger Set in MLIR, which minimizes the software dependence of TVM project, but it needs considerable effort and seems like reinventing the wheel, if there is no extra new idea to implement.
-
-Reimplementing IntSet could impact broadly, and risk exists even if there is an on/off switch and off by default. Another way is to implement a different class to leave IntSet alone. But analysis code should determine which one to use, and this could be difficult to decide sometimes.
\ No newline at end of file
+# Future possibilities
+One day, when we make sure `PresburgerSet` can fully cover what `IntSet` provides, in terms of functionality and efficiency, we may consider phasing out the legacy IntSet, then no more decisions about `PresburgerSet` and `IntSet`.
\ No newline at end of file
From 682d4aa5e16275de0a675ebbc598c795b8655ffc Mon Sep 17 00:00:00 2001
From: Min Chen
Date: Tue, 14 Feb 2023 13:55:39 +0000
Subject: [PATCH 3/4] Change RFC title as Introduce PresburgerSet

---
 ...IntSet-representation.md => 0099-introduce-PresburgerSet.md} | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
 rename rfcs/{0099-improve-IntSet-representation.md => 0099-introduce-PresburgerSet.md} (99%)

diff --git a/rfcs/0099-improve-IntSet-representation.md b/rfcs/0099-introduce-PresburgerSet.md
similarity index 99%
rename from rfcs/0099-improve-IntSet-representation.md
rename to rfcs/0099-introduce-PresburgerSet.md
index ffe2b78d..ed1331a7 100644
--- a/rfcs/0099-improve-IntSet-representation.md
+++ b/rfcs/0099-introduce-PresburgerSet.md
@@ -1,4 +1,4 @@
-- Feature Name: Improve Representation of IntSet
+- Feature Name: Introduce PresburgerSet
 - Start Date: 2023-02-13
 - RFC PR: https://github.com/apache/tvm-rfcs/pull/99/
 - GitHub Issue: https://github.com/apache/tvm/pull/0000
From a41660a0dbd25e9d34a614e2bf55e5892f1a92a5 Mon Sep 17 00:00:00 2001
From: Min Chen
Date: Thu, 16 Feb 2023 01:50:50 +0000
Subject: [PATCH 4/4]
Update tracking issue number.

---
 rfcs/0099-introduce-PresburgerSet.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/rfcs/0099-introduce-PresburgerSet.md b/rfcs/0099-introduce-PresburgerSet.md
index ed1331a7..a8617aac 100644
--- a/rfcs/0099-introduce-PresburgerSet.md
+++ b/rfcs/0099-introduce-PresburgerSet.md
@@ -1,7 +1,7 @@
 - Feature Name: Introduce PresburgerSet
 - Start Date: 2023-02-13
 - RFC PR: https://github.com/apache/tvm-rfcs/pull/99/
-- GitHub Issue: https://github.com/apache/tvm/pull/0000
+- GitHub Issue: https://github.com/apache/tvm/issues/14006
 
 # Summary
 It would be great if TVMScript can grow into a generic programming language in machine learning domain. To reach that, it seems some powerful analysis tools are needed. Integer set is pivotal in IR analyzing, but IntSet in TVM only represents ranges. This RFC is to seek an improvement for it so that we can perform the IR analysis more precisely. We found the Presburger Set in MLIR library could be leveraged for this purpose.
@@ -64,4 +64,4 @@ The `PresburgerSet` serves as the basic infrastructure of IR analysis, and a wid
 # Alternatives
 The other way is to handcraft a copy of code similar to Presburger Set in MLIR, which minimizes the software dependence of TVM project, but it needs considerable effort and seems like reinventing the wheel, if there is no extra new idea to implement.
 # Future possibilities
-One day, when we make sure `PresburgerSet` can fully cover what `IntSet` provides, in terms of functionality and efficiency, we may consider phasing out the legacy IntSet, then no more decisions about `PresburgerSet` and `IntSet`.
\ No newline at end of file
+One day, when we make sure `PresburgerSet` can fully cover what `IntSet` provides, in terms of functionality and efficiency, we may consider phasing out the legacy IntSet, then no more decisions about `PresburgerSet` and `IntSet`.
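
Note on the `if_func` example in the RFC text: the claim that `B` only needs shape `(39,)` can be sanity-checked by enumeration. The sketch below is plain Python rather than TVM code, and `accessed_range` is a hypothetical helper name used only for illustration. A real `PresburgerSet` would presumably derive the bound symbolically (for instance through `solve_bounds`) instead of by brute force.

```python
# Brute-force check of the if_func example: which elements of B does the
# consumer block read through the index expression i + 2*j?
def accessed_range(extent_i=20, extent_j=20, use_condition=True):
    """Return (min, max) of i + 2*j over the iteration domain.

    With use_condition=False the branch condition is ignored, which models
    an interval-only analysis that cannot see the `if`.
    """
    indices = [
        i + 2 * j
        for i in range(extent_i)
        for j in range(extent_j)
        if not use_condition or (i + j < 20 and i - j <= 0)
    ]
    return min(indices), max(indices)


lo, hi = accessed_range()
print(lo, hi, hi - lo + 1)  # prints: 0 38 39

lo_all, hi_all = accessed_range(use_condition=False)
print(lo_all, hi_all, hi_all - lo_all + 1)  # prints: 0 57 58
```

Under these assumptions, the condition-aware region has 39 elements, while ignoring the condition gives 58, which is the kind of gap the RFC argues an if-condition-aware integer set would close.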