[ARITH] Enhance Canonical Simplify for LE #15471

Hzfengsy · 2023-08-03T16:59:10Z

This PR enhances the canonical simplifier to support the following patterns:

`x0 * s0 + x1 * s1 + ... + xn + c < 0`, where `d = gcd(s0, s1, ..., s{n-1}, c)`

1. If we can prove `-d < xn < d`, the expression simplifies to `x0 * (s0/d) + x1 * (s1/d) + ... + x{n-1} * (s{n-1}/d) + c/d < 0`. For example, `x * 8 + y < 16` with `y` in `[0, 8)` simplifies to `x < 2`.
2. If `xn` has the form `yn % m`, where `m % d == 0`, it is converted to `yn // d % (m/d)`. For example, `x1 * 64 + (x2 * 8 + x3) % 64 < 120` with `x3` in `[0, 8)` simplifies to `x1 * 8 + (x2 * 8 + x3) // 8 % 8 < 15`, and then to `x1 * 8 + x2 % 8 < 15`.
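The two worked examples in the description can be sanity-checked by brute force. The sketch below is plain Python and does not call TVM; it relies only on Python's `//` and `%` being floor-based. The function names and the sampled ranges are illustrative assumptions, not part of the PR.

```python
from itertools import product


def check_rule1(xs=range(-16, 17), ys=range(0, 8)):
    """Rule 1 example: x * 8 + y < 16 with y in [0, 8) should equal x < 2."""
    return all((x * 8 + y < 16) == (x < 2) for x, y in product(xs, ys))


def check_rule2(x1s=range(-4, 5), x2s=range(-16, 17), x3s=range(0, 8)):
    """Rule 2 example, checked in the two steps given in the description."""
    for x1, x2, x3 in product(x1s, x2s, x3s):
        # Original predicate, with gcd(64, 120) = 8.
        original = x1 * 64 + (x2 * 8 + x3) % 64 < 120
        # After dividing by 8 and rewriting yn % 64 as yn // 8 % 8.
        divided = x1 * 8 + (x2 * 8 + x3) // 8 % 8 < 15
        # After folding (x2 * 8 + x3) // 8 into x2, valid because 0 <= x3 < 8.
        final = x1 * 8 + x2 % 8 < 15
        if not (original == divided == final):
            return False
    return True
```

Both functions return `True` over the sampled ranges. This only confirms the examples for those samples; it is not a proof of the general rule.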


tvm-bot · 2023-08-03T16:59:14Z

Thanks for contributing to TVM! Please refer to the contributing guidelines https://tvm.apache.org/docs/contribute/ for useful information and tips. Please request code reviews from Reviewers by @-ing them in a comment.

cc @junrushao (see #10317 for details)

Generated by tvm-bot

Hzfengsy · 2023-08-03T17:02:20Z

tests/python/unittest/test_target_codegen_c_host.py

@@ -64,7 +64,8 @@ def test_add_pipeline():
    s[C].pragma(xo1, "parallel_launch_point")
    s[C].pragma(xo2, "parallel_stride_pattern")
    s[C].pragma(xo2, "parallel_barrier_when_finish")
-    s[C].vectorize(xi)
+    # FIXME(tvm-team): vector operators are not supported for codegen to C yet


This change is needed because c_codegen has poor support for vectorization.
Before this PR, TVM failed to simplify the branch predicate and therefore skipped vectorization. This PR improves the arithmetic simplification so that the loop is actually vectorized, which then fails at the codegen stage.

In short:

- It is not related to this PR, since it is a codegen issue.
- It is not a regression: the vectorization step was skipped before this PR.

PR apache#15471 enhanced the simplification for LE but missed a case where the upper bound `kPosInf` is divisible by a factor. Therefore, prior to this PR, simplifying `x * 1024 + y < z * 7168` fails with the error message

```
InternalError: Check failed: value < 1LL << (dtype.bits() - 1) (8589934591 vs. 2147483648) : ValueError: Literal value 8589934591 exceeds maximum of int32
```

This happens because the upper bound 7 here divides `kPosInf`, the maximum value of int64, so the expression unexpectedly passes an "if" condition introduced in apache#15471. This PR fixes the issue.
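The arithmetic behind this failure can be reproduced in a few lines of plain Python. The sketch assumes `kPosInf` equals `2**63 - 1`, as the description states ("the maximum value of int64"). The helpers `fits_int32`, `naive_divisible`, and `guarded_divisible` are hypothetical names used for illustration; they are not TVM functions, and the guard shown is only one way to express the idea, not the actual fix.

```python
K_POS_INF = 2**63 - 1  # assumed sentinel value: "the maximum value of int64"


def fits_int32(value):
    """Mirror of the failed check in the error message: value < 1 << (32 - 1)."""
    return value < (1 << 31)


def naive_divisible(bound, factor):
    """Plain divisibility test that treats the sentinel like a finite bound."""
    return bound % factor == 0


def guarded_divisible(bound, factor):
    """Hypothetical guard: an infinite bound never counts as divisible."""
    return bound != K_POS_INF and bound % factor == 0
```

Since `2**63 - 1` is a multiple of 7, `naive_divisible(K_POS_INF, 7)` is true even though the bound is really "unbounded", while `guarded_divisible(K_POS_INF, 7)` is false. The literal in the error message, 8589934591, equals `2**33 - 1` and does not pass `fits_int32`.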


github-actions bot requested a review from tqchen August 3, 2023 16:59


fix

f7e10e1

Hzfengsy mentioned this pull request Aug 4, 2023

[Unity][DLight] Use less shared memory for gemv #15482

Merged

update log

01c7e9e

tqchen approved these changes Aug 8, 2023


tqchen merged commit 8cadd1f into apache:main Aug 8, 2023

ysh329 mentioned this pull request Oct 18, 2023

[Release] v0.14.0 Release Candidate Notes #15948

Closed

Hzfengsy deleted the simplify_arith branch November 5, 2023 09:39

MasterJH5574 mentioned this pull request Mar 12, 2024

[Fix][Arith] Fix canonical simplification of LE #16704

Merged
