[ValueTracking] Extend LHS/RHS with matching operand to work without constants. #85557

goldsteinn · 2024-03-16T23:41:34Z

Previously we only handled the L0 == R0 case if both L1 and R1
were constants.

We can get more out of the analysis using general constant ranges
instead.

For example, X u> Y implies X != 0.

In general, any strict comparison on X implies that X is not equal
to the boundary value for that signedness (for example, X s< Y implies
X != SMAX), and both signed and unsigned constant ranges can be useful
in deducing implications.
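
The boundary-value claim above can be checked exhaustively at a small bit width. The following is a minimal sketch, assuming 8-bit values as a stand-in for arbitrary integer types; it uses no LLVM API, and the helper name `strictCompareExcludesBoundary` is made up for illustration.

```cpp
#include <cstdint>

// Hypothetical helper (not part of LLVM): exhaustively checks, over all pairs
// of 8-bit values, that a strict comparison on X rules out the matching
// boundary value for X.
bool strictCompareExcludesBoundary() {
  for (int XI = 0; XI < 256; ++XI) {
    for (int YI = 0; YI < 256; ++YI) {
      uint8_t UX = static_cast<uint8_t>(XI), UY = static_cast<uint8_t>(YI);
      int8_t SX = static_cast<int8_t>(UX), SY = static_cast<int8_t>(UY);
      if (UX > UY && UX == 0)
        return false; // would contradict: X u> Y implies X != 0
      if (UX < UY && UX == UINT8_MAX)
        return false; // would contradict: X u< Y implies X != UMAX
      if (SX > SY && SX == INT8_MIN)
        return false; // would contradict: X s> Y implies X != SMIN
      if (SX < SY && SX == INT8_MAX)
        return false; // would contradict: X s< Y implies X != SMAX
    }
  }
  return true;
}
```

Calling `strictCompareExcludesBoundary()` from a test driver returns true, since no counterexample exists at this width; the same argument carries over to wider types.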

llvmbot · 2024-03-16T23:42:06Z

@llvm/pr-subscribers-llvm-transforms

Author: None (goldsteinn)

Changes

Previously we only handled the L0 == R0 case if both L1 and R1
were constants.

We can get more out of the analysis using general constant ranges
instead.

For example, X u> Y implies X != 0.

In general, any strict comparison on X implies that X is not equal
to the boundary value for the sign and constant ranges with/without
sign bits can be useful in deducing implications.

Full diff: https://github.com/llvm/llvm-project/pull/85557.diff

7 Files Affected:

(modified) llvm/lib/Analysis/ValueTracking.cpp (+44-17)
(modified) llvm/test/Transforms/InstCombine/assume.ll (+1-1)
(modified) llvm/test/Transforms/InstCombine/range-check.ll (+2-10)
(modified) llvm/test/Transforms/InstCombine/select.ll (+2-4)
(modified) llvm/test/Transforms/InstCombine/shift.ll (+3-4)
(modified) llvm/test/Transforms/LoopVectorize/X86/pr23997.ll (+3-4)
(modified) llvm/test/Transforms/NewGVN/pr35125.ll (+3-6)

diff --git a/llvm/lib/Analysis/ValueTracking.cpp b/llvm/lib/Analysis/ValueTracking.cpp
index edbeede910d7f7..b2846274a36e73 100644
--- a/llvm/lib/Analysis/ValueTracking.cpp
+++ b/llvm/lib/Analysis/ValueTracking.cpp
@@ -8499,20 +8499,20 @@ isImpliedCondMatchingOperands(CmpInst::Predicate LPred,
   return std::nullopt;
 }
 
-/// Return true if "icmp LPred X, LC" implies "icmp RPred X, RC" is true.
-/// Return false if "icmp LPred X, LC" implies "icmp RPred X, RC" is false.
+/// Return true if "icmp LPred X, LCR" implies "icmp RPred X, RCR" is true.
+/// Return false if "icmp LPred X, LCR" implies "icmp RPred X, RCR" is false.
 /// Otherwise, return std::nullopt if we can't infer anything.
-static std::optional<bool> isImpliedCondCommonOperandWithConstants(
-    CmpInst::Predicate LPred, const APInt &LC, CmpInst::Predicate RPred,
-    const APInt &RC) {
-  ConstantRange DomCR = ConstantRange::makeExactICmpRegion(LPred, LC);
-  ConstantRange CR = ConstantRange::makeExactICmpRegion(RPred, RC);
-  ConstantRange Intersection = DomCR.intersectWith(CR);
-  ConstantRange Difference = DomCR.difference(CR);
-  if (Intersection.isEmptySet())
-    return false;
-  if (Difference.isEmptySet())
+static std::optional<bool>
+isImpliedCondCommonOperandWithCR(CmpInst::Predicate LPred, ConstantRange LCR,
+                                 CmpInst::Predicate RPred, ConstantRange RCR) {
+  ConstantRange DomCR = ConstantRange::makeAllowedICmpRegion(LPred, LCR);
+  // If all true values for lhs and true for rhs, lhs implies rhs
+  if (DomCR.icmp(RPred, RCR))
     return true;
+
+  // If there is no overlap, lhs implies not rhs
+  if (DomCR.icmp(CmpInst::getInversePredicate(RPred), RCR))
+    return false;
   return std::nullopt;
 }
 
@@ -8532,11 +8532,38 @@ static std::optional<bool> isImpliedCondICmps(const ICmpInst *LHS,
   CmpInst::Predicate LPred =
       LHSIsTrue ? LHS->getPredicate() : LHS->getInversePredicate();
 
-  // Can we infer anything when the 0-operands match and the 1-operands are
-  // constants (not necessarily matching)?
-  const APInt *LC, *RC;
-  if (L0 == R0 && match(L1, m_APInt(LC)) && match(R1, m_APInt(RC)))
-    return isImpliedCondCommonOperandWithConstants(LPred, *LC, RPred, *RC);
+  if (L0 == R1) {
+    std::swap(R0, R1);
+    RPred = ICmpInst::getSwappedPredicate(RPred);
+  }
+  if (L1 == R0) {
+    std::swap(L0, L1);
+    LPred = ICmpInst::getSwappedPredicate(LPred);
+  }
+
+  // See if we can infer anything if operand-0 matches and we have at least one
+  // constant operand-1.
+  if (L0 == R0 && L0->getType()->isIntOrIntVectorTy()) {
+    // Potential TODO: We could also further use the constant range of L0/R0 to
+    // further constraint the constant ranges. At the moment this leads to
+    // several regressions related to not transforming `multi_use(A + C0) eq/ne
+    // C1` (see discussion: D58633).
+    ConstantRange LCR = computeConstantRange(
+        L1, ICmpInst::isSigned(LPred), /* UseInstrInfo=*/true, /*AC=*/nullptr,
+        /*CxtI=*/nullptr, /*DT=*/nullptr, Depth);
+    ConstantRange RCR = computeConstantRange(
+        R1, ICmpInst::isSigned(RPred), /* UseInstrInfo=*/true, /*AC=*/nullptr,
+        /*CxtI=*/nullptr, /*DT=*/nullptr, Depth);
+    // Even if L1/R1 are not both constant, we can still sometimes deduce
+    // relationship from a single constant. For example X u> Y implies X != 0.
+    if (auto R = isImpliedCondCommonOperandWithCR(LPred, LCR, RPred, RCR))
+      return R;
+    // If both L1/R1 where exact constant ranges and we didn't get anything
+    // here, we won't be able to deduce this.
+    const APInt *Unused;
+    if (match(L1, m_APInt(Unused)) && match(R1, m_APInt(Unused)))
+      return std::nullopt;
+  }
 
   // Can we infer anything when the two compares have matching operands?
   bool AreSwappedOps;
diff --git a/llvm/test/Transforms/InstCombine/assume.ll b/llvm/test/Transforms/InstCombine/assume.ll
index 927f0a86b0a252..87c75fb2b55592 100644
--- a/llvm/test/Transforms/InstCombine/assume.ll
+++ b/llvm/test/Transforms/InstCombine/assume.ll
@@ -386,7 +386,7 @@ define i1 @nonnull5(ptr %a) {
 define i32 @assumption_conflicts_with_known_bits(i32 %a, i32 %b) {
 ; CHECK-LABEL: @assumption_conflicts_with_known_bits(
 ; CHECK-NEXT:    store i1 true, ptr poison, align 1
-; CHECK-NEXT:    ret i32 1
+; CHECK-NEXT:    ret i32 poison
 ;
   %and1 = and i32 %b, 3
   %B1 = lshr i32 %and1, %and1
diff --git a/llvm/test/Transforms/InstCombine/range-check.ll b/llvm/test/Transforms/InstCombine/range-check.ll
index 0d138b6ba7e79d..210e57c1d1fe4c 100644
--- a/llvm/test/Transforms/InstCombine/range-check.ll
+++ b/llvm/test/Transforms/InstCombine/range-check.ll
@@ -340,11 +340,7 @@ define i1 @negative4_logical(i32 %x, i32 %n) {
 
 define i1 @negative5(i32 %x, i32 %n) {
 ; CHECK-LABEL: @negative5(
-; CHECK-NEXT:    [[NN:%.*]] = and i32 [[N:%.*]], 2147483647
-; CHECK-NEXT:    [[A:%.*]] = icmp sgt i32 [[NN]], [[X:%.*]]
-; CHECK-NEXT:    [[B:%.*]] = icmp sgt i32 [[X]], -1
-; CHECK-NEXT:    [[C:%.*]] = or i1 [[A]], [[B]]
-; CHECK-NEXT:    ret i1 [[C]]
+; CHECK-NEXT:    ret i1 true
 ;
   %nn = and i32 %n, 2147483647
   %a = icmp slt i32 %x, %nn
@@ -355,11 +351,7 @@ define i1 @negative5(i32 %x, i32 %n) {
 
 define i1 @negative5_logical(i32 %x, i32 %n) {
 ; CHECK-LABEL: @negative5_logical(
-; CHECK-NEXT:    [[NN:%.*]] = and i32 [[N:%.*]], 2147483647
-; CHECK-NEXT:    [[A:%.*]] = icmp sgt i32 [[NN]], [[X:%.*]]
-; CHECK-NEXT:    [[B:%.*]] = icmp sgt i32 [[X]], -1
-; CHECK-NEXT:    [[C:%.*]] = or i1 [[A]], [[B]]
-; CHECK-NEXT:    ret i1 [[C]]
+; CHECK-NEXT:    ret i1 true
 ;
   %nn = and i32 %n, 2147483647
   %a = icmp slt i32 %x, %nn
diff --git a/llvm/test/Transforms/InstCombine/select.ll b/llvm/test/Transforms/InstCombine/select.ll
index a84904106eced4..d9734242a86891 100644
--- a/llvm/test/Transforms/InstCombine/select.ll
+++ b/llvm/test/Transforms/InstCombine/select.ll
@@ -2925,10 +2925,8 @@ define i8 @select_replacement_loop3(i32 noundef %x) {
 
 define i16 @select_replacement_loop4(i16 noundef %p_12) {
 ; CHECK-LABEL: @select_replacement_loop4(
-; CHECK-NEXT:    [[CMP1:%.*]] = icmp ult i16 [[P_12:%.*]], 2
-; CHECK-NEXT:    [[AND1:%.*]] = and i16 [[P_12]], 1
-; CHECK-NEXT:    [[AND2:%.*]] = select i1 [[CMP1]], i16 [[AND1]], i16 0
-; CHECK-NEXT:    [[CMP2:%.*]] = icmp eq i16 [[AND2]], [[P_12]]
+; CHECK-NEXT:    [[AND1:%.*]] = and i16 [[P_12:%.*]], 1
+; CHECK-NEXT:    [[CMP2:%.*]] = icmp ult i16 [[P_12]], 2
 ; CHECK-NEXT:    [[AND3:%.*]] = select i1 [[CMP2]], i16 [[AND1]], i16 0
 ; CHECK-NEXT:    ret i16 [[AND3]]
 ;
diff --git a/llvm/test/Transforms/InstCombine/shift.ll b/llvm/test/Transforms/InstCombine/shift.ll
index 62f32c28683711..bef7fc81a7d1f9 100644
--- a/llvm/test/Transforms/InstCombine/shift.ll
+++ b/llvm/test/Transforms/InstCombine/shift.ll
@@ -1751,12 +1751,11 @@ define void @ashr_out_of_range_1(ptr %A) {
 ; CHECK-NEXT:    [[L:%.*]] = load i177, ptr [[A:%.*]], align 4
 ; CHECK-NEXT:    [[L_FROZEN:%.*]] = freeze i177 [[L]]
 ; CHECK-NEXT:    [[TMP1:%.*]] = icmp eq i177 [[L_FROZEN]], -1
-; CHECK-NEXT:    [[B:%.*]] = select i1 [[TMP1]], i177 0, i177 [[L_FROZEN]]
-; CHECK-NEXT:    [[TMP2:%.*]] = trunc i177 [[B]] to i64
+; CHECK-NEXT:    [[TMP6:%.*]] = trunc i177 [[L_FROZEN]] to i64
+; CHECK-NEXT:    [[TMP2:%.*]] = select i1 [[TMP1]], i64 0, i64 [[TMP6]]
 ; CHECK-NEXT:    [[TMP3:%.*]] = getelementptr i177, ptr [[A]], i64 [[TMP2]]
 ; CHECK-NEXT:    [[G11:%.*]] = getelementptr i8, ptr [[TMP3]], i64 -24
-; CHECK-NEXT:    [[C17:%.*]] = icmp sgt i177 [[B]], [[L_FROZEN]]
-; CHECK-NEXT:    [[TMP4:%.*]] = sext i1 [[C17]] to i64
+; CHECK-NEXT:    [[TMP4:%.*]] = sext i1 [[TMP1]] to i64
 ; CHECK-NEXT:    [[G62:%.*]] = getelementptr i177, ptr [[G11]], i64 [[TMP4]]
 ; CHECK-NEXT:    [[TMP5:%.*]] = icmp eq i177 [[L_FROZEN]], -1
 ; CHECK-NEXT:    [[B28:%.*]] = select i1 [[TMP5]], i177 0, i177 [[L_FROZEN]]
diff --git a/llvm/test/Transforms/LoopVectorize/X86/pr23997.ll b/llvm/test/Transforms/LoopVectorize/X86/pr23997.ll
index 0b16d80a4adbc5..5aeac1101fe223 100644
--- a/llvm/test/Transforms/LoopVectorize/X86/pr23997.ll
+++ b/llvm/test/Transforms/LoopVectorize/X86/pr23997.ll
@@ -12,8 +12,7 @@ define void @foo(ptr addrspace(1) align 8 dereferenceable_or_null(16), ptr addrs
 ; CHECK:       preheader:
 ; CHECK-NEXT:    [[DOT10:%.*]] = getelementptr inbounds i8, ptr addrspace(1) [[TMP0:%.*]], i64 16
 ; CHECK-NEXT:    [[DOT12:%.*]] = getelementptr inbounds i8, ptr addrspace(1) [[TMP1:%.*]], i64 16
-; CHECK-NEXT:    [[UMAX2:%.*]] = call i64 @llvm.umax.i64(i64 [[TMP2:%.*]], i64 1)
-; CHECK-NEXT:    [[MIN_ITERS_CHECK:%.*]] = icmp ult i64 [[TMP2]], 16
+; CHECK-NEXT:    [[MIN_ITERS_CHECK:%.*]] = icmp ult i64 [[TMP2:%.*]], 16
 ; CHECK-NEXT:    br i1 [[MIN_ITERS_CHECK]], label [[SCALAR_PH:%.*]], label [[VECTOR_MEMCHECK:%.*]]
 ; CHECK:       vector.memcheck:
 ; CHECK-NEXT:    [[TMP3:%.*]] = shl i64 [[TMP2]], 3
@@ -25,7 +24,7 @@ define void @foo(ptr addrspace(1) align 8 dereferenceable_or_null(16), ptr addrs
 ; CHECK-NEXT:    [[FOUND_CONFLICT:%.*]] = and i1 [[BOUND0]], [[BOUND1]]
 ; CHECK-NEXT:    br i1 [[FOUND_CONFLICT]], label [[SCALAR_PH]], label [[VECTOR_PH:%.*]]
 ; CHECK:       vector.ph:
-; CHECK-NEXT:    [[N_VEC:%.*]] = and i64 [[UMAX2]], -16
+; CHECK-NEXT:    [[N_VEC:%.*]] = and i64 [[TMP2]], -16
 ; CHECK-NEXT:    br label [[VECTOR_BODY:%.*]]
 ; CHECK:       vector.body:
 ; CHECK-NEXT:    [[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[VECTOR_BODY]] ]
@@ -49,7 +48,7 @@ define void @foo(ptr addrspace(1) align 8 dereferenceable_or_null(16), ptr addrs
 ; CHECK-NEXT:    [[TMP13:%.*]] = icmp eq i64 [[INDEX_NEXT]], [[N_VEC]]
 ; CHECK-NEXT:    br i1 [[TMP13]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP5:![0-9]+]]
 ; CHECK:       middle.block:
-; CHECK-NEXT:    [[CMP_N:%.*]] = icmp eq i64 [[UMAX2]], [[N_VEC]]
+; CHECK-NEXT:    [[CMP_N:%.*]] = icmp eq i64 [[N_VEC]], [[TMP2]]
 ; CHECK-NEXT:    br i1 [[CMP_N]], label [[LOOPEXIT:%.*]], label [[SCALAR_PH]]
 ; CHECK:       scalar.ph:
 ; CHECK-NEXT:    [[BC_RESUME_VAL:%.*]] = phi i64 [ [[N_VEC]], [[MIDDLE_BLOCK]] ], [ 0, [[PREHEADER]] ], [ 0, [[VECTOR_MEMCHECK]] ]
diff --git a/llvm/test/Transforms/NewGVN/pr35125.ll b/llvm/test/Transforms/NewGVN/pr35125.ll
index 9a96594e3446db..6724538a5a7f29 100644
--- a/llvm/test/Transforms/NewGVN/pr35125.ll
+++ b/llvm/test/Transforms/NewGVN/pr35125.ll
@@ -18,15 +18,12 @@ define i32 @main() #0 {
 ; CHECK-NEXT:    [[CMP2:%.*]] = icmp ult i32 [[STOREMERGE]], [[PHIOFOPS]]
 ; CHECK-NEXT:    br i1 [[CMP2]], label [[IF_THEN3:%.*]], label [[IF_END6:%.*]]
 ; CHECK:       if.then3:
-; CHECK-NEXT:    [[TOBOOL:%.*]] = icmp eq i32 [[STOREMERGE]], -1
-; CHECK-NEXT:    br i1 [[TOBOOL]], label [[LOR_RHS:%.*]], label [[LOR_END:%.*]]
+; CHECK-NEXT:    br i1 false, label [[LOR_RHS:%.*]], label [[LOR_END:%.*]]
 ; CHECK:       lor.rhs:
-; CHECK-NEXT:    [[TOBOOL5:%.*]] = icmp ne i32 [[TMP0]], 0
-; CHECK-NEXT:    [[PHITMP:%.*]] = zext i1 [[TOBOOL5]] to i32
+; CHECK-NEXT:    store i8 poison, ptr null, align 1
 ; CHECK-NEXT:    br label [[LOR_END]]
 ; CHECK:       lor.end:
-; CHECK-NEXT:    [[TMP1:%.*]] = phi i32 [ 1, [[IF_THEN3]] ], [ [[PHITMP]], [[LOR_RHS]] ]
-; CHECK-NEXT:    store i32 [[TMP1]], ptr @a, align 4
+; CHECK-NEXT:    store i32 1, ptr @a, align 4
 ; CHECK-NEXT:    br label [[IF_END6]]
 ; CHECK:       if.end6:
 ; CHECK-NEXT:    [[TMP2:%.*]] = load i32, ptr @a, align 4

llvmbot · 2024-03-16T23:42:07Z

@llvm/pr-subscribers-llvm-analysis

Author: None (goldsteinn)


goldsteinn · 2024-03-16T23:43:53Z

Small but measurable compile-time impact: https://llvm-compile-time-tracker.com/compare.php?from=e77378cc14ec712942452aca155addacbe904c8f&to=4fe7d1d86ee416afa476594b4e5d38f342f76498&stat=instructions%3Au

An alternative that requires at least one of L1/R1 to be constant has a lower compile-time regression but misses a few of the folds:
https://llvm-compile-time-tracker.com/compare.php?from=e77378cc14ec712942452aca155addacbe904c8f&to=ca42cfefc5d89d915e42a3bfd55d6c1993019a6a&stat=instructions:u

nikic · 2024-03-17T14:36:14Z

@dtcxzyw Can you please test this?


dtcxzyw · 2024-03-17T14:52:13Z

I am not sure whether it is suitable to use computeConstantRange in isImpliedCond.

See also nikic's comment: #69840 (comment)

nikic · 2024-03-17T15:53:15Z

Yeah, I can't say I'm particularly fond of the direction.

From the test diffs, a fold we're missing is this: https://alive2.llvm.org/ce/z/-njJr8 Particularly profitable if the new comparison folds later, but also seems generally beneficial.

goldsteinn · 2024-03-17T15:58:05Z

> I am not sure whether it is suitable to use computeConstantRange in isImpliedCond.
>
> See also nikic's comment: #69840 (comment)

So originally my goal was just to handle cases like X u> Y implies X != 0.

My first attempt was a bespoke set of switch statements, but I pretty quickly saw that just changing to CR would be an easier and less bug-prone way to implement that. At that point I wasn't using computeConstantRange, just getFull if the argument wasn't an APInt.

Then I figured that, if there is no big compile-time impact from using computeConstantRange, that's purely more information, so here we are.

My point is that I think each step seems like a reasonable improvement on the alternative. If the compile-time impact is a proper concern, I think it may make sense to constrain computeConstantRange (maybe pass MaxDepth - 1), but other than that, I'm not sure what the rationale against it is.
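
To make the three-valued result (true / false / unknown) concrete, here is a minimal sketch of the range-based implication check, assuming a toy model over 8-bit unsigned values in which a plain set of values stands in for a constant range. The names `ValueSet`, `allowedRegion`, and `impliedByRanges` are hypothetical and are not LLVM APIs; the structure only mirrors the shape of the patch (compute the region allowed for X by the left compare, then test the right compare against it).

```cpp
#include <bitset>
#include <cstdint>
#include <optional>

// Toy model over 8-bit unsigned values. A ValueSet stands in for a constant
// range; all names here are hypothetical and are not LLVM APIs.
using ValueSet = std::bitset<256>;
enum class Pred { EQ, NE, UGT, UGE, ULT, ULE };

bool eval(Pred P, unsigned A, unsigned B) {
  switch (P) {
  case Pred::EQ:  return A == B;
  case Pred::NE:  return A != B;
  case Pred::UGT: return A > B;
  case Pred::UGE: return A >= B;
  case Pred::ULT: return A < B;
  case Pred::ULE: return A <= B;
  }
  return false;
}

ValueSet fullSet() { return ValueSet().set(); }                 // operand unknown
ValueSet singleValue(unsigned V) { return ValueSet().set(V); }  // operand is a constant

// Every X for which "X P V" can be true for at least one V in Other.
ValueSet allowedRegion(Pred P, const ValueSet &Other) {
  ValueSet Region;
  for (unsigned X = 0; X < 256; ++X)
    for (unsigned V = 0; V < 256; ++V)
      if (Other[V] && eval(P, X, V))
        Region.set(X);
  return Region;
}

// true:    "X LPred L" implies "X RPred R" for all L in LSet, R in RSet.
// false:   "X LPred L" implies "X RPred R" is false.
// nullopt: nothing can be inferred.
std::optional<bool> impliedByRanges(Pred LPred, const ValueSet &LSet,
                                    Pred RPred, const ValueSet &RSet) {
  ValueSet Dom = allowedRegion(LPred, LSet);
  bool AlwaysTrue = true, AlwaysFalse = true;
  for (unsigned X = 0; X < 256; ++X) {
    if (!Dom[X])
      continue;
    for (unsigned V = 0; V < 256; ++V) {
      if (!RSet[V])
        continue;
      if (eval(RPred, X, V))
        AlwaysFalse = false;
      else
        AlwaysTrue = false;
    }
  }
  if (AlwaysTrue)
    return true;
  if (AlwaysFalse)
    return false;
  return std::nullopt;
}
```

With Y unknown, `impliedByRanges(Pred::UGT, fullSet(), Pred::NE, singleValue(0))` yields true in this model, which is the "X u> Y implies X != 0" case. Passing `fullSet()` for a non-constant operand corresponds to the getFull fallback described above; a tighter set plays the role of a computed range.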

goldsteinn · 2024-03-17T16:21:08Z

> Yeah, I can't say I'm particularly fond of the direction.
>
> From the test diffs, a fold we're missing is this: https://alive2.llvm.org/ce/z/-njJr8 Particularly profitable if the new comparison folds later, but also seems generally beneficial.

I would think it's more principled to handle the implication between conditions more generally rather than add a transform for each possible case.

nikic · 2024-03-17T16:30:40Z

> > Yeah, I can't say I'm particularly fond of the direction.
> > From the test diffs, a fold we're missing is this: https://alive2.llvm.org/ce/z/-njJr8 Particularly profitable if the new comparison folds later, but also seems generally beneficial.
>
> I would think it's more principled to handle the implication between conditions more generally rather than add a transform for each possible case.

This is still an implied condition based transform. In fact, now that I look for it, it seems like foldSelectICmp() should be doing exactly that. I think maybe it doesn't trigger because the condition becomes icmp eq 2, %x and we need to handle non-canonical icmp order in isImpliedCond.
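
The non-canonical operand order mentioned here (a compare such as `icmp eq 2, %x`, with the shared value in operand 1) can be normalized by swapping the operands together with the predicate. Below is a minimal sketch of that idea, assuming a toy `Cmp` struct with opaque integer operand ids; `swappedPred` and `withOperandFirst` are hypothetical names, not LLVM functions.

```cpp
#include <cstdint>
#include <utility>

enum class Pred { EQ, NE, UGT, UGE, ULT, ULE };

// Predicate that gives the same result once the two operands are exchanged.
Pred swappedPred(Pred P) {
  switch (P) {
  case Pred::UGT: return Pred::ULT;
  case Pred::UGE: return Pred::ULE;
  case Pred::ULT: return Pred::UGT;
  case Pred::ULE: return Pred::UGE;
  default:        return P; // EQ and NE are symmetric.
  }
}

bool eval(Pred P, unsigned A, unsigned B) {
  switch (P) {
  case Pred::EQ:  return A == B;
  case Pred::NE:  return A != B;
  case Pred::UGT: return A > B;
  case Pred::UGE: return A >= B;
  case Pred::ULT: return A < B;
  case Pred::ULE: return A <= B;
  }
  return false;
}

// Hypothetical stand-in for an icmp; operands are opaque ids.
struct Cmp {
  Pred P;
  int Op0;
  int Op1;
};

// If Common is operand 1 of C, rewrite C so that Common becomes operand 0.
Cmp withOperandFirst(Cmp C, int Common) {
  if (C.Op1 == Common && C.Op0 != Common) {
    std::swap(C.Op0, C.Op1);
    C.P = swappedPred(C.P);
  }
  return C;
}

// Exhaustive 8-bit check that swapping operands and predicate together
// preserves the result of the comparison.
bool swapPreservesMeaning() {
  const Pred All[] = {Pred::EQ, Pred::NE, Pred::UGT,
                      Pred::UGE, Pred::ULT, Pred::ULE};
  for (Pred P : All)
    for (unsigned A = 0; A < 256; ++A)
      for (unsigned B = 0; B < 256; ++B)
        if (eval(P, A, B) != eval(swappedPred(P), B, A))
          return false;
  return true;
}
```

For an equality compare the predicate is unchanged by the swap, so only the operands move; for ordered predicates the direction flips, which is why the two must be updated together.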

(This is not intended as a full replacement for what you want here, just something I noticed in the tests.)

nikic · 2024-03-17T16:36:25Z

Ah, I think the reason I saw it in the diffs is that this patch adds the non-canonical order handling -- the improvement is not actually related to the constant range support at all. Could you please split this out into a separate patch?

goldsteinn · 2024-03-17T16:50:25Z

> Ah, I think the reason I saw it in the diffs is that this patch adds the non-canonical order handling -- the improvement is not actually related to the constant range support at all. Could you please split this out into a separate patch?

I didn't realize L0 / R0 could be constants. I'll split that out. Separate patch or commit?

nikic · 2024-03-17T17:08:41Z

Separate patch please.

goldsteinn · 2024-03-17T17:16:26Z

> Separate patch please.

Done, see: #85575

goldsteinn · 2024-03-17T22:56:43Z

Rebased


dtcxzyw · 2024-03-18T05:56:43Z

This patch seems to block SROA: dtcxzyw/llvm-opt-benchmark#419 (comment).

goldsteinn · 2024-03-19T18:45:27Z

> This patch seems to block SROA: dtcxzyw/llvm-opt-benchmark#419 (comment).

Seems to boil down to simplifications happening earlier.

A reduced form:


define void @fun0() {
entry:
  %first111 = alloca [0 x [0 x [0 x ptr]]], i32 0, align 8
  store i64 0, ptr %first111, align 8
  %last = getelementptr i8, ptr %first111, i64 8
  call void @fun3(ptr %first111, ptr %last)
  ret void
}

define void @fun3(ptr %first, ptr %last, ptr %p_in) {
entry:
  %sub.ptr.lhs.cast = ptrtoint ptr %last to i64
  %sub.ptr.rhs.cast = ptrtoint ptr %first to i64
  %sub.ptr.sub = sub i64 %sub.ptr.lhs.cast, %sub.ptr.rhs.cast
  %call = ashr exact i64 %sub.ptr.sub, 3
  %call2 = load volatile i64, ptr %p_in, align 8
  %cmp = icmp ugt i64 %call, %call2
  br i1 %cmp, label %common.ret, label %if.else

common.ret:                                       ; preds = %if.else29, %if.else, %entry
  ret void

if.else:                                          ; preds = %entry
  %c_load.cast0.i = ptrtoint ptr %last to i64
  %c_load.cast.div0.i = ashr exact i64 %c_load.cast0.i, 3
  %cmp24.not = icmp ult i64 %c_load.cast.div0.i, %call
  br i1 %cmp24.not, label %if.else29, label %common.ret

if.else29:                                        ; preds = %if.else
  %n_is_c = call i1 @llvm.is.constant.i64(i64 %c_load.cast.div0.i)
  %cmp2 = icmp eq i64 %c_load.cast.div0.i, -1
  %or.cond1 = and i1 %n_is_c, %cmp2
  %add.ptr = getelementptr i64, ptr %first, i64 %c_load.cast.div0.i
  %.pre = ptrtoint ptr %add.ptr to i64
  %ptr.lhs.pre-phi = select i1 %or.cond1, i64 0, i64 %.pre
  %ptr.sub = sub i64 %ptr.lhs.pre-phi, %sub.ptr.rhs.cast
  call void @llvm.memmove.p0.p0.i64(ptr null, ptr %first, i64 %ptr.sub, i1 false)
  br label %common.ret
}

; Function Attrs: nocallback nofree nounwind willreturn memory(argmem: readwrite)
declare void @llvm.memmove.p0.p0.i64(ptr nocapture writeonly, ptr nocapture readonly, i64, i1 immarg) #0

; Function Attrs: convergent nocallback nofree nosync nounwind willreturn memory(none)
declare i1 @llvm.is.constant.i64(i64) #1

attributes #0 = { nocallback nofree nounwind willreturn memory(argmem: readwrite) }
attributes #1 = { convergent nocallback nofree nosync nounwind willreturn memory(none) }

Where we go awry is when we fold:

  %c_load.cast.div0.i = ashr exact i64 %c_load.cast0.i, 3
  %cmp24.not = icmp ult i64 %c_load.cast.div0.i, %call

->

  %cmp24.not = icmp ugt i64 %sub.ptr.sub, %c_load.cast0.i

Which eventually results in the following diff after inlining:

  %c_load.cast.div0.i.i = ashr exact i64 %c_load.cast0.i.i, 3
  %cmp24.not.i = icmp ult i64 %c_load.cast.div0.i.i, 1

  %cmp24.not.i = icmp ugt i64 8, %c_load.cast0.i.i

Then finally:

  %cmp24.not.i = icmp eq ptr %c_load0.i.i, null

vs

  %cmp24.not.i = icmp ult ptr %c_load0.i.i, inttoptr (i64 8 to ptr)

Essentially we throw away the information that the low 3 bits of the pointer are zero
before we have enough information to fully reduce the compare to something easy to
analyze.

Looking into a fix...

Edit: Similar to last time, there is no point where it seems we make a "bad decision",
it's just that the order in which we make good decisions varies.
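
The information loss described above can be checked by brute force. The following is a minimal sketch, not LLVM code: it models the pointer as a 16-bit integer (an assumption made only to keep the loop small), and the helper names `shiftedCompare`, `foldedCompare`, and `equivalentToIsNull` are hypothetical. It shows that both forms of the compare agree with `p == 0` only while the "low 3 bits are zero" fact is still available.

```cpp
#include <cstdint>

// Original form: (p >> 3) u< 1. In the IR the shift is `ashr exact`,
// which is what carries the "low 3 bits are zero" fact.
bool shiftedCompare(uint16_t p) { return (p >> 3) < 1; }

// Form after the shift has been folded into the constant: p u< 8.
bool foldedCompare(uint16_t p) { return p < 8; }

// Returns true if `cmp(p)` agrees with `p == 0` for every 16-bit value,
// or only for values with the low 3 bits clear when `alignedOnly` is set.
bool equivalentToIsNull(bool (*cmp)(uint16_t), bool alignedOnly) {
  for (uint32_t v = 0; v <= 0xFFFF; ++v) {
    if (alignedOnly && (v & 7) != 0)
      continue; // skip values the `exact` flag would rule out
    uint16_t p = static_cast<uint16_t>(v);
    if (cmp(p) != (p == 0))
      return false;
  }
  return true;
}
```

With `alignedOnly` set, both compares are equivalent to a null check; without it, neither is, which is why `icmp ult ptr %c_load0.i.i, inttoptr (i64 8 to ptr)` cannot be reduced once the `exact` flag is gone.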

goldsteinn · 2024-03-19T19:03:32Z

> This patch seems to block SROA: dtcxzyw/llvm-opt-benchmark#419 (comment).

Seems to boil down to simplifications happening earlier.

The only real thing I can think of is to preserve the information w/ assumes
when we fold a single-use shr exact. But that requires the use to have
a noundef, because violating an assume is immediate UB:
https://alive2.llvm.org/ce/z/AkRPAs

I don't think that's really a great path to go down, although some
method of ensuring we don't throw away information when
folding would be nice.

goldsteinn · 2024-03-21T00:37:52Z

@nikic, is the idea of this patch okay? Or are you strongly opposed to using computeConstantRange here?
I'll rework if that's the case (I want to at least handle basic stuff like X u> Y -> X != 0, which we don't need computeConstantRange for).

dtcxzyw · 2024-03-22T13:19:33Z

> The only real thing I can think of is to preserve the information w/ assumes
> when we fold a single-use shr exact.

We met the same issue in dtcxzyw/llvm-opt-benchmark#49 (comment).

goldsteinn · 2024-03-22T15:23:04Z

> > The only real thing I can think of is to preserve the information w/ assumes
> > when we fold a single-use shr exact.
>
> We met the same issue in dtcxzyw/llvm-opt-benchmark#49 (comment).

Yeah, I think it's a fairly common issue. And it seems you guys
reached the same conclusions.

goldsteinn · 2024-04-10T17:28:41Z

Rebased, limited CR analysis to MaxRecursiveDepth - 1. My feeling is this is mostly to capture relationships like X u> Y implies X != 0. Using CR is IMO the easiest way to avoid implementing bespoke logic here.
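
The kind of relationship mentioned here (a strict comparison on X rules out the boundary value for its signedness) can be verified exhaustively at a small bit width. This is an illustrative sketch only: the 8-bit width and the names `Pred`, `holds`, and `strictCmpExcludes` are assumptions for the example and are not part of the patch.

```cpp
#include <cstdint>

enum class Pred { UGT, ULT, SGT, SLT };

// Evaluates `x p y` on 8-bit values, reinterpreting them as signed
// for the signed predicates.
bool holds(Pred p, uint8_t x, uint8_t y) {
  switch (p) {
  case Pred::UGT: return x > y;
  case Pred::ULT: return x < y;
  case Pred::SGT: return static_cast<int8_t>(x) > static_cast<int8_t>(y);
  case Pred::SLT: return static_cast<int8_t>(x) < static_cast<int8_t>(y);
  }
  return false;
}

// Returns true if `boundary p y` is false for every 8-bit y, i.e.
// `x p y` implies `x != boundary` no matter what y is.
bool strictCmpExcludes(Pred p, uint8_t boundary) {
  for (unsigned y = 0; y < 256; ++y)
    if (holds(p, boundary, static_cast<uint8_t>(y)))
      return false;
  return true;
}
```

For example, `strictCmpExcludes(Pred::UGT, 0)` corresponds to "X u> Y implies X != 0", and `strictCmpExcludes(Pred::SLT, 0x7F)` to "X s< Y implies X != INT8_MAX".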

nikic · 2024-04-11T09:27:27Z

> Rebased, limited CR analysis to MaxRecursiveDepth - 1. My feeling is this is mostly to capture relationships like X u> Y implies X != 0. Using CR is IMO the easiest way to avoid implementing bespoke logic here.

Using MaxRecursiveDepth - 1 is not particularly useful in this context, because computeConstantRange() is already essentially non-recursive.

This change would be ok if it were free, but it isn't (https://llvm-compile-time-tracker.com/compare.php?from=7d60232b38b66138dae1b31027d73ee5b9df5c58&to=2f155d6f9baacec48a9f69abffbfbca91ef57b46&stat=instructions:u). I don't think that it justifies the cost.

goldsteinn · 2024-04-11T15:38:48Z

> > Rebased, limited CR analysis to MaxRecursiveDepth - 1. My feeling is this is mostly to capture relationships like X u> Y implies X != 0. Using CR is IMO the easiest way to avoid implementing bespoke logic here.
>
> Using MaxRecursiveDepth - 1 is not particularly useful in this context, because computeConstantRange() is already essentially non-recursive.
>
> This change would be ok if it were free, but it isn't (https://llvm-compile-time-tracker.com/compare.php?from=7d60232b38b66138dae1b31027d73ee5b9df5c58&to=2f155d6f9baacec48a9f69abffbfbca91ef57b46&stat=instructions:u). I don't think that it justifies the cost.

Okay, I'll work to make it free.

goldsteinn · 2024-04-12T02:51:22Z

> > > Rebased, limited CR analysis to MaxRecursiveDepth - 1. My feeling is this is mostly to capture relationships like X u> Y implies X != 0. Using CR is IMO the easiest way to avoid implementing bespoke logic here.
> >
> > Using MaxRecursiveDepth - 1 is not particularly useful in this context, because computeConstantRange() is already essentially non-recursive.
> > This change would be ok if it were free, but it isn't (https://llvm-compile-time-tracker.com/compare.php?from=7d60232b38b66138dae1b31027d73ee5b9df5c58&to=2f155d6f9baacec48a9f69abffbfbca91ef57b46&stat=instructions:u). I don't think that it justifies the cost.
>
> Okay, I'll work to make it free.

@nikic, I limited this to requiring one constant (either L1 or R1), and that seems to address all the compile-time concerns:
https://llvm-compile-time-tracker.com/compare.php?from=7aa371687ace40b85f04e21956e03f1e93052b56&to=98198d78a9a2c0e8e5eadf60ff0f4b57ba248698&stat=instructions:u

goldsteinn · 2024-04-17T19:10:27Z

ping

goldsteinn · 2024-05-05T00:25:14Z

ping2

goldsteinn · 2024-05-10T19:08:52Z

ping

PR Link: llvm/llvm-project#85557

dtcxzyw · 2024-05-12T04:26:47Z

> This patch seems to block SROA: dtcxzyw/llvm-opt-benchmark#419 (comment).

This problem is still outstanding. Any thoughts about preserving information from the exact flag?

goldsteinn · 2024-05-12T06:33:43Z

> > This patch seems to block SROA: dtcxzyw/llvm-opt-benchmark#419 (comment).
>
> This problem is still outstanding. Any thoughts about preserving information from the exact flag?

I think this is a more general problem that we don't have any good solution for. I think the diff is net positive, so I would prefer not to block this patch on the potentially intractable problem of folds throwing away information.

goldsteinn · 2024-05-24T17:10:50Z

ping

goldsteinn · 2024-06-04T16:38:31Z

ping

goldsteinn · 2024-06-17T17:36:25Z

ping @nikic, I think the compile time concerns have been addressed and this has value. Can we get this in?

dtcxzyw · 2024-06-18T05:56:27Z

Compile time measurement:

Top 5 improvements:
faiss/IndexFlat.cpp.ll 654470122 634322266 -3.08%
php/zend_ini_scanner.ll 3135395912 3060906239 -2.38%
ocio/Lut3DOpCPU.cpp.ll 2695049654 2634701496 -2.24%
faiss/IndexIVF.cpp.ll 2468718746 2414286811 -2.20%
faiss/utils.cpp.ll 1382069983 1352269707 -2.16%
Top 5 regressions:
faiss/IVFlib.cpp.ll 1511325490 1570395655 +3.91%
linux/printk_ringbuffer.ll 329303620 338516118 +2.80%
cvc5/inference_id.cpp.ll 338715670 348139511 +2.78%
gromacs/constr.cpp.ll 2172064912 2224281265 +2.40%
postgres/unicode_norm.ll 359020722 367297769 +2.31%
Overall: -0.00761517%

dtcxzyw

LGTM. Thank you!

goldsteinn · 2024-06-27T03:57:37Z

llvm/lib/Analysis/ValueTracking.cpp

+    CmpInst::Predicate RPred, const ConstantRange &RCR) {
+  ConstantRange DomCR = ConstantRange::makeAllowedICmpRegion(LPred, LCR);
+  // If all true values for lhs and true for rhs, lhs implies rhs
+  if (DomCR.icmp(RPred, RCR))


@nikic, did you comment a concern here then delete it, or is my github not updating properly? I see the email notification but nothing here.

I think ConstantRange::makeAllowedICmpRegion is correct here.

Thanks. I believe this is correct as well, because icmp uses makeSatisfyingICmpRegion. @nikic, I'm assuming you deleted your comment.

Yeah, I deleted my comment because I realized I was thinking about it the wrong way around right after I posted it. Sorry for the confusion.
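
The reasoning in this thread (an allowed region on the left, a satisfying region on the right) can be illustrated with a toy model. The sketch below is not the LLVM `ConstantRange` API: it represents a range as an explicit set of 8-bit values, and the names `Set`, `allowedRegion`, `satisfyingRegion`, `impliesTrue`, and `interval` are hypothetical helpers for this example. It only models the "implies true" direction of the check.

```cpp
#include <bitset>

using Set = std::bitset<256>; // toy stand-in for a range of 8-bit values
enum class Pred { EQ, NE, UGT, UGE, ULT, ULE };

bool holds(Pred p, unsigned x, unsigned y) {
  switch (p) {
  case Pred::EQ:  return x == y;
  case Pred::NE:  return x != y;
  case Pred::UGT: return x > y;
  case Pred::UGE: return x >= y;
  case Pred::ULT: return x < y;
  case Pred::ULE: return x <= y;
  }
  return false;
}

// Inclusive interval [lo, hi]; requires lo <= hi <= 255.
Set interval(unsigned lo, unsigned hi) {
  Set s;
  for (unsigned v = lo; v <= hi; ++v)
    s.set(v);
  return s;
}

// Values x for which `x p y` holds for at least one y in rhs.
Set allowedRegion(Pred p, const Set &rhs) {
  Set out;
  for (unsigned x = 0; x < 256; ++x)
    for (unsigned y = 0; y < 256; ++y)
      if (rhs[y] && holds(p, x, y)) {
        out.set(x);
        break;
      }
  return out;
}

// Values x for which `x p y` holds for every y in rhs.
Set satisfyingRegion(Pred p, const Set &rhs) {
  Set out;
  for (unsigned x = 0; x < 256; ++x) {
    bool all = true;
    for (unsigned y = 0; y < 256 && all; ++y)
      if (rhs[y] && !holds(p, x, y))
        all = false;
    if (all)
      out.set(x);
  }
  return out;
}

// True if `x lp l` (for some l in lcr) guarantees `x rp r` for every r in rcr:
// every x the left condition allows must lie in the right satisfying region.
bool impliesTrue(Pred lp, const Set &lcr, Pred rp, const Set &rcr) {
  Set dom = allowedRegion(lp, lcr);
  return (dom & ~satisfyingRegion(rp, rcr)).none();
}
```

The left side has to use the allowed (over-approximating) region because X may be any value for which the left compare could be true, while the right side has to use the satisfying (under-approximating) region because the right compare must hold for every possible right-hand value. For instance, `impliesTrue(Pred::UGT, interval(0, 255), Pred::NE, interval(0, 0))` models "X u> Y implies X != 0" with Y unknown.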

goldsteinn · 2024-06-28T07:32:37Z

Assuming no objections, I'm going to push this in the next few days (I'll re-verify the compile-time impact before I do).

dtcxzyw · 2024-06-28T07:40:04Z

> I'll re-verify the compile-time impact before I do.

See dtcxzyw/llvm-opt-benchmark#680 (comment). Hopefully it helps you :)

nikic

LGTM

nikic · 2024-07-01T15:01:47Z

Yeah, I deleted my comment because I realized I was thinking about it the wrong way around right after I posted it. Sorry for the confusion.

…constants. Previously we only handled the `L0 == R0` case if both `L1` and `R1` were constant. We can get more out of the analysis using general constant ranges instead. For example, `X u> Y` implies `X != 0`. In general, any strict comparison on `X` implies that `X` is not equal to the boundary value for the sign, and constant ranges with/without sign bits can be useful in deducing implications. Closes llvm#85557

goldsteinn requested a review from nikic as a code owner March 16, 2024 23:41

llvmbot added llvm:analysis llvm:transforms labels Mar 16, 2024

goldsteinn requested a review from dtcxzyw March 16, 2024 23:41

dtcxzyw added a commit to dtcxzyw/llvm-opt-benchmark that referenced this pull request Mar 17, 2024

pre-commit: test PR85557

5eeedcf

PR Link: llvm/llvm-project#85557

dtcxzyw mentioned this pull request Mar 17, 2024

pre-commit: test PR85557 dtcxzyw/llvm-opt-benchmark#419

Closed

goldsteinn force-pushed the perf/goldsteinn/implied-with-cr branch from 69ab890 to cc7df74 Compare March 17, 2024 22:54

dtcxzyw added a commit to dtcxzyw/llvm-opt-benchmark that referenced this pull request Mar 18, 2024

pre-commit: test PR85557

3643e63

PR Link: llvm/llvm-project#85557

goldsteinn force-pushed the perf/goldsteinn/implied-with-cr branch from cc7df74 to 2f155d6 Compare April 10, 2024 17:28

goldsteinn force-pushed the perf/goldsteinn/implied-with-cr branch from 2f155d6 to 98198d7 Compare April 12, 2024 01:28

dtcxzyw added a commit to dtcxzyw/llvm-opt-benchmark that referenced this pull request May 12, 2024

pre-commit: test PR85557

331342a

PR Link: llvm/llvm-project#85557

dtcxzyw mentioned this pull request May 12, 2024

pre-commit: test PR85557 dtcxzyw/llvm-opt-benchmark#601

Closed

dtcxzyw reviewed Jun 18, 2024

goldsteinn force-pushed the perf/goldsteinn/implied-with-cr branch from 98198d7 to c74b101 Compare June 26, 2024 05:54

dtcxzyw approved these changes Jun 26, 2024

dtcxzyw reviewed Jun 26, 2024

goldsteinn force-pushed the perf/goldsteinn/implied-with-cr branch from c74b101 to e026598 Compare June 26, 2024 09:27

goldsteinn commented Jun 27, 2024

nikic approved these changes Jul 1, 2024

goldsteinn force-pushed the perf/goldsteinn/implied-with-cr branch from e026598 to 435e044 Compare July 3, 2024 08:02

goldsteinn closed this in 7c96469 Jul 3, 2024
