Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[MachineLICM] Relax overlay conservative PHI check #67186

Merged
merged 1 commit into from
Oct 9, 2023

Conversation

hgreving2304
Copy link
Member

Skip LICM if PHI belongs to the current loop, e.g. is in the
loop's header. This prevents LICM from bailing for CFGs like

L1:
R = LoopInvariant // can be LICM'd
BR L1
L2:
PHI(R, ..)
BR L2

Skip LICM if PHI belongs to the current loop, e.g. is in the
loop's header. This prevents LICM from bailing for CFGs like

L1:
  R = LoopInvariant // can be LICM'd
  BR L1
L2:
  PHI(R, ..)
  BR L2
@llvmbot
Copy link
Member

llvmbot commented Sep 22, 2023

@llvm/pr-subscribers-debuginfo

@llvm/pr-subscribers-backend-x86

Changes

Skip LICM if PHI belongs to the current loop, e.g. is in the
loop's header. This prevents LICM from bailing for CFGs like

L1:
R = LoopInvariant // can be LICM'd
BR L1
L2:
PHI(R, ..)
BR L2


Patch is 37.72 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/67186.diff

11 Files Affected:

  • (modified) llvm/lib/CodeGen/MachineLICM.cpp (+1-1)
  • (modified) llvm/test/CodeGen/X86/2008-04-28-CoalescerBug.ll (+8-11)
  • (modified) llvm/test/CodeGen/X86/2012-01-10-UndefExceptionEdge.ll (+14-15)
  • (modified) llvm/test/CodeGen/X86/cmpxchg-clobber-flags.ll (+2-29)
  • (modified) llvm/test/CodeGen/X86/conditional-tailcall.ll (+34-30)
  • (modified) llvm/test/CodeGen/X86/licm-nested.ll (+1-1)
  • (modified) llvm/test/CodeGen/X86/loop-strength-reduce7.ll (+5-5)
  • (modified) llvm/test/CodeGen/X86/pr38795.ll (+95-93)
  • (modified) llvm/test/CodeGen/X86/ragreedy-hoist-spill.ll (+41-42)
  • (modified) llvm/test/CodeGen/X86/setcc-non-simple-type.ll (+2-2)
  • (modified) llvm/test/DebugInfo/X86/dbg-value-transfer-order.ll (+1-1)
diff --git a/llvm/lib/CodeGen/MachineLICM.cpp b/llvm/lib/CodeGen/MachineLICM.cpp
index fa7aa4a7a44ad08..e8bd51f54565a98 100644
--- a/llvm/lib/CodeGen/MachineLICM.cpp
+++ b/llvm/lib/CodeGen/MachineLICM.cpp
@@ -1028,7 +1028,7 @@ bool MachineLICMBase::HasLoopPHIUse(const MachineInstr *MI) const {
         if (UseMI.isPHI()) {
           // A PHI inside the loop causes a copy because the live range of Reg is
           // extended across the PHI.
-          if (CurLoop->contains(&UseMI))
+          if (CurLoop->getHeader() == UseMI.getParent())
             return true;
           // A PHI in an exit block can cause a copy to be inserted if the PHI
           // has multiple predecessors in the loop with different values.
diff --git a/llvm/test/CodeGen/X86/2008-04-28-CoalescerBug.ll b/llvm/test/CodeGen/X86/2008-04-28-CoalescerBug.ll
index 8e6d2c11b7b3dcf..0e9a2457d090ab5 100644
--- a/llvm/test/CodeGen/X86/2008-04-28-CoalescerBug.ll
+++ b/llvm/test/CodeGen/X86/2008-04-28-CoalescerBug.ll
@@ -15,7 +15,7 @@ define void @t(ptr %depth, ptr %bop, i32 %mode) nounwind  {
 ; CHECK-NEXT:    je LBB0_3
 ; CHECK-NEXT:  ## %bb.1: ## %entry
 ; CHECK-NEXT:    cmpl $1, %edx
-; CHECK-NEXT:    jne LBB0_10
+; CHECK-NEXT:    jne LBB0_9
 ; CHECK-NEXT:    .p2align 4, 0x90
 ; CHECK-NEXT:  LBB0_2: ## %bb2898.us
 ; CHECK-NEXT:    ## =>This Inner Loop Header: Depth=1
@@ -26,15 +26,12 @@ define void @t(ptr %depth, ptr %bop, i32 %mode) nounwind  {
 ; CHECK-NEXT:  LBB0_4: ## %bb13088
 ; CHECK-NEXT:    ## =>This Inner Loop Header: Depth=1
 ; CHECK-NEXT:    testb %al, %al
-; CHECK-NEXT:    jne LBB0_5
-; CHECK-NEXT:  ## %bb.6: ## %bb13101
+; CHECK-NEXT:    movl $65535, %ecx ## imm = 0xFFFF
+; CHECK-NEXT:    jne LBB0_6
+; CHECK-NEXT:  ## %bb.5: ## %bb13101
 ; CHECK-NEXT:    ## in Loop: Header=BB0_4 Depth=1
 ; CHECK-NEXT:    xorl %ecx, %ecx
-; CHECK-NEXT:    jmp LBB0_7
-; CHECK-NEXT:    .p2align 4, 0x90
-; CHECK-NEXT:  LBB0_5: ## in Loop: Header=BB0_4 Depth=1
-; CHECK-NEXT:    movl $65535, %ecx ## imm = 0xFFFF
-; CHECK-NEXT:  LBB0_7: ## %bb13107
+; CHECK-NEXT:  LBB0_6: ## %bb13107
 ; CHECK-NEXT:    ## in Loop: Header=BB0_4 Depth=1
 ; CHECK-NEXT:    movl %ecx, %edx
 ; CHECK-NEXT:    shll $16, %edx
@@ -44,11 +41,11 @@ define void @t(ptr %depth, ptr %bop, i32 %mode) nounwind  {
 ; CHECK-NEXT:    subl %edx, %ecx
 ; CHECK-NEXT:    testw %cx, %cx
 ; CHECK-NEXT:    je LBB0_4
-; CHECK-NEXT:  ## %bb.8: ## %bb13236
+; CHECK-NEXT:  ## %bb.7: ## %bb13236
 ; CHECK-NEXT:    ## in Loop: Header=BB0_4 Depth=1
 ; CHECK-NEXT:    testb %al, %al
 ; CHECK-NEXT:    jne LBB0_4
-; CHECK-NEXT:  ## %bb.9: ## %bb13572
+; CHECK-NEXT:  ## %bb.8: ## %bb13572
 ; CHECK-NEXT:    ## in Loop: Header=BB0_4 Depth=1
 ; CHECK-NEXT:    movzwl %cx, %ecx
 ; CHECK-NEXT:    movl %ecx, %edx
@@ -58,7 +55,7 @@ define void @t(ptr %depth, ptr %bop, i32 %mode) nounwind  {
 ; CHECK-NEXT:    shrl $16, %edx
 ; CHECK-NEXT:    movw %dx, 0
 ; CHECK-NEXT:    jmp LBB0_4
-; CHECK-NEXT:  LBB0_10: ## %return
+; CHECK-NEXT:  LBB0_9: ## %return
 ; CHECK-NEXT:    retq
 entry:
 	switch i32 %mode, label %return [
diff --git a/llvm/test/CodeGen/X86/2012-01-10-UndefExceptionEdge.ll b/llvm/test/CodeGen/X86/2012-01-10-UndefExceptionEdge.ll
index 1962ddebc2115ef..9aa2143039c2658 100644
--- a/llvm/test/CodeGen/X86/2012-01-10-UndefExceptionEdge.ll
+++ b/llvm/test/CodeGen/X86/2012-01-10-UndefExceptionEdge.ll
@@ -33,11 +33,10 @@ define void @f(ptr nocapture %arg, ptr nocapture %arg1, ptr nocapture %arg2, ptr
 ; CHECK-NEXT:    .cfi_offset %esi, -20
 ; CHECK-NEXT:    .cfi_offset %edi, -16
 ; CHECK-NEXT:    .cfi_offset %ebx, -12
-; CHECK-NEXT:    xorl %eax, %eax
-; CHECK-NEXT:    xorl %edi, %edi
-; CHECK-NEXT:    testb %al, %al
+; CHECK-NEXT:    xorl %ebx, %ebx
+; CHECK-NEXT:    testb %bl, %bl
 ; CHECK-NEXT:  Ltmp0:
-; CHECK-NEXT:    ## implicit-def: $ebx
+; CHECK-NEXT:    ## implicit-def: $edi
 ; CHECK-NEXT:    calll __Znam
 ; CHECK-NEXT:  Ltmp1:
 ; CHECK-NEXT:  ## %bb.1: ## %bb11
@@ -86,16 +85,16 @@ define void @f(ptr nocapture %arg, ptr nocapture %arg1, ptr nocapture %arg2, ptr
 ; CHECK-NEXT:    jne LBB0_17
 ; CHECK-NEXT:  ## %bb.15: ## %bb49.preheader
 ; CHECK-NEXT:    ## in Loop: Header=BB0_13 Depth=2
-; CHECK-NEXT:    xorl %ecx, %ecx
-; CHECK-NEXT:    movl %esi, %edx
-; CHECK-NEXT:    movl %edi, %ebx
+; CHECK-NEXT:    movl %esi, %ecx
+; CHECK-NEXT:    movl %ebx, %edx
+; CHECK-NEXT:    xorl %edi, %edi
 ; CHECK-NEXT:  LBB0_16: ## %bb49
 ; CHECK-NEXT:    ## Parent Loop BB0_8 Depth=1
 ; CHECK-NEXT:    ## Parent Loop BB0_13 Depth=2
 ; CHECK-NEXT:    ## => This Inner Loop Header: Depth=3
-; CHECK-NEXT:    incl %ecx
-; CHECK-NEXT:    addl $4, %edx
-; CHECK-NEXT:    decl %ebx
+; CHECK-NEXT:    incl %edi
+; CHECK-NEXT:    addl $4, %ecx
+; CHECK-NEXT:    decl %edx
 ; CHECK-NEXT:    jne LBB0_16
 ; CHECK-NEXT:  LBB0_17: ## %bb57
 ; CHECK-NEXT:    ## in Loop: Header=BB0_13 Depth=2
@@ -113,7 +112,7 @@ define void @f(ptr nocapture %arg, ptr nocapture %arg1, ptr nocapture %arg2, ptr
 ; CHECK-NEXT:  ## %bb.20: ## %bb61.preheader
 ; CHECK-NEXT:    ## in Loop: Header=BB0_8 Depth=1
 ; CHECK-NEXT:    movl %esi, %eax
-; CHECK-NEXT:    movl %edi, %ecx
+; CHECK-NEXT:    movl %ebx, %ecx
 ; CHECK-NEXT:  LBB0_21: ## %bb61
 ; CHECK-NEXT:    ## Parent Loop BB0_8 Depth=1
 ; CHECK-NEXT:    ## => This Inner Loop Header: Depth=2
@@ -127,13 +126,13 @@ define void @f(ptr nocapture %arg, ptr nocapture %arg1, ptr nocapture %arg2, ptr
 ; CHECK-NEXT:    jmp LBB0_8
 ; CHECK-NEXT:  LBB0_18: ## %bb43
 ; CHECK-NEXT:  Ltmp5:
-; CHECK-NEXT:    movl %esi, %ebx
+; CHECK-NEXT:    movl %esi, %edi
 ; CHECK-NEXT:    calll _OnOverFlow
 ; CHECK-NEXT:  Ltmp6:
 ; CHECK-NEXT:    jmp LBB0_3
 ; CHECK-NEXT:  LBB0_2: ## %bb29
 ; CHECK-NEXT:  Ltmp7:
-; CHECK-NEXT:    movl %esi, %ebx
+; CHECK-NEXT:    movl %esi, %edi
 ; CHECK-NEXT:    calll _OnOverFlow
 ; CHECK-NEXT:  Ltmp8:
 ; CHECK-NEXT:  LBB0_3: ## %bb30
@@ -141,9 +140,9 @@ define void @f(ptr nocapture %arg, ptr nocapture %arg1, ptr nocapture %arg2, ptr
 ; CHECK-NEXT:  LBB0_4: ## %bb20.loopexit
 ; CHECK-NEXT:  Ltmp4:
 ; CHECK-NEXT:  LBB0_9:
-; CHECK-NEXT:    movl %esi, %ebx
+; CHECK-NEXT:    movl %esi, %edi
 ; CHECK-NEXT:  LBB0_6: ## %bb23
-; CHECK-NEXT:    testl %ebx, %ebx
+; CHECK-NEXT:    testl %edi, %edi
 ; CHECK-NEXT:    addl $28, %esp
 ; CHECK-NEXT:    popl %esi
 ; CHECK-NEXT:    popl %edi
diff --git a/llvm/test/CodeGen/X86/cmpxchg-clobber-flags.ll b/llvm/test/CodeGen/X86/cmpxchg-clobber-flags.ll
index 7738ce49a7636c5..d1855fbdbc04101 100644
--- a/llvm/test/CodeGen/X86/cmpxchg-clobber-flags.ll
+++ b/llvm/test/CodeGen/X86/cmpxchg-clobber-flags.ll
@@ -137,8 +137,8 @@ define i32 @test_control_flow(ptr %p, i32 %i, i32 %j) nounwind {
 ; 32-ALL-NEXT:    # Parent Loop BB1_2 Depth=1
 ; 32-ALL-NEXT:    # => This Inner Loop Header: Depth=2
 ; 32-ALL-NEXT:    movl %edx, %eax
-; 32-ALL-NEXT:    xorl %edx, %edx
-; 32-ALL-NEXT:    testl %eax, %eax
+; 32-ALL-NEXT:    testl %edx, %edx
+; 32-ALL-NEXT:    movl $0, %edx
 ; 32-ALL-NEXT:    je .LBB1_3
 ; 32-ALL-NEXT:  # %bb.4: # %while.body.i
 ; 32-ALL-NEXT:    # in Loop: Header=BB1_2 Depth=1
@@ -148,33 +148,6 @@ define i32 @test_control_flow(ptr %p, i32 %i, i32 %j) nounwind {
 ; 32-ALL-NEXT:    xorl %eax, %eax
 ; 32-ALL-NEXT:  .LBB1_6: # %cond.end
 ; 32-ALL-NEXT:    retl
-;
-; 64-ALL-LABEL: test_control_flow:
-; 64-ALL:       # %bb.0: # %entry
-; 64-ALL-NEXT:    movl %esi, %eax
-; 64-ALL-NEXT:    cmpl %edx, %esi
-; 64-ALL-NEXT:    jle .LBB1_5
-; 64-ALL-NEXT:    .p2align 4, 0x90
-; 64-ALL-NEXT:  .LBB1_1: # %while.condthread-pre-split.i
-; 64-ALL-NEXT:    # =>This Loop Header: Depth=1
-; 64-ALL-NEXT:    # Child Loop BB1_2 Depth 2
-; 64-ALL-NEXT:    movl (%rdi), %ecx
-; 64-ALL-NEXT:    .p2align 4, 0x90
-; 64-ALL-NEXT:  .LBB1_2: # %while.cond.i
-; 64-ALL-NEXT:    # Parent Loop BB1_1 Depth=1
-; 64-ALL-NEXT:    # => This Inner Loop Header: Depth=2
-; 64-ALL-NEXT:    movl %ecx, %eax
-; 64-ALL-NEXT:    xorl %ecx, %ecx
-; 64-ALL-NEXT:    testl %eax, %eax
-; 64-ALL-NEXT:    je .LBB1_2
-; 64-ALL-NEXT:  # %bb.3: # %while.body.i
-; 64-ALL-NEXT:    # in Loop: Header=BB1_1 Depth=1
-; 64-ALL-NEXT:    lock cmpxchgl %eax, (%rdi)
-; 64-ALL-NEXT:    jne .LBB1_1
-; 64-ALL-NEXT:  # %bb.4:
-; 64-ALL-NEXT:    xorl %eax, %eax
-; 64-ALL-NEXT:  .LBB1_5: # %cond.end
-; 64-ALL-NEXT:    retq
 entry:
   %cmp = icmp sgt i32 %i, %j
   br i1 %cmp, label %loop_start, label %cond.end
diff --git a/llvm/test/CodeGen/X86/conditional-tailcall.ll b/llvm/test/CodeGen/X86/conditional-tailcall.ll
index d1ef1ab390396cd..1bf46f0d79e1ce8 100644
--- a/llvm/test/CodeGen/X86/conditional-tailcall.ll
+++ b/llvm/test/CodeGen/X86/conditional-tailcall.ll
@@ -456,70 +456,74 @@ define zeroext i1 @pr31257(ptr nocapture readonly dereferenceable(8) %s) minsize
 ; WIN64-NEXT:  .LBB3_1: # %for.cond
 ; WIN64-NEXT:    # =>This Inner Loop Header: Depth=1
 ; WIN64-NEXT:    testq %rax, %rax # encoding: [0x48,0x85,0xc0]
-; WIN64-NEXT:    je .LBB3_11 # encoding: [0x74,A]
-; WIN64-NEXT:    # fixup A - offset: 1, value: .LBB3_11-1, kind: FK_PCRel_1
+; WIN64-NEXT:    je .LBB3_12 # encoding: [0x74,A]
+; WIN64-NEXT:    # fixup A - offset: 1, value: .LBB3_12-1, kind: FK_PCRel_1
 ; WIN64-NEXT:  # %bb.2: # %for.body
 ; WIN64-NEXT:    # in Loop: Header=BB3_1 Depth=1
 ; WIN64-NEXT:    cmpl $2, %r8d # encoding: [0x41,0x83,0xf8,0x02]
-; WIN64-NEXT:    je .LBB3_9 # encoding: [0x74,A]
-; WIN64-NEXT:    # fixup A - offset: 1, value: .LBB3_9-1, kind: FK_PCRel_1
+; WIN64-NEXT:    je .LBB3_10 # encoding: [0x74,A]
+; WIN64-NEXT:    # fixup A - offset: 1, value: .LBB3_10-1, kind: FK_PCRel_1
 ; WIN64-NEXT:  # %bb.3: # %for.body
 ; WIN64-NEXT:    # in Loop: Header=BB3_1 Depth=1
 ; WIN64-NEXT:    cmpl $1, %r8d # encoding: [0x41,0x83,0xf8,0x01]
-; WIN64-NEXT:    je .LBB3_7 # encoding: [0x74,A]
-; WIN64-NEXT:    # fixup A - offset: 1, value: .LBB3_7-1, kind: FK_PCRel_1
+; WIN64-NEXT:    je .LBB3_8 # encoding: [0x74,A]
+; WIN64-NEXT:    # fixup A - offset: 1, value: .LBB3_8-1, kind: FK_PCRel_1
 ; WIN64-NEXT:  # %bb.4: # %for.body
 ; WIN64-NEXT:    # in Loop: Header=BB3_1 Depth=1
 ; WIN64-NEXT:    testl %r8d, %r8d # encoding: [0x45,0x85,0xc0]
-; WIN64-NEXT:    jne .LBB3_10 # encoding: [0x75,A]
-; WIN64-NEXT:    # fixup A - offset: 1, value: .LBB3_10-1, kind: FK_PCRel_1
+; WIN64-NEXT:    jne .LBB3_11 # encoding: [0x75,A]
+; WIN64-NEXT:    # fixup A - offset: 1, value: .LBB3_11-1, kind: FK_PCRel_1
 ; WIN64-NEXT:  # %bb.5: # %sw.bb
 ; WIN64-NEXT:    # in Loop: Header=BB3_1 Depth=1
 ; WIN64-NEXT:    movzbl (%rcx), %r9d # encoding: [0x44,0x0f,0xb6,0x09]
 ; WIN64-NEXT:    cmpl $43, %r9d # encoding: [0x41,0x83,0xf9,0x2b]
 ; WIN64-NEXT:    movl $1, %r8d # encoding: [0x41,0xb8,0x01,0x00,0x00,0x00]
-; WIN64-NEXT:    je .LBB3_10 # encoding: [0x74,A]
-; WIN64-NEXT:    # fixup A - offset: 1, value: .LBB3_10-1, kind: FK_PCRel_1
+; WIN64-NEXT:    je .LBB3_11 # encoding: [0x74,A]
+; WIN64-NEXT:    # fixup A - offset: 1, value: .LBB3_11-1, kind: FK_PCRel_1
 ; WIN64-NEXT:  # %bb.6: # %sw.bb
 ; WIN64-NEXT:    # in Loop: Header=BB3_1 Depth=1
 ; WIN64-NEXT:    cmpl $45, %r9d # encoding: [0x41,0x83,0xf9,0x2d]
-; WIN64-NEXT:    je .LBB3_10 # encoding: [0x74,A]
-; WIN64-NEXT:    # fixup A - offset: 1, value: .LBB3_10-1, kind: FK_PCRel_1
-; WIN64-NEXT:    jmp .LBB3_8 # encoding: [0xeb,A]
-; WIN64-NEXT:    # fixup A - offset: 1, value: .LBB3_8-1, kind: FK_PCRel_1
-; WIN64-NEXT:  .LBB3_7: # %sw.bb14
-; WIN64-NEXT:    # in Loop: Header=BB3_1 Depth=1
-; WIN64-NEXT:    movzbl (%rcx), %r9d # encoding: [0x44,0x0f,0xb6,0x09]
-; WIN64-NEXT:  .LBB3_8: # %if.else
+; WIN64-NEXT:    je .LBB3_11 # encoding: [0x74,A]
+; WIN64-NEXT:    # fixup A - offset: 1, value: .LBB3_11-1, kind: FK_PCRel_1
+; WIN64-NEXT:  # %bb.7: # %if.else
 ; WIN64-NEXT:    # in Loop: Header=BB3_1 Depth=1
 ; WIN64-NEXT:    addl $-48, %r9d # encoding: [0x41,0x83,0xc1,0xd0]
-; WIN64-NEXT:    movl $2, %r8d # encoding: [0x41,0xb8,0x02,0x00,0x00,0x00]
 ; WIN64-NEXT:    cmpl $10, %r9d # encoding: [0x41,0x83,0xf9,0x0a]
-; WIN64-NEXT:    jb .LBB3_10 # encoding: [0x72,A]
-; WIN64-NEXT:    # fixup A - offset: 1, value: .LBB3_10-1, kind: FK_PCRel_1
-; WIN64-NEXT:    jmp .LBB3_12 # encoding: [0xeb,A]
-; WIN64-NEXT:    # fixup A - offset: 1, value: .LBB3_12-1, kind: FK_PCRel_1
-; WIN64-NEXT:  .LBB3_9: # %sw.bb22
+; WIN64-NEXT:    jmp .LBB3_9 # encoding: [0xeb,A]
+; WIN64-NEXT:    # fixup A - offset: 1, value: .LBB3_9-1, kind: FK_PCRel_1
+; WIN64-NEXT:  .LBB3_8: # %sw.bb14
+; WIN64-NEXT:    # in Loop: Header=BB3_1 Depth=1
+; WIN64-NEXT:    movzbl (%rcx), %r8d # encoding: [0x44,0x0f,0xb6,0x01]
+; WIN64-NEXT:    addl $-48, %r8d # encoding: [0x41,0x83,0xc0,0xd0]
+; WIN64-NEXT:    cmpl $10, %r8d # encoding: [0x41,0x83,0xf8,0x0a]
+; WIN64-NEXT:  .LBB3_9: # %if.else
 ; WIN64-NEXT:    # in Loop: Header=BB3_1 Depth=1
-; WIN64-NEXT:    movzbl (%rcx), %r9d # encoding: [0x44,0x0f,0xb6,0x09]
-; WIN64-NEXT:    addl $-48, %r9d # encoding: [0x41,0x83,0xc1,0xd0]
 ; WIN64-NEXT:    movl $2, %r8d # encoding: [0x41,0xb8,0x02,0x00,0x00,0x00]
-; WIN64-NEXT:    cmpl $10, %r9d # encoding: [0x41,0x83,0xf9,0x0a]
+; WIN64-NEXT:    jb .LBB3_11 # encoding: [0x72,A]
+; WIN64-NEXT:    # fixup A - offset: 1, value: .LBB3_11-1, kind: FK_PCRel_1
+; WIN64-NEXT:    jmp .LBB3_13 # encoding: [0xeb,A]
+; WIN64-NEXT:    # fixup A - offset: 1, value: .LBB3_13-1, kind: FK_PCRel_1
+; WIN64-NEXT:  .LBB3_10: # %sw.bb22
+; WIN64-NEXT:    # in Loop: Header=BB3_1 Depth=1
+; WIN64-NEXT:    movzbl (%rcx), %r8d # encoding: [0x44,0x0f,0xb6,0x01]
+; WIN64-NEXT:    addl $-48, %r8d # encoding: [0x41,0x83,0xc0,0xd0]
+; WIN64-NEXT:    cmpl $10, %r8d # encoding: [0x41,0x83,0xf8,0x0a]
+; WIN64-NEXT:    movl $2, %r8d # encoding: [0x41,0xb8,0x02,0x00,0x00,0x00]
 ; WIN64-NEXT:    jae _Z20isValidIntegerSuffixN9__gnu_cxx17__normal_iteratorIPKcSsEES3_ # TAILCALL
 ; WIN64-NEXT:    # encoding: [0x73,A]
 ; WIN64-NEXT:    # fixup A - offset: 1, value: _Z20isValidIntegerSuffixN9__gnu_cxx17__normal_iteratorIPKcSsEES3_-1, kind: FK_PCRel_1
-; WIN64-NEXT:  .LBB3_10: # %for.inc
+; WIN64-NEXT:  .LBB3_11: # %for.inc
 ; WIN64-NEXT:    # in Loop: Header=BB3_1 Depth=1
 ; WIN64-NEXT:    incq %rcx # encoding: [0x48,0xff,0xc1]
 ; WIN64-NEXT:    decq %rax # encoding: [0x48,0xff,0xc8]
 ; WIN64-NEXT:    jmp .LBB3_1 # encoding: [0xeb,A]
 ; WIN64-NEXT:    # fixup A - offset: 1, value: .LBB3_1-1, kind: FK_PCRel_1
-; WIN64-NEXT:  .LBB3_11:
+; WIN64-NEXT:  .LBB3_12:
 ; WIN64-NEXT:    cmpl $2, %r8d # encoding: [0x41,0x83,0xf8,0x02]
 ; WIN64-NEXT:    sete %al # encoding: [0x0f,0x94,0xc0]
 ; WIN64-NEXT:    # kill: def $al killed $al killed $eax
 ; WIN64-NEXT:    retq # encoding: [0xc3]
-; WIN64-NEXT:  .LBB3_12:
+; WIN64-NEXT:  .LBB3_13:
 ; WIN64-NEXT:    xorl %eax, %eax # encoding: [0x31,0xc0]
 ; WIN64-NEXT:    # kill: def $al killed $al killed $eax
 ; WIN64-NEXT:    retq # encoding: [0xc3]
diff --git a/llvm/test/CodeGen/X86/licm-nested.ll b/llvm/test/CodeGen/X86/licm-nested.ll
index fe50249e8749558..8a81c617a1dc239 100644
--- a/llvm/test/CodeGen/X86/licm-nested.ll
+++ b/llvm/test/CodeGen/X86/licm-nested.ll
@@ -1,5 +1,5 @@
 ; REQUIRES: asserts
-; RUN: llc -mtriple=x86_64-apple-darwin < %s -o /dev/null -stats -info-output-file - | grep "hoisted out of loops" | grep 3
+; RUN: llc -mtriple=x86_64-apple-darwin < %s -o /dev/null -stats -info-output-file - | grep "hoisted out of loops" | grep 7
 
 ; MachineLICM should be able to hoist the symbolic addresses out of
 ; the inner loops.
diff --git a/llvm/test/CodeGen/X86/loop-strength-reduce7.ll b/llvm/test/CodeGen/X86/loop-strength-reduce7.ll
index 3687194972f27f4..61efafa80567641 100644
--- a/llvm/test/CodeGen/X86/loop-strength-reduce7.ll
+++ b/llvm/test/CodeGen/X86/loop-strength-reduce7.ll
@@ -17,15 +17,15 @@ define fastcc void @outer_loop(ptr nocapture %gfp, ptr nocapture %xr, i32 %targ_
 ; CHECK-NEXT:  LBB0_2: ## %bb28.i37
 ; CHECK-NEXT:    ## =>This Loop Header: Depth=1
 ; CHECK-NEXT:    ## Child Loop BB0_3 Depth 2
-; CHECK-NEXT:    xorl %edx, %edx
-; CHECK-NEXT:    movl %eax, %esi
+; CHECK-NEXT:    movl %eax, %edx
+; CHECK-NEXT:    xorl %esi, %esi
 ; CHECK-NEXT:    .p2align 4, 0x90
 ; CHECK-NEXT:  LBB0_3: ## %bb29.i38
 ; CHECK-NEXT:    ## Parent Loop BB0_2 Depth=1
 ; CHECK-NEXT:    ## => This Inner Loop Header: Depth=2
-; CHECK-NEXT:    incl %edx
-; CHECK-NEXT:    addl $12, %esi
-; CHECK-NEXT:    cmpl $11, %edx
+; CHECK-NEXT:    incl %esi
+; CHECK-NEXT:    addl $12, %edx
+; CHECK-NEXT:    cmpl $11, %esi
 ; CHECK-NEXT:    jbe LBB0_3
 ; CHECK-NEXT:  ## %bb.1: ## %bb28.i37.loopexit
 ; CHECK-NEXT:    ## in Loop: Header=BB0_2 Depth=1
diff --git a/llvm/test/CodeGen/X86/pr38795.ll b/llvm/test/CodeGen/X86/pr38795.ll
index 8e0532e60652800..278d223749842e2 100644
--- a/llvm/test/CodeGen/X86/pr38795.ll
+++ b/llvm/test/CodeGen/X86/pr38795.ll
@@ -30,12 +30,13 @@ define dso_local void @fn() {
 ; CHECK-NEXT:    # implicit-def: $ebp
 ; CHECK-NEXT:    jmp .LBB0_1
 ; CHECK-NEXT:    .p2align 4, 0x90
-; CHECK-NEXT:  .LBB0_14: # in Loop: Header=BB0_1 Depth=1
+; CHECK-NEXT:  .LBB0_13: # in Loop: Header=BB0_1 Depth=1
+; CHECK-NEXT:    xorl %esi, %esi
 ; CHECK-NEXT:    movb %dl, %ch
 ; CHECK-NEXT:    movl %ecx, %edx
 ; CHECK-NEXT:  .LBB0_1: # %for.cond
 ; CHECK-NEXT:    # =>This Loop Header: Depth=1
-; CHECK-NEXT:    # Child Loop BB0_22 Depth 2
+; CHECK-NEXT:    # Child Loop BB0_20 Depth 2
 ; CHECK-NEXT:    cmpb $8, %dl
 ; CHECK-NEXT:    movb %dl, {{[-0-9]+}}(%e{{[sb]}}p) # 1-byte Spill
 ; CHECK-NEXT:    ja .LBB0_3
@@ -53,7 +54,7 @@ define dso_local void @fn() {
 ; CHECK-NEXT:    # kill: def $cl killed $cl killed $ecx
 ; CHECK-NEXT:    movl $0, h
 ; CHECK-NEXT:    cmpb $8, %dl
-; CHECK-NEXT:    jg .LBB0_8
+; CHECK-NEXT:    jg .LBB0_9
 ; CHECK-NEXT:  # %bb.5: # %if.then13
 ; CHECK-NEXT:    # in Loop: Header=BB0_1 Depth=1
 ; CHECK-NEXT:    movl %eax, %esi
@@ -66,7 +67,7 @@ define dso_local void @fn() {
 ; CHECK-NEXT:    movb {{[-0-9]+}}(%e{{[sb]}}p), %ch # 1-byte Reload
 ; CHECK-NEXT:    movl %ecx, %edx
 ; CHECK-NEXT:    je .LBB0_6
-; CHECK-NEXT:    jmp .LBB0_18
+; CHECK-NEXT:    jmp .LBB0_16
 ; CHECK-NEXT:    .p2align 4, 0x90
 ; CHECK-NEXT:  .LBB0_3: # %if.then
 ; CHECK-NEXT:    # in Loop: Header=BB0_1 Depth=1
@@ -76,88 +77,84 @@ define dso_local void @fn() {
 ; CHECK-NEXT:    movb {{[-0-9]+}}(%e{{[sb]}}p), %ch # 1-byte Reload
 ; CHECK-NEXT:    movzbl {{[-0-9]+}}(%e{{[sb]}}p), %edx # 1-byte Folded Reload
 ; CHECK-NEXT:    # implicit-def: $eax
+; CHECK-NEXT:    jmp .LBB0_6
+; CHECK-NEXT:    .p2align 4, 0x90
+; CHECK-NEXT:  .LBB0_9: # %if.end21
+; CHECK-NEXT:    # in Loop: Header=BB0_1 Depth=1
+; CHECK-NEXT:    # implicit-def: $ebp
+; CHECK-NEXT:    jmp .LBB0_10
+; CHECK-NEXT:    .p2align 4, 0x90
 ; CHECK-NEXT:  .LBB0_6: # %for.cond35
 ; CHECK-NEXT:    # in Loop: Header=BB0_1 Depth=1
+; CHECK-NEXT:    movb %dl, %cl
 ; CHECK-NEXT:    testl %edi, %edi
-; CHECK-NEXT:    je .LBB0_7
-; CHECK-NEXT:  .LBB0_11: # %af
+; CHECK-NEXT:    movl %edi, %esi
+; CHECK-NEXT:    movl $0, %edi
+; CHECK-NEXT:    movb %ch, %dl
+; CHECK-NEXT:    je .LBB0_20
+; CHECK-NEXT:  # %bb.7: # %af
 ; CHECK-NEXT:    # in Loop: Header=BB0_1 Depth=1
 ; CHECK-NEXT:    testb %bl, %bl
-; CHECK-NEXT:    jne .LBB0_12
-; CHECK-NEXT:  .LBB0_19: # %if.end39
+; CHECK-NEXT:    jne .LBB0_8
+; CHECK-NEXT:  .LBB0_17: # %if.end39
 ; CHECK-NEXT:    # in Loop: Header=BB0_1 Depth=1
 ; CHECK-NEXT:    testl %eax, %eax
-; CHECK-NEXT:    je .LBB0_21
-; CHECK-NEXT:  # %bb.20: # %if.then41
+; CHECK-NEXT:    je .LBB0_19
+; CHECK-NEXT:  # %bb.18: # %if.then41
 ; CHECK-NEXT:    # in Loop: Header=BB0_1 Depth=1
 ; CHECK-NEXT:    movl $0, {{[0-9]+}}(%esp)
 ; CHECK-NEXT:    movl $fn, {{[0-9]+}}(%esp)
 ; CHECK-NEXT:    movl $.str, (%esp)
 ; CHECK-NEXT:    calll printf
-; CHECK-NEXT:  .LBB0_21: # %for.end46
+; CHECK-NEXT:  .LBB0_19: # %for.end46
 ; CHECK-NEXT:    # in Loop: Header=BB0_1 Depth=1
-; CHECK-NEXT:    # implicit-def: $ch
-; CHECK-NEXT:    # implicit-def: $cl
-; CHECK-NEXT:    # implicit-def: $ebp
-; CHECK-NEXT:    jmp .LBB0_22
-; CHECK-NEXT:    .p2align 4, 0x90
-; CHECK-NEXT:  .LBB0_8: # %if.end21
-; CHECK-NEXT:    # in Loop: Header=BB0_1 Depth=1
-; CHECK-NEXT:    # implicit-def: $ebp
-; CHECK-NEXT:    testb %bl, %bl
-; CHECK-NEXT:    je .LBB0_13
-; CHECK-NEXT:    .p2align 4, 0x90
-; CHECK-NEXT:  .LBB0_10: # in Loop: Header=BB0_1 Depth=1
-; CHECK-NEXT:    # implicit-def: $eax
-; CHECK-NEXT:    testb %bl, %bl
-; CHECK-NEXT:    je .LBB0_19
-; CHECK-NEXT:  .LBB0_12: # in Loop: Header=BB0_1 Depth=1
-; CHECK-NEXT:    # implicit-def: $edi
-; CHECK-NEXT:    # implicit-def: $ch
+; CHECK-NEXT:    movl %esi, %edi
 ; CHECK-NEXT:    # implicit-def: $dl
+; CHECK-NEXT:    # implici...
[truncated]

@hgreving2304
Copy link
Member Author

Please assign reviewers as needed

@hgreving2304 hgreving2304 merged commit 71a8d2e into llvm:main Oct 9, 2023
@hgreving2304
Copy link
Member Author

Sorry I accidentally merged this PR, trying to undo

hgreving2304 pushed a commit to hgreving2304/llvm-project that referenced this pull request Oct 9, 2023
hgreving2304 pushed a commit that referenced this pull request Oct 9, 2023
@nikic
Copy link
Contributor

nikic commented Oct 9, 2023

Due to the merge we have some compile-time numbers: http://llvm-compile-time-tracker.com/compare.php?from=7b3bbd83c0c24087072ec5b22a76799ab31f87d5&to=71a8d2e3064fcb3ff76565e6e8529613f90aa51b&stat=instructions%3Au

This causes significant overhead in some cases, e.g. libclamav_autoit.c goes up by 3.5% and libclamav_htmlnorm.c goes up by 6%. Please mitigate compile-time regressions before resubmitting this.

@hgreving2304
Copy link
Member Author

hgreving2304 commented Oct 9, 2023 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants