Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RISCV] MachineScheduler.cpp:1368: updatePressureDiffs assertion failure #68730

Open
preames opened this issue Oct 10, 2023 · 10 comments
Open

[RISCV] MachineScheduler.cpp:1368: updatePressureDiffs assertion failure #68730

preames opened this issue Oct 10, 2023 · 10 comments
Labels
backend:RISC-V crash Prefer [crash-on-valid] or [crash-on-invalid] llvm:codegen LTO Link time optimization (regular/full LTO or ThinLTO)

Comments

@preames
Copy link
Collaborator

preames commented Oct 10, 2023

Reduce test case for a crash observed in an LTO build of spec2017 namd. This is a newish failure, but not new. I waited a bit before bothering to reduce this under the hopes that the triggering change would be reverted.

@fpetrogalli I'm guessing this is related to your large rewrite, but haven't yet confirmed that.

$ llc < hand-reduced.ll 
	.text
	.attribute	4, 16
	.attribute	5, "rv64i2p1"
	.file	"<stdin>"
llc: ../llvm-project/llvm/lib/CodeGen/MachineScheduler.cpp:1368: void llvm::ScheduleDAGMILive::updatePressureDiffs(llvm::ArrayRef<llvm::RegisterMaskPair>): Assertion `VNI && "No live value at use."' failed.
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0.	Program arguments: /home/preames/llvm-dev/llvm-project/llvm/llc
1.	Running pass 'Function Pass Manager' on module '<stdin>'.
2.	Running pass 'Machine Instruction Scheduler' on function '@test'
Stack dump without symbol names (ensure you have llvm-symbolizer in your PATH or set the environment var `LLVM_SYMBOLIZER_PATH` to point to it):
0  llc       0x0000557d99f08b70 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) + 240
1  llc       0x0000557d99f05f7f llvm::sys::RunSignalHandlers() + 47
2  llc       0x0000557d99f060d5
3  libc.so.6 0x00007fb8ece42520
4  libc.so.6 0x00007fb8ece969fc pthread_kill + 300
5  libc.so.6 0x00007fb8ece42476 raise + 22
6  libc.so.6 0x00007fb8ece287f3 abort + 211
7  libc.so.6 0x00007fb8ece2871b
8  libc.so.6 0x00007fb8ece39e96
9  llc       0x0000557d992220a6 llvm::ScheduleDAGMILive::updatePressureDiffs(llvm::ArrayRef<llvm::RegisterMaskPair>) + 2038
10 llc       0x0000557d9922e311 llvm::ScheduleDAGMILive::initRegPressure() + 465
11 llc       0x0000557d99233294 llvm::ScheduleDAGMILive::schedule() + 68
12 llc       0x0000557d9921bf3d
13 llc       0x0000557d9922d322
14 llc       0x0000557d991814d7
15 llc       0x0000557d9972ee3e llvm::FPPassManager::runOnFunction(llvm::Function&) + 878
16 llc       0x0000557d9972f089 llvm::FPPassManager::runOnModule(llvm::Module&) + 57
17 llc       0x0000557d9972f9c5 llvm::legacy::PassManagerImpl::run(llvm::Module&) + 1061
18 llc       0x0000557d970dea1e
19 llc       0x0000557d97028006 main + 1878
20 libc.so.6 0x00007fb8ece29d90
21 libc.so.6 0x00007fb8ece29e40 __libc_start_main + 128
22 llc       0x0000557d970d5605 _start + 37
Aborted (core dumped)

$ cat hand-reduced.ll 
target datalayout = "e-m:e-p:64:64-i64:64-i128:128-n32:64-S128"
target triple = "riscv64-unknown-linux-gnu"

define void @test(i1 %exitcond6351.not) #0 {
entry:
  br label %for.body2902

for.body2987.preheader:                           ; preds = %for.body2902
  %0 = shufflevector <2 x double> %3, <2 x double> zeroinitializer, <8 x i32> <i32 0, i32 poison, i32 poison, i32 1, i32 poison, i32 poison, i32 poison, i32 poison>
  %1 = insertelement <8 x double> %0, double 0.000000e+00, i64 0
  store <8 x double> %0, ptr null, align 8
  br label %delete.end

for.body2902:                                     ; preds = %for.body2902, %entry
  %2 = phi <2 x double> [ %3, %for.body2902 ], [ zeroinitializer, %entry ]
  %3 = fadd <2 x double> zeroinitializer, %2
  br i1 %exitcond6351.not, label %for.body2987.preheader, label %for.body2902

delete.end:                                       ; preds = %for.body2987.preheader
  ret void
}

attributes #0 = { "target-features"="+64bit,+a,+c,+d,+f,+m,+v,,+zvl128b,+zvl256b" }
@llvmbot
Copy link
Member

llvmbot commented Oct 10, 2023

@llvm/issue-subscribers-backend-risc-v

Author: Philip Reames (preames)

Reduce test case for a crash observed in an LTO build of spec2017 namd. This is a newish failure, but not new. I waited a bit before bothering to reduce this under the hopes that the triggering change would be reverted.

@fpetrogalli I'm guessing this is related to your large rewrite, but haven't yet confirmed that.

$ llc &lt; hand-reduced.ll 
	.text
	.attribute	4, 16
	.attribute	5, "rv64i2p1"
	.file	"&lt;stdin&gt;"
llc: ../llvm-project/llvm/lib/CodeGen/MachineScheduler.cpp:1368: void llvm::ScheduleDAGMILive::updatePressureDiffs(llvm::ArrayRef&lt;llvm::RegisterMaskPair&gt;): Assertion `VNI &amp;&amp; "No live value at use."' failed.
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0.	Program arguments: /home/preames/llvm-dev/llvm-project/llvm/llc
1.	Running pass 'Function Pass Manager' on module '&lt;stdin&gt;'.
2.	Running pass 'Machine Instruction Scheduler' on function '@<!-- -->test'
Stack dump without symbol names (ensure you have llvm-symbolizer in your PATH or set the environment var `LLVM_SYMBOLIZER_PATH` to point to it):
0  llc       0x0000557d99f08b70 llvm::sys::PrintStackTrace(llvm::raw_ostream&amp;, int) + 240
1  llc       0x0000557d99f05f7f llvm::sys::RunSignalHandlers() + 47
2  llc       0x0000557d99f060d5
3  libc.so.6 0x00007fb8ece42520
4  libc.so.6 0x00007fb8ece969fc pthread_kill + 300
5  libc.so.6 0x00007fb8ece42476 raise + 22
6  libc.so.6 0x00007fb8ece287f3 abort + 211
7  libc.so.6 0x00007fb8ece2871b
8  libc.so.6 0x00007fb8ece39e96
9  llc       0x0000557d992220a6 llvm::ScheduleDAGMILive::updatePressureDiffs(llvm::ArrayRef&lt;llvm::RegisterMaskPair&gt;) + 2038
10 llc       0x0000557d9922e311 llvm::ScheduleDAGMILive::initRegPressure() + 465
11 llc       0x0000557d99233294 llvm::ScheduleDAGMILive::schedule() + 68
12 llc       0x0000557d9921bf3d
13 llc       0x0000557d9922d322
14 llc       0x0000557d991814d7
15 llc       0x0000557d9972ee3e llvm::FPPassManager::runOnFunction(llvm::Function&amp;) + 878
16 llc       0x0000557d9972f089 llvm::FPPassManager::runOnModule(llvm::Module&amp;) + 57
17 llc       0x0000557d9972f9c5 llvm::legacy::PassManagerImpl::run(llvm::Module&amp;) + 1061
18 llc       0x0000557d970dea1e
19 llc       0x0000557d97028006 main + 1878
20 libc.so.6 0x00007fb8ece29d90
21 libc.so.6 0x00007fb8ece29e40 __libc_start_main + 128
22 llc       0x0000557d970d5605 _start + 37
Aborted (core dumped)

$ cat hand-reduced.ll 
target datalayout = "e-m:e-p:64:64-i64:64-i128:128-n32:64-S128"
target triple = "riscv64-unknown-linux-gnu"

define void @<!-- -->test(i1 %exitcond6351.not) #<!-- -->0 {
entry:
  br label %for.body2902

for.body2987.preheader:                           ; preds = %for.body2902
  %0 = shufflevector &lt;2 x double&gt; %3, &lt;2 x double&gt; zeroinitializer, &lt;8 x i32&gt; &lt;i32 0, i32 poison, i32 poison, i32 1, i32 poison, i32 poison, i32 poison, i32 poison&gt;
  %1 = insertelement &lt;8 x double&gt; %0, double 0.000000e+00, i64 0
  store &lt;8 x double&gt; %0, ptr null, align 8
  br label %delete.end

for.body2902:                                     ; preds = %for.body2902, %entry
  %2 = phi &lt;2 x double&gt; [ %3, %for.body2902 ], [ zeroinitializer, %entry ]
  %3 = fadd &lt;2 x double&gt; zeroinitializer, %2
  br i1 %exitcond6351.not, label %for.body2987.preheader, label %for.body2902

delete.end:                                       ; preds = %for.body2987.preheader
  ret void
}

attributes #<!-- -->0 = { "target-features"="+64bit,+a,+c,+d,+f,+m,+v,,+zvl128b,+zvl256b" }

@preames
Copy link
Collaborator Author

preames commented Oct 10, 2023

Note that this problem does not appear to be in 17.x. The following commit does not crash:
commit b2417f5 (HEAD -> release/17.x, tag: llvmorg-17.0.2, upstream/release/17.x)

@preames
Copy link
Collaborator Author

preames commented Oct 10, 2023

git bisect led to the following result:

b5ff71e261b637ab7088fb5c3314bf71d6e01da7 is the first bad commit
commit b5ff71e261b637ab7088fb5c3314bf71d6e01da7
Author: Luke Lau <[email protected]>
Date:   Thu Sep 21 13:55:49 2023 +0100

    [RISCV] Shrink vslideup's LMUL when lowering fixed insert_subvector  (#65997)
    
    Similar to #65598, if we're using a vslideup to insert a fixed length
    vector into another vector, then we can work out the minimum number of
    registers it will need to slide up across given the minimum VLEN, and
    shrink the type operated on to reduce LMUL accordingly.
    
    This is somewhat dependent on #66211 , since it introduces a subregister
    copy that triggers a crash with -early-live-intervals in one of the
    tests.
    
    Stacked upon #66211

 llvm/lib/Target/RISCV/RISCVISelLowering.cpp        |  18 ++
 .../RISCV/rvv/fixed-vectors-insert-subvector.ll    |  45 ++-
 .../rvv/fixed-vectors-strided-load-combine.ll      |  80 ++---
 llvm/test/CodeGen/RISCV/rvv/fpclamptosat_vec.ll    | 330 ++++++++++-----------
 4 files changed, 229 insertions(+), 244 deletions(-)

@luke957 FYI.

This was a surprise to me. I don't see anything immediately obvious in this change, but am going to take a closer look. This may simple be exposing another issue.

@EugeneZelenko EugeneZelenko added crash Prefer [crash-on-valid] or [crash-on-invalid] LTO Link time optimization (regular/full LTO or ThinLTO) and removed new issue labels Oct 10, 2023
preames added a commit that referenced this issue Oct 10, 2023
…vector (#65997)"

This reverts commit b5ff71e.  As described in
#68730, this appears to have exposed
an existing liveness issue.  Revert to green until we can figure out how to
address the root cause.

Note: This was not a clean revert.  I ended up doing it by hand.
@preames
Copy link
Collaborator Author

preames commented Oct 10, 2023

I reverted 3a6cc52 to get the tree back into a green state through the dev conference. Note that I don't actually believe there's anything wrong with that change at the moment. I think it's just exposing an underlying issue.

@preames
Copy link
Collaborator Author

preames commented Oct 10, 2023

@jayfoad You had the last fix for a liveness issue exposed by this patch (#66211). Any chance you see what might be going wrong here?

@jayfoad
Copy link
Contributor

jayfoad commented Oct 11, 2023

@jayfoad You had the last fix for a liveness issue exposed by this patch (#66211). Any chance you see what might be going wrong here?

No not really. I notice that at the point of failure LIS->verifyAnalysis() and MF.verify(LIS, LIS->getSlotIndexes(), nullptr, true) don't report any problems, so I guess the problem may be in ScheduleDAGMILive itself.

@fpetrogalli
Copy link
Member

fpetrogalli commented Oct 11, 2023

One extra data point for the investigation: removing the following line from the reduced test case prevents the crash:

%1 = insertelement <8 x double> %0, double 0.000000e+00, i64 0

You need of course to rename %2 and %3 to something else, for example %x and %y

@fpetrogalli
Copy link
Member

For the record, here is the non-crashing test (uncomment the %1, and it will crash):

target datalayout = "e-m:e-p:64:64-i64:64-i128:128-n32:64-S128"
target triple = "riscv64-unknown-linux-gnu"


define void @test2(i1 %exitcond6351.not) #0 {
entry:
  br label %for.body2902

for.body2987.preheader:                           ; preds = %for.body2902
  %0 = shufflevector <2 x double> %y, <2 x double> zeroinitializer, <8 x i32> <i32 0, i32 poison, i32 poison, i32 1, i32 poison, i32 poison, i32 poison, i32 poison>
;  %1 = insertelement <8 x double> %0, double 0.000000e+00, i64 0
  store <8 x double> %0, ptr null, align 8
  br label %delete.end

for.body2902:                                     ; preds = %for.body2902, %entry
  %x = phi <2 x double> [ %y, %for.body2902 ], [ zeroinitializer, %entry ]
  %y = fadd <2 x double> zeroinitializer, %x
  br i1 %exitcond6351.not, label %for.body2987.preheader, label %for.body2902

delete.end:                                       ; preds = %for.body2987.preheader
  ret void
}

attributes #0 = { "target-features"="+64bit,+a,+c,+d,+f,+m,+v,,+zvl128b,+zvl256b" }

@preames
Copy link
Collaborator Author

preames commented Oct 26, 2023

Spent a bunch of time digging into this yesterday, and today. I have a pretty good feel for what's going on, but not really what the "right" fix is.

Here is the machine IR, and live intervals for this example immediately before machine-scheduler.

# Machine code for function test: NoPHIs, TracksLiveness, TiedOpsRewritten
Function Live Ins: $x10 in %2

bb.0.entry:
  successors: %bb.2(0x80000000); %bb.2(100.00%)
  liveins: $x10
  %2:gpr = COPY $x10
  dead $x0 = PseudoVSETIVLI 2, 216, implicit-def $vl, implicit-def $vtype
  undef %12.sub_vrm1_0:vrm2 = PseudoVMV_V_I_M1 undef %12.sub_vrm1_0:vrm2(tied-def 0), 0, 2, 6, 0, implicit $vl, implicit $vtype
  %8:fpr64 = FMV_D_X $x0
  %11:gpr = ANDI %2:gpr, 1
  PseudoBR %bb.2

bb.1.for.body2987.preheader:
; predecessors: %bb.2
  successors: %bb.3(0x80000000); %bb.3(100.00%)

  dead $x0 = PseudoVSETIVLI 4, 152, implicit-def $vl, implicit-def $vtype
  %16:vr = COPY %12.sub_vrm1_0:vrm2
  early-clobber %16:vr = PseudoVSLIDEUP_VI_M1 %16:vr(tied-def 0), %12.sub_vrm1_0:vrm2, 2, 4, 6, 0, implicit $vl, implicit $vtype
  undef %12.sub_vrm1_0:vrm2 = COPY %16:vr
  dead $x0 = PseudoVSETIVLI 8, 217, implicit-def $vl, implicit-def $vtype
  PseudoVSE64_V_M2 %12:vrm2, $x0, 8, 6, implicit $vl, implicit $vtype :: (store (s512) into `ptr null`, align 8)
  PseudoBR %bb.3

bb.2.for.body2902:
; predecessors: %bb.0, %bb.2
  successors: %bb.1(0x04000000), %bb.2(0x7c000000); %bb.1(3.12%), %bb.2(96.88%)

  undef %12.sub_vrm1_0:vrm2 = nofpexcept PseudoVFADD_VFPR64_M1 undef %12.sub_vrm1_0:vrm2(tied-def 0), %12.sub_vrm1_0:vrm2, %8:fpr64, 7, 2, 6, 1, implicit $frm, implicit $vl, implicit $vtype
  BNE %11:gpr, $x0, %bb.1
  PseudoBR %bb.2

bb.3.delete.end:
; predecessors: %bb.1

  PseudoRET

# End machine code for function test.

********** INTERVALS **********
X10 [0B,16r:0) 0@0B-phi
%2 [16r,112r:0) 0@16r  weight:0.000000e+00
%8 [96r,160B:0)[352B,464B:0) 0@96r  weight:0.000000e+00
%11 [112r,160B:0)[352B,464B:0) 0@112r  weight:0.000000e+00
%12 [48r,160B:2)[160B,272r:1)[272r,320r:3)[352B,384r:0)[384r,464B:1) 0@352B-phi 1@384r 2@48r 3@272r  L0000000000000004 [48r,160B:1)[160B,240r:0)[272r,320r:3)[352B,384r:2)[384r,464B:0) 0@384r 1@48r 2@352B-phi 3@272r  weight:0.000000e+00
%16 [224r,240e:0)[240e,272r:1) 0@224r 1@240e  weight:0.000000e+00
RegMasks:
********** MACHINEINSTRS **********
# Machine code for function test: NoPHIs, TracksLiveness, TiedOpsRewritten
Function Live Ins: $x10 in %2

0B	bb.0.entry:
	  successors: %bb.2(0x80000000); %bb.2(100.00%)
	  liveins: $x10
16B	  %2:gpr = COPY $x10
32B	  dead $x0 = PseudoVSETIVLI 2, 216, implicit-def $vl, implicit-def $vtype
48B	  undef %12.sub_vrm1_0:vrm2 = PseudoVMV_V_I_M1 undef %12.sub_vrm1_0:vrm2(tied-def 0), 0, 2, 6, 0, implicit $vl, implicit $vtype
96B	  %8:fpr64 = FMV_D_X $x0
112B	  %11:gpr = ANDI %2:gpr, 1
144B	  PseudoBR %bb.2

160B	bb.1.for.body2987.preheader:
	; predecessors: %bb.2
	  successors: %bb.3(0x80000000); %bb.3(100.00%)

208B	  dead $x0 = PseudoVSETIVLI 4, 152, implicit-def $vl, implicit-def $vtype
224B	  %16:vr = COPY %12.sub_vrm1_0:vrm2
240B	  early-clobber %16:vr = PseudoVSLIDEUP_VI_M1 %16:vr(tied-def 0), %12.sub_vrm1_0:vrm2, 2, 4, 6, 0, implicit $vl, implicit $vtype
272B	  undef %12.sub_vrm1_0:vrm2 = COPY %16:vr
304B	  dead $x0 = PseudoVSETIVLI 8, 217, implicit-def $vl, implicit-def $vtype
320B	  PseudoVSE64_V_M2 %12:vrm2, $x0, 8, 6, implicit $vl, implicit $vtype :: (store (s512) into `ptr null`, align 8)
336B	  PseudoBR %bb.3

352B	bb.2.for.body2902:
	; predecessors: %bb.0, %bb.2
	  successors: %bb.1(0x04000000), %bb.2(0x7c000000); %bb.1(3.12%), %bb.2(96.88%)

384B	  undef %12.sub_vrm1_0:vrm2 = nofpexcept PseudoVFADD_VFPR64_M1 undef %12.sub_vrm1_0:vrm2(tied-def 0), %12.sub_vrm1_0:vrm2, %8:fpr64, 7, 2, 6, 1, implicit $frm, implicit $vl, implicit $vtype
432B	  BNE %11:gpr, $x0, %bb.1
448B	  PseudoBR %bb.2

464B	bb.3.delete.end:
	; predecessors: %bb.1

480B	  PseudoRET

# End machine code for function test.

There's something surprising in those live intervals. I'm saying surprising here carefully - I don't know if it's correct or not.

There's a segment for %12 from 160B to 272r for sub-reg1. If you look at the definition given on MachineOperand::readsReg, the undef should imply there's no read of the register for the sub-register write. The alternate segment would terminate on 240r.

This doesn't occur with -riscv-enable-subreg-liveness=0. This removes the undef from 272, and thus makes the live intervals non-surprising.

Looking at MachineVerifier::verifyLiveRangeSegment this behavior appears to be allowed.

From there, we end up with a mismatch between RegisterPressure's expectations and the actual LiveInterval.

RegisterPressure determines that %12 is live out of block bb.1. This is very odd as there's no user outside this block. What appears to be happening is that isLiveThroughAt is given the slot index of the current instruction being processed, not the end of the region.

When processing 272, we correctly recognize the undef sub-register write as killing the register. However, when processing 240B, we see a "new" use and decide that %12 must be live out. This happens because we ask not if %12 is live at 336d, but at 272B. This decision is the root of our later troubles.

Later on, MachineScheduler has the following line:

  // For each live out vreg reduce the pressure change associated with other
  // uses of the same vreg below the live-out reaching def.
  updatePressureDiffs(RPTracker.getPressure().LiveOutRegs);

updatePressureDiffs then looks up the live interval for current position of the BotRPTracker - NOT the same as the current position of the RPTracker used in the backwards walk above - which is the end of the block. LIS correctly says that our register isn't live at 336B and we fail consistency asserts.

The question is how to fix this. I see a couple options.

Option 1 - Adjust isLiveThroughAt to use the BottomIdx (i.e. the bottom of the region).

This results in %12 not being consider lived at the end of the block. However, this code is quite old, so I'm a bit distrustful that the seemingly "obviously" wrong check isn't important for something I'm missing.

Option 2 - Bail out in updatePressureDiffs

We could simply skip the entry which isn't live at the end of the region. This is the most localized fix, but pretty questionable.

Option 3 - Declare the "surprising" live interval result wrong.

Argument for this is that the surprise lead to a crash here, there may be other code which is surprised by this result as well. I haven't found the bit of code which formed this range.

p.s. When reading this code, ignore all of the ShouldTrackLaneMasks codepaths. As far as I can tell, that configuration is completely unreachable in tree?

@jayfoad
Copy link
Contributor

jayfoad commented Oct 26, 2023

There's a segment for %12 from 160B to 272r for sub-reg1.

I think you might be misreading the dump here. This is the main range: %12 [48r,160B:2)[160B,272r:1)[272r,320r:3)[352B,384r:0)[384r,464B:1) 0@352B-phi 1@384r 2@48r 3@272r. And it has one subrange: L0000000000000004 [48r,160B:1)[160B,240r:0)[272r,320r:3)[352B,384r:2)[384r,464B:0) 0@384r 1@48r 2@352B-phi 3@272r

If you look at the definition given on MachineOperand::readsReg, the undef should imply there's no read of the register for the sub-register write. The alternate segment would terminate on 240r.

The subrange does indeed have a segment that ends at 240r.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend:RISC-V crash Prefer [crash-on-valid] or [crash-on-invalid] llvm:codegen LTO Link time optimization (regular/full LTO or ThinLTO)
Projects
None yet
Development

No branches or pull requests

5 participants