[rv_dm] rv_dm_access_after_wakeup FPGA failures #22823

Open
jwnrt opened this issue Apr 25, 2024 · 18 comments

@jwnrt
Contributor

jwnrt commented Apr 25, 2024

Description

This test (which runs in rma, dev, and test_unlocked1) has been failing on FPGAs since commit b2239fc.

That commit is almost certainly not the cause of the RV DM error, but the size change seems to have triggered a change in the FPGA routing and broken something.

The error comes from OpenOCD failing to connect to the debug module after the chip wakes from deep sleep.

Here are the parts of the test where the failure triggers:

Here's what OpenOCD says:

CMSIS-DAP: JTAG supported
CMSIS-DAP: FW Version = 2.1.1
CMSIS-DAP: Serial# = 204437845853
CMSIS-DAP: Interface Initialised (JTAG)
SWCLK/TCK = 0 SWDIO/TMS = 0 TDI = 0 TDO = 0 nTRST = 0 nRESET = 0
CMSIS-DAP: Interface ready
clock speed 1000 kHz
cmsis-dap JTAG TLR_RESET
cmsis-dap JTAG TLR_RESET
JTAG scan chain interrogation failed: all ones
Check JTAG interface, timings, target power, etc.
Trying to use configured scan chain anyway...
riscv.tap: IR capture error; saw 0x1f not 0x01
cmsis-dap JTAG TLR_RESET
Bypassing JTAG setup events due to errors
Unsupported DTM version: 15
target riscv.tap.0 examination failed
gdb port disabled
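
The three complaints in the log above ("all ones", "saw 0x1f not 0x01", and "Unsupported DTM version: 15") are all consistent with TDO reading as a constant 1, which is what a host sees when no TAP is driving the line. The following is a minimal sketch of that arithmetic, not OpenOCD's actual implementation. It assumes a 5-bit instruction register (inferred from the `0x1f` in the log) and the RISC-V debug specification's `dtmcs` layout, where bits [3:0] hold the DTM version; the helper names are hypothetical.

```python
# Sketch: what a JTAG host decodes when TDO is stuck high (every captured bit is 1).
# Assumptions: a 5-bit instruction register (inferred from the 0x1f in the log) and
# the RISC-V debug spec's dtmcs layout, where bits [3:0] hold the DTM version.

IR_LEN = 5        # assumed IR width
DTMCS_WIDTH = 32  # dtmcs is a 32-bit register


def capture_all_ones(width: int) -> int:
    """Value shifted out of a `width`-bit scan when TDO never goes low."""
    return (1 << width) - 1


def dtmcs_version(dtmcs: int) -> int:
    """Extract the version field (bits [3:0]) from a dtmcs value."""
    return dtmcs & 0xF


ir_capture = capture_all_ones(IR_LEN)
dtmcs = capture_all_ones(DTMCS_WIDTH)

print(f"IR capture: {ir_capture:#x}")          # 0x1f, matching the IR capture error
print(f"DTM version: {dtmcs_version(dtmcs)}")  # 15, matching the DTM version error
```

Under these assumptions, the errors point at the TAP not responding at all rather than at corrupted data, which fits a problem in the JTAG selection or enablement path after wakeup.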

@a-will reports that this issue does not appear in FPGA bitstreams built with Vivado version 2023, but it does with version 2021, which our CI uses. The lifecycle controller TAP is also not working.

@andreaskurth
Contributor

Thx for reporting this issue @jwnrt. Adding to M4 to ensure we resolve this in time.

> This test (which runs in rma, dev, and test_unlocked1) has been failing on FPGAs since commit b2239fc.

Do you know if the parent commit (d77a3a3) is known good, i.e., the test passes in all LC states for that parent commit?

@a-will
Contributor

a-will commented Apr 25, 2024

> Thx for reporting this issue @jwnrt. Adding to M4 to ensure we resolve this in time.
>
> > This test (which runs in rma, dev, and test_unlocked1) has been failing on FPGAs since commit b2239fc.
>
> Do you know if the parent commit (d77a3a3) is known good, i.e., the test passes in all LC states for that parent commit?

It does.

The commit where failures first appear seems to have merely triggered a latent bug, likely either in Vivado's synthesis / layout tools or in the timing of the JTAG enablement pathways.

@andreaskurth
Contributor

Ok, I'll take a closer look at the JTAG enablement pathways.

@andreaskurth andreaskurth self-assigned this Apr 25, 2024
@jwnrt
Contributor Author

jwnrt commented Apr 26, 2024

The test has started passing again with some recent RTL changes today, but it doesn't look like it was intentionally fixed. This could mean the issue still exists but is masked by a different routing on FPGAs?

@a-will
Contributor

a-will commented Apr 26, 2024

> The test has started passing again with some recent RTL changes today, but it doesn't look like it was intentionally fixed. This could mean the issue still exists but is masked by a different routing on FPGAs?

Yes, that's right. We don't know if it is just a tool bug or some timing problem, though.

moidx added a commit to moidx/opentitan that referenced this issue Apr 29, 2024
Ongoing investigation in
lowRISC#22823.

Signed-off-by: Miguel Osorio <[email protected]>
moidx added a commit that referenced this issue Apr 29, 2024
Ongoing investigation in
#22823.

Signed-off-by: Miguel Osorio <[email protected]>
@timothytrippel
Contributor

FYI: the following tests need to be re-activated in CI once this is addressed: bd3e4ed

@johngt johngt added the Component:DV DV issue: testbench, test case, etc. label May 16, 2024
@vogelpi
Contributor

vogelpi commented Jun 4, 2024

@andreaskurth has been able to reproduce this but couldn't root-cause it. It could be a problem on ASIC, but so far there is no indication of that; DV is fine. Prioritizing other P0 and P1 issues.

@a-will if timing related, it could be that stuff is handled better on ASIC because the SDCs are not the same.

@moidx do have test coverage in GLS.

We discussed leaving the priority as is, but prioritizing other P0s and P1s first.

@vogelpi
Contributor

vogelpi commented Jun 6, 2024

@moidx it would be nice to capture the findings so that someone else can pick up the work if someone becomes available. @andreaskurth, would you be able to document the steps taken, please?

@moidx
Contributor

moidx commented Jun 6, 2024

This may or may not be relevant:

--build-seed 104714960319679935410420483500971829136303708457300037460974663680452494898918

GitHub Revision: b29ffbb03c

VCS

UVM_FATAL @ * us: (chip_sw_rv_dm_access_after_wakeup_vseq.sv:56) [chip_sw_rv_dm_access_after_wakeup_vseq] Timed out waiting for device to enter normal sleep. has 3 failures:

Test chip_sw_rv_dm_access_after_wakeup has 3 failures.
0.chip_sw_rv_dm_access_after_wakeup.77787982882959533724642802343103680401343926437350432772420472162649361881555
Line 802, in log /container/opentitan-public/scratch/os_regression/chip_earlgrey_asic-sim-vcs/0.chip_sw_rv_dm_access_after_wakeup/latest/run.log

  UVM_FATAL @ 4575.453826 us: (chip_sw_rv_dm_access_after_wakeup_vseq.sv:56) [uvm_test_top.env.virtual_sequencer.chip_sw_rv_dm_access_after_wakeup_vseq] Timed out waiting for device to enter normal sleep.
  UVM_INFO @ 4575.453826 us: (uvm_report_catcher.svh:705) [UVM/REPORT/CATCHER]
  --- UVM Report catcher Summary ---
  
  
1.chip_sw_rv_dm_access_after_wakeup.42045925832267773038863112318651299469133308811198817911363044455600557074244
Line 780, in log /container/opentitan-public/scratch/os_regression/chip_earlgrey_asic-sim-vcs/1.chip_sw_rv_dm_access_after_wakeup/latest/run.log

  UVM_FATAL @ 3673.546356 us: (chip_sw_rv_dm_access_after_wakeup_vseq.sv:56) [uvm_test_top.env.virtual_sequencer.chip_sw_rv_dm_access_after_wakeup_vseq] Timed out waiting for device to enter normal sleep.
  UVM_INFO @ 3673.546356 us: (uvm_report_catcher.svh:705) [UVM/REPORT/CATCHER]
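
When comparing several failing seeds like the ones above, it can help to group the fatal messages by source location. This is a minimal sketch, assuming log lines follow the `UVM_FATAL @ <time> us: (<file>:<line>) [<scope>] <message>` shape shown in this report; the function name `parse_uvm_fatals` is hypothetical and not part of any OpenTitan tooling.

```python
import re
from collections import Counter

# Matches lines shaped like:
#   UVM_FATAL @ 4575.453826 us: (file.sv:56) [scope] message
# The summary line "UVM_FATAL @ * us: ..." is deliberately not matched,
# because it carries no numeric timestamp.
UVM_FATAL_RE = re.compile(
    r"UVM_FATAL @ (?P<time_us>[\d.]+) us: "
    r"\((?P<file>[^:()]+):(?P<line>\d+)\) "
    r"\[(?P<scope>[^\]]+)\] (?P<msg>.*)"
)


def parse_uvm_fatals(log_text: str) -> list[dict]:
    """Return one dict per UVM_FATAL line found in `log_text`."""
    fatals = []
    for match in UVM_FATAL_RE.finditer(log_text):
        entry = match.groupdict()
        entry["time_us"] = float(entry["time_us"])
        entry["line"] = int(entry["line"])
        fatals.append(entry)
    return fatals


# Sample lines copied from the two run logs quoted in this report.
sample = """\
  UVM_FATAL @ 4575.453826 us: (chip_sw_rv_dm_access_after_wakeup_vseq.sv:56) [uvm_test_top.env.virtual_sequencer.chip_sw_rv_dm_access_after_wakeup_vseq] Timed out waiting for device to enter normal sleep.
  UVM_INFO @ 4575.453826 us: (uvm_report_catcher.svh:705) [UVM/REPORT/CATCHER]
  UVM_FATAL @ 3673.546356 us: (chip_sw_rv_dm_access_after_wakeup_vseq.sv:56) [uvm_test_top.env.virtual_sequencer.chip_sw_rv_dm_access_after_wakeup_vseq] Timed out waiting for device to enter normal sleep.
"""

fatals = parse_uvm_fatals(sample)

# Group by (file, line, message) so identical failure sites collapse together.
by_site = Counter((f["file"], f["line"], f["msg"]) for f in fatals)
for (file, line, msg), count in by_site.items():
    print(f"{count}x {file}:{line}: {msg}")
```

Grouping by file, line, and message rather than by timestamp keeps runs with different seeds in the same bucket, since only the failure time varies between them here.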

@moidx
Contributor

moidx commented Jun 6, 2024

Moving to P2 as most critical use cases for rv_dm don't involve power transitions.

@timothytrippel
Contributor

They do involve software initiated resets and PORs, but no sleep / wake functionality. Can we remove the broken tags here and here then (to get these running in presubmit again)?

@andreaskurth andreaskurth added the Triage: deprioritize? temporary label for triage; issue could be deprioritized label Jul 4, 2024
@andreaskurth
Contributor

andreaskurth commented Jul 4, 2024

Just discussed in triage, keeping this in M5 for now since there are related DV tests that are failing

@andreaskurth andreaskurth removed the Triage: deprioritize? temporary label for triage; issue could be deprioritized label Jul 4, 2024
@andreaskurth
Contributor

The DV runs of rv_dm_access_after_wakeup should get fixed by PR #23924.

@andreaskurth
Contributor

Just discussed in triage: If we cannot close this by the end of next week due to resourcing constraints, we'll take it with us to M6.

@timothytrippel
Contributor

Just curious: @andreaskurth is this reproducible in DV? or only on FPGA?

@andreaskurth
Contributor

Only on FPGA at this point

@andreaskurth
Contributor

Moving to M6 to continue analysis through CDC tools

@vogelpi
Contributor

vogelpi commented Jul 26, 2024

Discussed in the triage meeting to move this to M7 as P1. In the remaining time of M6, we want to focus on the analysis through CDC.

@moidx Could also test that in GLS. But may opt to not fix it if it fails given the timeline.

8 participants