Watchdog Example not working as expected on a Nordic chip #29991

caco3 · 2020-11-13T09:40:12Z

In the Watchdog Example the example shows that it is possible to trigger a callback on a watchdog trigger before the hardwired watchdog action. The callback re-feeds the watchdog and prints a whole line to the log.

At least on a Nordic nrf52 chip, this does not work as shown in the example!

Expected Outcome:
The log shows the line "Handled things..ready to reset" before the device gets reset

Seen Outcome:
Only the first 1 character is shown: A▒*** Booting Zephyr OS build zephyr-v2.4.0 ***
(Note: Character 3 and later are already the next startup)

https://devzone.nordicsemi.com/f/nordic-q-a/38926/wdt-and-error-handler states the following:

And when checking if the watchdog callback function is called, remember that this function is called two cycles of the 32 kHz clock before the reset is done, so it is a very little time to do something.

This leads me to the conclusion, that the Watchdog Example (at least on a Nordic nRF52) does not work as expected!

Suggestions:

Add a note to the example indicating that this does not behave as expected on all devices
IMO the line static bool handled_event; should be changed to static bool handled_event = false; to make sure it does not have a random value!

The text was updated successfully, but these errors were encountered:

caco3 · 2020-11-13T09:41:23Z

If you can confirm my conclusion, I am happy to write a patch for it.

pabigot · 2020-11-17T16:42:20Z

If you can confirm my conclusion, I am happy to write a patch for it.

@caco3 Please do. There are multiple parts of the watchdog API that aren't entirely cross-vendor capable, so documenting it would be good. But:

static bool handled_event; should be changed to static bool handled_event = false; to make sure it does not have a random value!

No, we don't want that. C defines that static variables, like file-scope variables, are default-initialized to all zeros, so the value will be false when first encountered. An explicit assignment, if not optimized out because it has no effect, would waste flash memory for storing the initial value then writing it to RAM the first time the function is invoked.

katsuster · 2020-12-02T17:08:26Z

I faced same problem on SiFive HiFive1 rev.b board.

No one has enough time to callback wdt_callback() because WDT_FLAG_RESET_SOC request
hardwired reset from watchdog. WDT will asserts reset signal immediately when it reached timeout.

I think this sample (and also tests/drivers/watchdog) is a little strange.
If sample want to show callback mechanism from Zephyr WDT framework,
this sample should use WDT_FLAG_NO_RESET instead of WDT_FLAG_RESET_SOC.

So in my opinion, shall we split sample into 2 parts:

part 1: using WDT_FLAG_NORESET with callback
part 2: using WDT_FLAG_RESET_SOC with hardwired reset

I will be appreciated any comments from you. Thanks.

pabigot · 2021-01-26T16:58:07Z

Copied from slack for posterity:

pabigot Topic for 2021-01-26: Watchdog Example not working as expected on a Nordic chip #29991 notes that Nordic allows scheduling a callback on a RESET_SOC watchdog configuration, but in practice this isn't useful as you get 61.2 us for the callback to complete its action before the reset occurs. I suspect other SoCs are also failing to reject this configuration when they should, causing the test to fail. I'd like to refactor the test so callback is tested using NO_RESET, but want to understand whether removing the callback-before-reset feature in Nordic is an acceptable solution. It would be a breaking behavioral change if anybody actually uses this.
jyates I'm not sure where to comment on this, so I'll do it here. I currently use the callback feature to store state before the reboot occurs. Primarily the current civil time so we can restore it on reboot, but also the thread we were in when the watchdog fired. As noted, 60uS isn't much time, so we don't bother trying to call LOG_PANIC , our function is basically 6 variable assignments.
pabigot Fooie. OK, will have to figure out another solution.

So I don't think this change is appropriate and the sample needs to be fixed in a different way.

mnkp · 2021-01-26T21:50:44Z

The current implementation of watchdog sample application is incorrect and violates original design of watchdog API. It is not possible to reliably support scenario when watchdog timeout triggers a callback and SoC reset at the same time. The watchdog API was not designed to be used this way.

The scenario described by jyates where watchdog timeout triggers a callback followed by SoC reset can only be supported in a reliable manner by a multistage watchdog. This requires dedicated hardware support and can be enabled by CONFIG_WDT_MULTISTAGE Kconfig option. In this case the first stage should be configured to trigger a callback, and the second stage to trigger SoC reset.

Unfortunately, none of the existing drivers seems to implement support for multistage watchdog.

I propose to fix the sample application and improve watchdog API specification.

pabigot · 2021-02-03T13:34:40Z

PR #31925 addresses the reported issue and would close this. However, there are a variety of additional problems with that sample, and with several drivers, that mandate further attention early in 2.6, including:

Both sifive and gecko should return -ENOTSUP on the attempt to configure a callback on RESET_SOC since they don't support that configuration, and the documentation says that unsupported configurations should be diagnosed.
I think same70 should support the callback, but I'm not seeing it executed.

Those issues as well as the concerns in #29991 (comment) should be tracked in an enhancement issue for a review of the watchdog API and implementation.

caco3 added the bug The issue is a bug, or the PR is fixing a bug label Nov 13, 2020

carlescufi assigned anangl Nov 17, 2020

carlescufi added area: Watchdog Watchdog platform: nRF Nordic nRFx labels Nov 17, 2020

carlescufi assigned pabigot Nov 17, 2020

nashif added the priority: low Low impact/importance bug label Nov 17, 2020

katsuster mentioned this issue Dec 14, 2020

drivers: watchdog: implement SiFive FE310 watchdog driver #30320

Merged

carlescufi unassigned anangl Feb 3, 2021

pabigot mentioned this issue Feb 3, 2021

samples: driver: watchdog: rework how callback support is handled #31925

Merged

pabigot modified the milestone: v2.5.0 Feb 4, 2021

nashif closed this as completed in #31925 Feb 9, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Watchdog Example not working as expected on a Nordic chip #29991

Watchdog Example not working as expected on a Nordic chip #29991

caco3 commented Nov 13, 2020

caco3 commented Nov 13, 2020

pabigot commented Nov 17, 2020

katsuster commented Dec 2, 2020

pabigot commented Jan 26, 2021

mnkp commented Jan 26, 2021 •

edited

Loading

pabigot commented Feb 3, 2021

Watchdog Example not working as expected on a Nordic chip #29991

Watchdog Example not working as expected on a Nordic chip #29991

Comments

caco3 commented Nov 13, 2020

caco3 commented Nov 13, 2020

pabigot commented Nov 17, 2020

katsuster commented Dec 2, 2020

pabigot commented Jan 26, 2021

mnkp commented Jan 26, 2021 • edited Loading

pabigot commented Feb 3, 2021

mnkp commented Jan 26, 2021 •

edited

Loading