Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

retained_mem/retention: Allow disabling mutex support #59264

Merged
merged 2 commits into from
Jul 12, 2023

Conversation

nordicjm
Copy link
Collaborator

@nordicjm nordicjm commented Jun 15, 2023

drivers: retained_mem: Allow disabling mutex support

Changes the Kconfig option to allow disabling mutex support, this
is to allow other Kconfig options to disable the feature.

and

retention: Allow disabling mutex support

Changes the Kconfig option to allow disabling mutex support, this
is to allow other Kconfig options to disable the feature. Also
adds an optional dts option which can be used to disable mutex
support for a specific retention area.

Fixes #59254

@nordicjm
Copy link
Collaborator Author

CC @gromero

de-nordic
de-nordic previously approved these changes Jun 21, 2023
Copy link
Collaborator

@gromero gromero left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nordicjm Hi Jamie! Sorry for the delay in reviewing it, I was commuting to Prague for ZDS.

Disabling the mutexes from Kconfig works great now with

imply RETENTION_MUTEX_FORCE_DISABLE
imply RETAINED_MEM_MUTEX_FORCE_DISABLE

Cool!

However the addition of "no-mutex" attribute to the retention DT node has not the same effect as
disabling mutexes for both retention and retained_mem driver, i.e. has not the same effect as setting RETENTION_MUTEX_FORCE_DISABLE=y and RETAINED_MEM_MUTEX_FORCE_DISABLE=y because the retained mem code is not aware of that attribute, hence, in my case, only adding "no-mutex" attribute to the retention DT node is not sufficient. But I'm cool with adding the configs to the Kconfig, so it's more a heads up in case the idea was to use the attribute to disable the mutexes for retention and retained mem driver (both at the same time).

Other than that there is just a minor typo (see inline comment).

Thanks!

access. Should only be disabled whereby retained memory access is
required in an ISR or for special use cases.
Disable use of mutexes which prevent issues with concurrent retention
device access. This option should should only be enabled when
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/should should/should/

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, have updated. I've added no-mutex support to retained memory drivers too but was a bit hesitant as it's not really a "driver" feature but seems like it's worth adding as someone might want multiple areas and requiring mutex support to be disabled for all of them is a bit of a nuisance

@@ -124,6 +132,7 @@ static const struct retained_mem_driver_api zephyr_retained_mem_ram_api = {
zephyr_retained_mem_ram_config_##inst = { \
.address = (uint8_t *)DT_REG_ADDR(DT_PARENT(DT_INST(inst, DT_DRV_COMPAT))), \
.size = DT_REG_SIZE(DT_PARENT(DT_INST(inst, DT_DRV_COMPAT))), \
RETAINED_MEM_MUTEX(inst) \
Copy link
Collaborator

@gromero gromero Jun 26, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nordicjm Which value is set for config->mutex in reatined_mem_zephyr_ram.c if no DT node is defined for retained mem driver, i.e. just the retention DT node is defined? I'm asking that because by reading the code I would assume it would be true however I'm getting it set to false:

#0  zephyr_retained_mem_ram_lock_take (dev=dev@entry=0x800deb8 <__device_dts_ord_154>) at /home/gromero/zephyrproject/zephyr/drivers/retained_mem/retained_mem_zephyr_ram.c:33
#1  0x08007300 in zephyr_retained_mem_ram_read (dev=0x800deb8 <__device_dts_ord_154>, offset=0, buffer=0x20007f58 <z_interrupt_stacks+1928> '\252' <repeats 16 times>, "\f", size=2)
    at /home/gromero/zephyrproject/zephyr/drivers/retained_mem/retained_mem_zephyr_ram.c:83
#2  0x080040a8 in z_impl_retained_mem_read (dev=dev@entry=0x800deb8 <__device_dts_ord_154>, offset=offset@entry=0, buffer=buffer@entry=0x20007f58 <z_interrupt_stacks+1928> '\252' <repeats 16 times>, "\f", size=size@entry=2)
    at /home/gromero/zephyrproject/zephyr/include/zephyr/drivers/retained_mem.h:132
#3  0x080040ea in retained_mem_read (dev=0x800deb8 <__device_dts_ord_154>, offset=0, buffer=buffer@entry=0x20007f58 <z_interrupt_stacks+1928> '\252' <repeats 16 times>, "\f", size=size@entry=2)
    at /home/gromero/zephyrproject/zephyr/build/PR59264/instrumentation/zephyr/include/generated/syscalls/retained_mem.h:59
#4  0x080043c2 in retention_is_valid (dev=0x800ded0 <__device_dts_ord_155>) at /home/gromero/zephyrproject/zephyr/subsys/retention/retention.c:188
#5  0x08001c14 in instr_init () at /home/gromero/zephyrproject/zephyr/subsys/instrumentation/common/instr_common.c:94
#6  0x0800dc90 in __cyg_profile_func_enter (callee=callee@entry=0x8009ac5 <z_early_memcpy>, caller=caller@entry=0x800d451 <z_data_copy+36>) at /home/gromero/zephyrproject/zephyr/subsys/instrumentation/handlers/instr_handlers_gcc.c:25
#7  0x08009adc in z_early_memcpy (dst=0x20000000 <_char_out>, src=0x800e9a8, n=0) at /home/gromero/zephyrproject/zephyr/kernel/init.c:131
#8  0x0800d450 in z_data_copy () at /home/gromero/zephyrproject/zephyr/kernel/xip.c:27
#9  0x08002e42 in z_arm_prep_c () at /home/gromero/zephyrproject/zephyr/arch/arm/core/aarch32/prep_c.c:264
#10 0x08003860 in z_arm_reset () at /home/gromero/zephyrproject/zephyr/arch/arm/core/aarch32/cortex_m/reset.S:169
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
(gdb) n
38		if (config->mutex) {
(gdb) list
33	{
34	#ifdef CONFIG_RETAINED_MEM_MUTEXES
35		struct zephyr_retained_mem_ram_data *data = dev->data;
36		const struct zephyr_retained_mem_ram_config *config = dev->config;
37	
38		if (config->mutex) {
39			k_mutex_lock(&data->lock, K_FOREVER);
40		}
41	#else
42		ARG_UNUSED(dev);
(gdb) p config->mutex
$1 = false
(gdb) 

It means that now "by default" the mutexes in retained mem driver are disabled, i.e. if I don't force any mutex disabled by CONFIG or in the DT node I get mutexes disabled. Is that behavior intentional -- it's different from the current one, where mutexes are enabled by default?

I think that happens because no DT node is defined for retained mem driver? But on the other way it's a bit tricky to make config->mutex in retained mem driver dependent on the no-mutex attribute in retention DT node?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that the DT macro I used does not actually work on bool entries, which is not documented, so have switched to an alternative macro which works.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it. Now the default behavior is ok (true). However I'm still not able to disable the mutexes by only adding attribute no-mutex to the retention DT node. Just to be sure I'm not missing something. Is the idea that the following overlay would disable mutexed for both retention and the retained mem driver?

/*
 * Copyright 2023 Linaro
 *
 * SPDX-License-Identifier: Apache-2.0
 */

#include <common/mem.h>

/ {
        sram@200BF000 {
                compatible = "zephyr,memory-region", "mmio-sram";
                reg = <0x200BF000 0x20>;
                zephyr,memory-region = "RetainedMem";
                status = "okay";

                retainedmem {
                        compatible = "zephyr,retained-ram";
                        status = "okay";
                        #address-cells = <1>;
                        #size-cells = <1>;

                        retention0: retention@0 {
                                compatible = "zephyr,retention";
                                status = "okay";

                                no-mutex;

                                reg = <0x0 0x20>;

                                prefix = [BE EF];
                        };
               };
        };
};

&sram0 {
       reg = <0x20000000 DT_SIZE_K(764)>;
};

or it's missing that attribute in another DT node?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It also needs it in the retained memory part too, so would be listed in the overlay twice

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nordicjm Awesome, all good so! I've just tested and it works nicely. So think it's only missing to address the build issue with bool missing a string (my last review).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just for the records and others, that's an example for disabling the mutexes using DT attribute no-mutex:

/*
 * Copyright 2023 Linaro
 *
 * SPDX-License-Identifier: Apache-2.0
 */

#include <common/mem.h>

/ {
        sram@200BF000 {
                compatible = "zephyr,memory-region", "mmio-sram";
                reg = <0x200BF000 0x20>;
                zephyr,memory-region = "RetainedMem";
                status = "okay";

                retainedmem {
                        compatible = "zephyr,retained-ram";
                        status = "okay";
                        #address-cells = <1>;
                        #size-cells = <1>;

                        no-mutex;

                        retention0: retention@0 {
                                compatible = "zephyr,retention";
                                status = "okay";

                                reg = <0x0 0x20>;

                                no-mutex;

                                prefix = [BE EF];
                        };
               };
        };
};

&sram0 {
       reg = <0x20000000 DT_SIZE_K(764)>;
};

Copy link
Collaborator

@gromero gromero left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another thing: if I set CONFIG_RETAINED_MEM_MUTEXES=n and CONFIG_RETENTION_MUTEXES=n in prj.conf I get the following build errors:

-- west debug: rebuilding
[0/1] Re-running CMake...
Loading Zephyr default modules (Zephyr base (cached)).
-- Application: /home/gromero/zephyrproject/zephyr/samples/subsys/instrumentation
-- CMake version: 3.22.1
-- Cache files will be written to: /home/gromero/.cache/zephyr
-- Zephyr version: 3.4.0-rc3 (/home/gromero/zephyrproject/zephyr)
-- Found west (found suitable version "1.0.0", minimum required is "0.14.0")
-- Board: b_u585i_iot02a
-- Found host-tools: zephyr 0.16.0 (/home/gromero/zephyr-sdk-0.16.0)
-- Found toolchain: zephyr 0.16.0 (/home/gromero/zephyr-sdk-0.16.0)
-- Found BOARD.dts: /home/gromero/zephyrproject/zephyr/boards/arm/b_u585i_iot02a/b_u585i_iot02a.dts
-- Found devicetree overlay: /home/gromero/zephyrproject/zephyr/samples/subsys/instrumentation/boards/b_u585i_iot02a.overlay
-- Generated zephyr.dts: /home/gromero/zephyrproject/zephyr/build/PR59264/instrumentation/zephyr/zephyr.dts
-- Generated devicetree_generated.h: /home/gromero/zephyrproject/zephyr/build/PR59264/instrumentation/zephyr/include/generated/devicetree_generated.h
-- Including generated dts.cmake file: /home/gromero/zephyrproject/zephyr/build/PR59264/instrumentation/zephyr/dts.cmake
/home/gromero/zephyrproject/zephyr/build/PR59264/instrumentation/zephyr/zephyr.dts:732.30-761.5: Warning (spi_bus_bridge): /soc/octospi@420d2400: node name for SPI buses should be 'spi'
<stdout>: Warning (spi_bus_reg): Failed prerequisite 'spi_bus_bridge'
Parsing /home/gromero/zephyrproject/zephyr/Kconfig
Loaded configuration '/home/gromero/zephyrproject/zephyr/boards/arm/b_u585i_iot02a/b_u585i_iot02a_defconfig'
Merged configuration '/home/gromero/zephyrproject/zephyr/samples/subsys/instrumentation/prj.conf'

error: RETAINED_MEM_MUTEXES (defined at drivers/retained_mem/Kconfig:18) is assigned in a
configuration file, but is not directly user-configurable (has no prompt). It gets its value
indirectly from other symbols. See
http://docs.zephyrproject.org/latest/kconfig.html#CONFIG_RETAINED_MEM_MUTEXES and/or look up
RETAINED_MEM_MUTEXES in the menuconfig/guiconfig interface. The Application Development Primer,
Setting Configuration Values, and Kconfig - Tips and Best Practices sections of the manual might be
helpful too.

CMake Error at /home/gromero/zephyrproject/zephyr/cmake/modules/kconfig.cmake:343 (message):
  command failed with return code: 1
Call Stack (most recent call first):
  /home/gromero/zephyrproject/zephyr/cmake/modules/zephyr_default.cmake:115 (include)
  /home/gromero/zephyrproject/zephyr/share/zephyr-package/cmake/ZephyrConfig.cmake:66 (include)
  /home/gromero/zephyrproject/zephyr/share/zephyr-package/cmake/ZephyrConfig.cmake:97 (include_boilerplate)
  CMakeLists.txt:5 (find_package)


-- Configuring incomplete, errors occurred!
See also "/home/gromero/zephyrproject/zephyr/build/PR59264/instrumentation/CMakeFiles/CMakeOutput.log".
See also "/home/gromero/zephyrproject/zephyr/build/PR59264/instrumentation/CMakeFiles/CMakeError.log".
FAILED: build.ninja 
/usr/bin/cmake --regenerate-during-build -S/home/gromero/zephyrproject/zephyr/samples/subsys/instrumentation -B/home/gromero/zephyrproject/zephyr/build/PR59264/instrumentation
ninja: error: rebuilding 'build.ninja': subcommand failed
FATAL ERROR: re-build in ./build/PR59264/instrumentation/ failed

and

-- west debug: rebuilding
[0/1] Re-running CMake...
Loading Zephyr default modules (Zephyr base (cached)).
-- Application: /home/gromero/zephyrproject/zephyr/samples/subsys/instrumentation
-- CMake version: 3.22.1
-- Cache files will be written to: /home/gromero/.cache/zephyr
-- Zephyr version: 3.4.0-rc3 (/home/gromero/zephyrproject/zephyr)
-- Found west (found suitable version "1.0.0", minimum required is "0.14.0")
-- Board: b_u585i_iot02a
-- Found host-tools: zephyr 0.16.0 (/home/gromero/zephyr-sdk-0.16.0)
-- Found toolchain: zephyr 0.16.0 (/home/gromero/zephyr-sdk-0.16.0)
-- Found BOARD.dts: /home/gromero/zephyrproject/zephyr/boards/arm/b_u585i_iot02a/b_u585i_iot02a.dts
-- Found devicetree overlay: /home/gromero/zephyrproject/zephyr/samples/subsys/instrumentation/boards/b_u585i_iot02a.overlay
-- Generated zephyr.dts: /home/gromero/zephyrproject/zephyr/build/PR59264/instrumentation/zephyr/zephyr.dts
-- Generated devicetree_generated.h: /home/gromero/zephyrproject/zephyr/build/PR59264/instrumentation/zephyr/include/generated/devicetree_generated.h
-- Including generated dts.cmake file: /home/gromero/zephyrproject/zephyr/build/PR59264/instrumentation/zephyr/dts.cmake
/home/gromero/zephyrproject/zephyr/build/PR59264/instrumentation/zephyr/zephyr.dts:732.30-761.5: Warning (spi_bus_bridge): /soc/octospi@420d2400: node name for SPI buses should be 'spi'
<stdout>: Warning (spi_bus_reg): Failed prerequisite 'spi_bus_bridge'
Parsing /home/gromero/zephyrproject/zephyr/Kconfig
Loaded configuration '/home/gromero/zephyrproject/zephyr/boards/arm/b_u585i_iot02a/b_u585i_iot02a_defconfig'
Merged configuration '/home/gromero/zephyrproject/zephyr/samples/subsys/instrumentation/prj.conf'

error: RETENTION_MUTEXES (defined at subsys/retention/Kconfig:22) is assigned in a configuration
file, but is not directly user-configurable (has no prompt). It gets its value indirectly from other
symbols. See http://docs.zephyrproject.org/latest/kconfig.html#CONFIG_RETENTION_MUTEXES and/or look
up RETENTION_MUTEXES in the menuconfig/guiconfig interface. The Application Development Primer,
Setting Configuration Values, and Kconfig - Tips and Best Practices sections of the manual might be
helpful too.

CMake Error at /home/gromero/zephyrproject/zephyr/cmake/modules/kconfig.cmake:343 (message):
  command failed with return code: 1
Call Stack (most recent call first):
  /home/gromero/zephyrproject/zephyr/cmake/modules/zephyr_default.cmake:115 (include)
  /home/gromero/zephyrproject/zephyr/share/zephyr-package/cmake/ZephyrConfig.cmake:66 (include)
  /home/gromero/zephyrproject/zephyr/share/zephyr-package/cmake/ZephyrConfig.cmake:97 (include_boilerplate)
  CMakeLists.txt:5 (find_package)


-- Configuring incomplete, errors occurred!
See also "/home/gromero/zephyrproject/zephyr/build/PR59264/instrumentation/CMakeFiles/CMakeOutput.log".
See also "/home/gromero/zephyrproject/zephyr/build/PR59264/instrumentation/CMakeFiles/CMakeError.log".
FAILED: build.ninja 
/usr/bin/cmake --regenerate-during-build -S/home/gromero/zephyrproject/zephyr/samples/subsys/instrumentation -B/home/gromero/zephyrproject/zephyr/build/PR59264/instrumentation
ninja: error: rebuilding 'build.ninja': subcommand failed
FATAL ERROR: re-build in ./build/PR59264/instrumentation/ failed

which I fixed with:

diff --git a/drivers/retained_mem/Kconfig b/drivers/retained_mem/Kconfig
index 4ea7dcd894..fb1d7bca1d 100644
--- a/drivers/retained_mem/Kconfig
+++ b/drivers/retained_mem/Kconfig
@@ -16,7 +16,7 @@ config RETAINED_MEM_INIT_PRIORITY
          Retained memory devices initialization priority,
 
 config RETAINED_MEM_MUTEXES
-       bool
+       bool "Use mutexes"
        default y
        depends on MULTITHREADING
        depends on !RETAINED_MEM_MUTEX_FORCE_DISABLE
diff --git a/subsys/retention/Kconfig b/subsys/retention/Kconfig
index dbca084c5a..ccbed2c4e2 100644
--- a/subsys/retention/Kconfig
+++ b/subsys/retention/Kconfig
@@ -20,7 +20,7 @@ config RETENTION_INIT_PRIORITY
          priorities for retained memory drivers.
 
 config RETENTION_MUTEXES
-       bool
+       bool "Use mutexes"
        default y
        depends on MULTITHREADING
        depends on !RETENTION_MUTEX_FORCE_DISABLE

@nordicjm
Copy link
Collaborator Author

These can no longer be used, the reason being that you cannot unselect something, so instead you would set CONFIG_RETAINED_MEM_MUTEX_FORCE_DISABLE and CONFIG_RETENTION_MUTEX_FORCE_DISABLE to y and it would then disable mutex support

@gromero
Copy link
Collaborator

gromero commented Jun 26, 2023

These can no longer be used, the reason being that you cannot unselect something, so instead you would set CONFIG_RETAINED_MEM_MUTEX_FORCE_DISABLE and CONFIG_RETENTION_MUTEX_FORCE_DISABLE to y and it would then disable mutex support

ah, yeah, just like in Kconfig, I see! 👍

Copy link
Collaborator

@gromero gromero left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Thanks a lot @nordicjm !

@gromero
Copy link
Collaborator

gromero commented Jun 27, 2023

@de-nordic @mzagrabski Do you mind taking another look since there were some changes since your last review? :-) Thanks!

@gromero gromero requested a review from mzagrabski June 30, 2023 10:34
@gromero
Copy link
Collaborator

gromero commented Jun 30, 2023

@de-nordic Sending a friendly ping about it since Github re-requesting option seems to be unavailable for your username. Thanks!

@nordicjm
Copy link
Collaborator Author

nordicjm commented Jul 5, 2023

@bjarki-trackunit would you mind giving this a review please?

@nordicjm
Copy link
Collaborator Author

nordicjm commented Jul 6, 2023

Good idea, have reworked it to support that with an optional Kconfig to enable it. @gromero not tried this since I don't have a readily available sample to trigger it from an interrupt, can you give this a try?

Copy link
Collaborator

@bjarki-andreasen bjarki-andreasen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice :)

Copy link
Collaborator

@gromero gromero left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nordicjm @bjarki-trackunit Hi folks!

My use case is a bit different than taking a coredump in an ISR. It's related to the Instrumentation subsys: #57371

Actually, the main issue is that I need to use retention subsys read/write quite early in the boot process where there isn't a thread context, so _current and owner (in k_mutex Struct) used by k_mutex_lock() aren't even available (are invalid). For instance, with the current change and its defaults (so without forcing mutexes disabled) I get a fault in (stack #7, code line 110):

(gdb) bt
#0  arch_system_halt (reason=25) at /home/gromero/zephyrproject/zephyr/kernel/fatal.c:32
#1  0x0800e6aa in k_sys_fatal_error_handler (reason=25, esf=0x200087a4 <z_main_stack+4020>) at /home/gromero/zephyrproject/zephyr/kernel/fatal.c:46
#2  0x0800e796 in z_fatal_error (reason=25, esf=0x200087a4 <z_main_stack+4020>) at /home/gromero/zephyrproject/zephyr/kernel/fatal.c:131
#3  0x08004632 in z_arm_fatal_error (reason=25, esf=0x200087a4 <z_main_stack+4020>) at /home/gromero/zephyrproject/zephyr/arch/arm/core/aarch32/fatal.c:63
#4  0x080059a6 in z_arm_fault (msp=536905712, psp=536906768, exc_return=4294967228, callee_regs=0x0) at /home/gromero/zephyrproject/zephyr/arch/arm/core/aarch32/cortex_m/fault.c:1132
#5  0x08005a08 in z_arm_usage_fault () at /home/gromero/zephyrproject/zephyr/arch/arm/core/aarch32/cortex_m/fault_s.S:102
#6  <signal handler called>
#7  0x08010b0a in z_impl_k_mutex_lock (mutex=0x20007254 <retention_data_0+4>, timeout=...) at /home/gromero/zephyrproject/zephyr/kernel/mutex.c:110
#8  0x08006762 in k_mutex_lock (mutex=0x20007254 <retention_data_0+4>, timeout=...) at /home/gromero/zephyrproject/zephyr/build/PR59264/zephyr/include/generated/syscalls/kernel.h:955
#9  0x08006a3c in retention_lock_take (dev=0x8016134 <__device_dts_ord_155>) at /home/gromero/zephyrproject/zephyr/subsys/retention/retention.c:55
#10 0x08006cf6 in retention_is_valid (dev=0x8016134 <__device_dts_ord_155>) at /home/gromero/zephyrproject/zephyr/subsys/retention/retention.c:168
#11 0x08003240 in instr_init () at /home/gromero/zephyrproject/zephyr/subsys/instrumentation/common/instr_common.c:106
#12 0x08015ede in __cyg_profile_func_enter (callee=0x800eca5 <z_early_memcpy>, caller=0x8015191 <z_data_copy+44>) at /home/gromero/zephyrproject/zephyr/subsys/instrumentation/handlers/instr_handlers_gcc.c:25
#13 0x0800ecbc in z_early_memcpy (dst=0x20000000 <_char_out>, src=0x8016c68, n=0) at /home/gromero/zephyrproject/zephyr/kernel/init.c:131
#14 0x08015190 in z_data_copy () at /home/gromero/zephyrproject/zephyr/kernel/xip.c:27
#15 0x08004bbe in z_arm_prep_c () at /home/gromero/zephyrproject/zephyr/arch/arm/core/aarch32/prep_c.c:264
#16 0x08005af4 in z_arm_reset () at /home/gromero/zephyrproject/zephyr/arch/arm/core/aarch32/cortex_m/reset.S:169
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
(gdb) list /home/gromero/zephyrproject/zephyr/kernel/mutex.c:110
105		key = k_spin_lock(&lock);
106	
107		if (likely((mutex->lock_count == 0U) || (mutex->owner == _current))) {
108	
109			mutex->owner_orig_prio = (mutex->lock_count == 0U) ?
110						_current->base.prio :
111						mutex->owner_orig_prio;
112	
113			mutex->lock_count++;
114			mutex->owner = _current;
(gdb) # Line which caused the fault:
(gdb) # 110                                             _current->base.prio :
(gdb) 
(gdb) 

It happens when z_early_memcpy (which runs pretty early in the boot process!) is called and the instrumentation code kicks in, which used the retained mem to "remember" some key addresses for tracing and profiling. @bjarki-trackunit At this point there is no harm in accessing data in a mutex / lock-free way as it will not happen in a concurrent way.

Hence this last change doesn't help my use case (testing confirmed that).

However, I'll go with:

imply RETENTION_MUTEX_FORCE_DISABLE
imply RETAINED_MEM_MUTEX_FORCE_DISABLE

in the Instrumentation subsys Kconfig, which works great!

I have no strong opinion on removing no-mutex attribute from DT, so I'm cool with it.

Thus only the CONFIG_ prefixes are missing in the options given to IS_ENABLED() -- see comments inline, otherwise LGTM!

Thanks!

@@ -32,7 +32,9 @@ static inline void nrf_gpregret_lock_take(const struct device *dev)
#ifdef CONFIG_RETAINED_MEM_MUTEXES
struct nrf_gpregret_data *data = dev->data;

k_mutex_lock(&data->lock, K_FOREVER);
if (!IS_ENABLED(RETAINED_MEM_ALLOW_ISR_ACCESS) || !k_is_in_isr()) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/RETAINED_MEM_ALLOW_ISR_ACCESS/CONFIG_RETAINED_MEM_ALLOW_ISR_ACCESS/ ?

help
Disable use of mutexes which prevent issues with concurrent retained
memory access. This option should only be enabled when retained
memory access is required in an ISR or for special use cases.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't the idea that RETAINED_MEM_ALLOW_ISR_ACCESS plus k_is_in_isr() handle the use in an ISR? If so, should it mentions ISR in the text for RETAINED_MEM_MUTEX_FORCE_DISABLE. This option now seems more useful for special cases -- which is my case I guess :-)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've removed the ISR check for now and just kept the force disable in this PR, might re-visit the ISR check in future when I have time to add a test for it and check it works

@@ -43,7 +45,9 @@ static inline void nrf_gpregret_lock_release(const struct device *dev)
#ifdef CONFIG_RETAINED_MEM_MUTEXES
struct nrf_gpregret_data *data = dev->data;

k_mutex_unlock(&data->lock);
if (!IS_ENABLED(RETAINED_MEM_ALLOW_ISR_ACCESS) || !k_is_in_isr()) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

S/RETAINED_MEM_ALLOW_ISR_ACCESS/CONFIG_RETAINED_MEM_ALLOW_ISR_ACCESS/ ?

@@ -32,7 +32,9 @@ static inline void zephyr_retained_mem_ram_lock_take(const struct device *dev)
#ifdef CONFIG_RETAINED_MEM_MUTEXES
struct zephyr_retained_mem_ram_data *data = dev->data;

k_mutex_lock(&data->lock, K_FOREVER);
if (!IS_ENABLED(RETAINED_MEM_ALLOW_ISR_ACCESS) || !k_is_in_isr()) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

S/RETAINED_MEM_ALLOW_ISR_ACCESS/CONFIG_RETAINED_MEM_ALLOW_ISR_ACCESS/?

@@ -43,7 +45,9 @@ static inline void zephyr_retained_mem_ram_lock_release(const struct device *dev
#ifdef CONFIG_RETAINED_MEM_MUTEXES
struct zephyr_retained_mem_ram_data *data = dev->data;

k_mutex_unlock(&data->lock);
if (!IS_ENABLED(RETAINED_MEM_ALLOW_ISR_ACCESS) || !k_is_in_isr()) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

S/RETAINED_MEM_ALLOW_ISR_ACCESS/CONFIG_RETAINED_MEM_ALLOW_ISR_ACCESS/ ?

help
Disable use of mutexes which prevent issues with concurrent retention
device access. This option should only be enabled when retention
access is required in an ISR or for special use cases.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment here as the one for retained mem driver, about mentioning ISR here.

@@ -51,7 +51,9 @@ static inline void retention_lock_take(const struct device *dev)
#ifdef CONFIG_RETENTION_MUTEXES
struct retention_data *data = dev->data;

k_mutex_lock(&data->lock, K_FOREVER);
if (!IS_ENABLED(RETENTION_ALLOW_ISR_ACCESS) || !k_is_in_isr()) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing CONFIG_ prefix as well.

@@ -62,7 +64,9 @@ static inline void retention_lock_release(const struct device *dev)
#ifdef CONFIG_RETENTION_MUTEXES
struct retention_data *data = dev->data;

k_mutex_unlock(&data->lock);
if (!IS_ENABLED(RETENTION_ALLOW_ISR_ACCESS) || !k_is_in_isr()) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing CONFIG_ prefix as well.

@bjarki-andreasen
Copy link
Collaborator

@gromero Has the function k_is_pre_kernel() been considered? A combination of that could maybe be used?

Something like

...

    if (k_is_pre_kernel() == false)
    {
        k_timeout_t timeout = k_is_in_isr() ? K_NO_WAIT : K_FOREVER;
        k_mutex_lock(&mut, timeout);
    }

...

I have not tested this, but it could maybe work? :)

@gromero
Copy link
Collaborator

gromero commented Jul 6, 2023

@gromero Has the function k_is_pre_kernel() been considered? A combination of that could maybe be used?

Something like

...

    if (k_is_pre_kernel() == false)
    {
        k_timeout_t timeout = k_is_in_isr() ? K_NO_WAIT : K_FOREVER;
        k_mutex_lock(&mut, timeout);
    }

...

I have not tested this, but it could maybe work? :)

Hi @bjarki-trackunit ! I like the idea of checking it at runtime, however, by experimenting with k_is_pre_kernel() I now realized that there are other contexts where, for instance, _current is not valid, which also causes a fault. So the code suggested works to correctly skip the first uses of retention subsys in the boot stage, but the code crashes anyway more ahead, in another context where although k_is_pre_kernel() returns false and owner seems even valid, _current is not ... the instrumentation code can run in any of these contexts.

@nordicjm
Copy link
Collaborator Author

nordicjm commented Jul 7, 2023

@andyross is there any "generic" function that can be used to know if the kernel has fully started and mutexes and mutex-required information is available from that point on?

@nordicjm
Copy link
Collaborator Author

nordicjm commented Jul 7, 2023

The only way of testing for this that I can think of, which seems to be a bit of a hack, is having a global bool which is set to true by a SYS_INIT() at APPLICATION level, then all the drivers reference that bool to know if a mutex should be taken (and only release the mutex if the earlier take was successful).

@andyross
Copy link
Contributor

@andyross is there any "generic" function that can be used to know if the kernel has fully started and mutexes and mutex-required information is available from that point on?

Officially anything after the PRE_KERNEL_2 init level has access to the full kernel API. There is no public API available to test for this, but there is a z_sys_post_kernel boolean that's used in a few places internally. We could wrap an API around that easily.

In practice, FWIW, the only thing you're not going to be able to do in that mutex before POST_KERNEL is block; everything else should work. The reason for that is that, essentially, early init is an "interrupt" context -- we aren't a thread yet and there's nothing to suspend. But since blocking during startup is obviously a really bad idea to begin with, this doesn't seem like much of a limitation.

@andyross
Copy link
Contributor

I like the idea of checking it at runtime, however, by experimenting with k_is_pre_kernel() I now realized that there are other contexts where, for instance, _current is not valid

Unless you mean "interrupt context", this sounds like a bug to me. What was the problem you were having?

@gromero
Copy link
Collaborator

gromero commented Jul 10, 2023

I like the idea of checking it at runtime, however, by experimenting with k_is_pre_kernel() I now realized that there are other contexts where, for instance, _current is not valid

Unless you mean "interrupt context", this sounds like a bug to me. What was the problem you were having?

@andyross Hi Andy. There is a new subsystem I'm proposing called Instrumentation, which basically enables compiler instrumentation that adds hooks at the entry and the exit of functions (any function in theory). The code hence can run in any context, pre-kernel, post-kernel, etc. Since it relies on the retention subsystem I'm getting crashes due to the use of mutexes in the retention / retained mem driver in some contexts -- I'd need to debug further to understand which contexts exactly are problematic, but I don't want to go that deep. However, for my use case, I really don't mind if the mutexes are either avoided at runtime or disabled at compile, since RETAINED_MEM_MUTEX_FORCE_DISABLE and RETENTION_MUTEX_FORCE_DISABLE already solve the crashes. The runtime check idea is from @bjarki-trackunit and @nordicjm , but I don't really need that, so I think it's over you to guys, to decide what's the best solution :-)

@nordicjm For my use case, disabling the mutexes at compile time is totally okay, and I believe it falls into one of the 'special cases' you mentioned earlier. I've been testing it now quite a few times and it works nicely ;-)

@andyross
Copy link
Contributor

Yeah, I'm slowly coming up to speed here. There's nothing special about k_mutex really, it shares the same underlying tooling that all the IPC mechanisms do, and of those the only problematic bits are (1) the use of a spinlock, which requires a kernel mode thread (i.e. won't work in userspace -- I really doubt this is what you're having trouble with) and (2) the ability to block, which obviously requires that it be running in a valid thread context and not an interrupt (or early init, which is effectively an interrupt).

The early init case is detectable as mentioned above. You can query if you're in an interrupt with k_is_in_isr(). If both of those are false, you really shouldn't be having any trouble with a k_mutex, and I'd view whatever is going wrong as a likely kernel bug. Though my money is still on the instrumentation getting confused in an ISR.

nordicjm added 2 commits July 12, 2023 07:53
Changes the Kconfig option to allow disabling mutex support, this
is to allow other Kconfig options to disable the feature.

Signed-off-by: Jamie McCrae <[email protected]>
Changes the Kconfig option to allow disabling mutex support, this
is to allow other Kconfig options to disable the feature.

Signed-off-by: Jamie McCrae <[email protected]>
@nordicjm
Copy link
Collaborator Author

OK I've reverted this back to the original disable only Kconfig, @gromero if you are able to create a small demo for Andy Ross showing that there is a bug which can be fixed and then use the is port kernel function and it works, that seems like a great "version 2" for the future 😄

@carlescufi carlescufi merged commit 566fd8c into zephyrproject-rtos:main Jul 12, 2023
@gromero
Copy link
Collaborator

gromero commented Jul 12, 2023

@nordicjm OK! Fair enough. Thank you ;-)

@nordicjm nordicjm deleted the retentionmutex branch August 14, 2023 11:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Development

Successfully merging this pull request may close these issues.

retention/retained memory: Mutex use cannot be de-selected by Kconfig
8 participants