GPIO H-I / gpiochip7-gpiochip8 Doesn't work #2

Closed
RSTurgay opened this issue Mar 31, 2023 · 1 comment

RSTurgay commented Mar 31, 2023

Hi,

I am using the NuMaker-IoT-MA35D16F90-V2 kit and booting the kernel from an SD card, which works without problems. I want to use the Raspberry Pi connector, which exposes several port H and port I pins. Writing to PH0, PH1, PH2, and PH3 has no effect: the pins always stay high. Other pins that I tried (PG4, PG5, PG6) work. I have tested with sysfs, libgpiod, and python-periphery, and the result is the same with all of them.

# gpiodetect
gpiochip0 [gpioa] (16 lines)
gpiochip1 [gpiob] (16 lines)
gpiochip10 [gpiok] (16 lines)
gpiochip11 [gpiol] (16 lines)
gpiochip12 [gpiom] (16 lines)
gpiochip13 [gpion] (16 lines)
gpiochip2 [gpioc] (16 lines)
gpiochip3 [gpiod] (16 lines)
gpiochip4 [gpioe] (16 lines)
gpiochip5 [gpiof] (16 lines)
gpiochip6 [gpiog] (16 lines)
gpiochip7 [gpioh] (16 lines)
gpiochip8 [gpioi] (16 lines)
gpiochip9 [gpioj] (16 lines)

My changes also show up in the gpioinfo output:

# gpioinfo
gpiochip0 - 16 lines:
        line   0:      unnamed       unused   input  active-high
        line   1:      unnamed       unused   input  active-high
        line   2:      unnamed       unused   input  active-high
        line   3:      unnamed       unused   input  active-high
        line   4:      unnamed       unused   input  active-high
        line   5:      unnamed       unused   input  active-high
        line   6:      unnamed       unused   input  active-high
        line   7:      unnamed       unused   input  active-high
        line   8:      unnamed       unused   input  active-high
        line   9:      unnamed       unused   input  active-high
        line  10:      unnamed       unused   input  active-high
        line  11:      unnamed       unused   input  active-high
        line  12:      unnamed       unused   input  active-high
        line  13:      unnamed       unused   input  active-high
        line  14:      unnamed       unused   input  active-high
        line  15:      unnamed       unused   input  active-high
gpiochip1 - 16 lines:
        line   0:      unnamed       unused   input  active-high
        line   1:      unnamed       unused   input  active-high
        line   2:      unnamed       unused   input  active-high
        line   3:      unnamed       unused   input  active-high
        line   4:      unnamed       unused   input  active-high
        line   5:      unnamed       unused   input  active-high
        line   6:      unnamed       unused   input  active-high
        line   7:      unnamed       unused   input  active-high
        line   8:      unnamed       unused  output  active-high
        line   9:      unnamed       unused  output  active-high
        line  10:      unnamed       unused   input  active-high
        line  11:      unnamed       unused   input  active-high
        line  12:      unnamed       unused   input  active-high
        line  13:      unnamed       unused   input  active-high
        line  14:      unnamed       unused   input  active-high
        line  15:      unnamed       unused   input  active-high
gpiochip10 - 16 lines:
        line   0:      unnamed       unused   input  active-high
        line   1:      unnamed       unused   input  active-high
        line   2:      unnamed       unused   input  active-high
        line   3:      unnamed       unused   input  active-high
        line   4:      unnamed       unused   input  active-high
        line   5:      unnamed       unused   input  active-high
        line   6:      unnamed       unused   input  active-high
        line   7:      unnamed       unused   input  active-high
        line   8:      unnamed       unused   input  active-high
        line   9:      unnamed       unused   input  active-high
        line  10:      unnamed       unused   input  active-high
        line  11:      unnamed       unused   input  active-high
        line  12:      unnamed       unused   input  active-high
        line  13:      unnamed       unused   input  active-high
        line  14:      unnamed       unused   input  active-high
        line  15:      unnamed       unused   input  active-high
gpiochip11 - 16 lines:
        line   0:      unnamed       unused   input  active-high
        line   1:      unnamed       unused   input  active-high
        line   2:      unnamed       unused   input  active-high
        line   3:      unnamed       unused   input  active-high
        line   4:      unnamed       unused   input  active-high
        line   5:      unnamed       unused   input  active-high
        line   6:      unnamed       unused   input  active-high
        line   7:      unnamed       unused   input  active-high
        line   8:      unnamed       unused   input  active-high
        line   9:      unnamed       unused   input  active-high
        line  10:      unnamed       unused   input  active-high
        line  11:      unnamed       unused   input  active-high
        line  12:      unnamed       unused   input  active-high
        line  13:      unnamed       unused   input  active-high
        line  14:      unnamed       unused   input  active-high
        line  15:      unnamed       unused   input  active-high
gpiochip12 - 16 lines:
        line   0:      unnamed       unused   input  active-high
        line   1:      unnamed       unused   input  active-high
        line   2:      unnamed       unused   input  active-high
        line   3:      unnamed       unused   input  active-high
        line   4:      unnamed       unused   input  active-high
        line   5:      unnamed       unused   input  active-high
        line   6:      unnamed       unused   input  active-high
        line   7:      unnamed       unused   input  active-high
        line   8:      unnamed       unused   input  active-high
        line   9:      unnamed       unused   input  active-high
        line  10:      unnamed       unused   input  active-high
        line  11:      unnamed       unused   input  active-high
        line  12:      unnamed       unused   input  active-high
        line  13:      unnamed       unused   input  active-high
        line  14:      unnamed       unused   input  active-high
        line  15:      unnamed       unused   input  active-high
gpiochip13 - 16 lines:
        line   0:      unnamed       unused   input  active-high
        line   1:      unnamed       unused   input  active-high
        line   2:      unnamed   "Key Down"   input   active-low [used]
        line   3:      unnamed     "Key Up"   input   active-low [used]
        line   4:      unnamed       unused   input  active-high
        line   5:      unnamed       unused   input  active-high
        line   6:      unnamed       "LED0"  output   active-low [used]
        line   7:      unnamed       "LED1"  output   active-low [used]
        line   8:      unnamed       unused   input  active-high
        line   9:      unnamed       unused   input  active-high
        line  10:      unnamed       "LED2"  output   active-low [used]
        line  11:      unnamed "volt1_sdhci1" output active-high [used]
        line  12:      unnamed   "Key Left"   input   active-low [used]
        line  13:      unnamed  "Key Right"   input   active-low [used]
        line  14:      unnamed       unused   input  active-high
        line  15:      unnamed       unused   input  active-high
gpiochip2 - 16 lines:
        line   0:      unnamed       unused   input  active-high
        line   1:      unnamed       unused  output  active-high
        line   2:      unnamed       unused   input  active-high
        line   3:      unnamed       unused   input  active-high
        line   4:      unnamed       unused   input  active-high
        line   5:      unnamed       unused   input  active-high
        line   6:      unnamed       unused   input  active-high
        line   7:      unnamed       unused   input  active-high
        line   8:      unnamed       unused   input  active-high
        line   9:      unnamed       unused   input  active-high
        line  10:      unnamed       unused   input  active-high
        line  11:      unnamed       unused   input  active-high
        line  12:      unnamed       unused   input  active-high
        line  13:      unnamed       unused   input  active-high
        line  14:      unnamed       unused   input  active-high
        line  15:      unnamed       unused   input  active-high
gpiochip3 - 16 lines:
        line   0:      unnamed       unused   input  active-high
        line   1:      unnamed       unused   input  active-high
        line   2:      unnamed       unused   input  active-high
        line   3:      unnamed       unused   input  active-high
        line   4:      unnamed       unused   input  active-high
        line   5:      unnamed       unused   input  active-high
        line   6:      unnamed       unused   input  active-high
        line   7:      unnamed       unused   input  active-high
        line   8:      unnamed       unused   input  active-high
        line   9:      unnamed       unused   input  active-high
        line  10:      unnamed       unused   input  active-high
        line  11:      unnamed       unused   input  active-high
        line  12:      unnamed       unused   input  active-high
        line  13:      unnamed  "powerdown"  output  active-high [used]
        line  14:      unnamed       unused   input  active-high
        line  15:      unnamed       unused   input  active-high
gpiochip4 - 16 lines:
        line   0:      unnamed       kernel   input  active-high [used]
        line   1:      unnamed       kernel   input  active-high [used]
        line   2:      unnamed       kernel   input  active-high [used]
        line   3:      unnamed       kernel   input  active-high [used]
        line   4:      unnamed       kernel   input  active-high [used]
        line   5:      unnamed       kernel   input  active-high [used]
        line   6:      unnamed       kernel   input  active-high [used]
        line   7:      unnamed       kernel   input  active-high [used]
        line   8:      unnamed       kernel   input  active-high [used]
        line   9:      unnamed       kernel   input  active-high [used]
        line  10:      unnamed       kernel   input  active-high [used]
        line  11:      unnamed       kernel   input  active-high [used]
        line  12:      unnamed       kernel   input  active-high [used]
        line  13:      unnamed       kernel   input  active-high [used]
        line  14:      unnamed       unused   input  active-high
        line  15:      unnamed       unused   input  active-high
gpiochip5 - 16 lines:
        line   0:      unnamed       unused   input  active-high
        line   1:      unnamed       unused   input  active-high
        line   2:      unnamed       unused   input  active-high
        line   3:      unnamed       unused   input  active-high
        line   4:      unnamed       unused   input  active-high
        line   5:      unnamed       unused   input  active-high
        line   6:      unnamed       unused   input  active-high
        line   7:      unnamed       unused   input  active-high
        line   8:      unnamed       unused   input  active-high
        line   9:      unnamed       unused   input  active-high
        line  10:      unnamed       unused   input  active-high
        line  11:      unnamed       unused   input  active-high
        line  12:      unnamed       unused   input  active-high
        line  13:      unnamed       unused   input  active-high
        line  14:      unnamed       unused   input  active-high
        line  15:      unnamed       unused   input  active-high
gpiochip6 - 16 lines:
        line   0:      unnamed       kernel  output  active-high [used]
        line   1:      unnamed       unused   input  active-high
        line   2:      unnamed       unused   input  active-high
        line   3:      unnamed       unused   input  active-high
        line   4:      unnamed       unused  output  active-high
        line   5:      unnamed       unused  output  active-high
        line   6:      unnamed       unused  output  active-high
        line   7:      unnamed       unused   input  active-high
        line   8:      unnamed       kernel   input  active-high [used]
        line   9:      unnamed       kernel   input  active-high [used]
        line  10:      unnamed       kernel   input  active-high [used]
        line  11:      unnamed       unused   input  active-high
        line  12:      unnamed       unused   input  active-high
        line  13:      unnamed       unused   input  active-high
        line  14:      unnamed       unused   input  active-high
        line  15:      unnamed       unused   input  active-high
gpiochip7 - 16 lines:
        line   0:      unnamed       unused   input  active-high
        line   1:      unnamed       unused   input  active-high
        line   2:      unnamed       unused   input  active-high
        line   3:      unnamed       unused   input  active-high
        line   4:      unnamed       unused   input  active-high
        line   5:      unnamed       unused   input  active-high
        line   6:      unnamed       unused   input  active-high
        line   7:      unnamed       unused   input  active-high
        line   8:      unnamed       unused   input  active-high
        line   9:      unnamed       unused   input  active-high
        line  10:      unnamed       unused   input  active-high
        line  11:      unnamed       unused   input  active-high
        line  12:      unnamed       unused   input  active-high
        line  13:      unnamed       unused   input  active-high
        line  14:      unnamed       unused   input  active-high
        line  15:      unnamed       unused   input  active-high
gpiochip8 - 16 lines:
        line   0:      unnamed       unused   input  active-high
        line   1:      unnamed       unused   input  active-high
        line   2:      unnamed       unused   input  active-high
        line   3:      unnamed       unused   input  active-high
        line   4:      unnamed       unused   input  active-high
        line   5:      unnamed       unused   input  active-high
        line   6:      unnamed       unused   input  active-high
        line   7:      unnamed       unused   input  active-high
        line   8:      unnamed       unused   input  active-high
        line   9:      unnamed       unused   input  active-high
        line  10:      unnamed       unused   input  active-high
        line  11:      unnamed       unused   input  active-high
        line  12:      unnamed       unused   input  active-high
        line  13:      unnamed       unused   input  active-high
        line  14:      unnamed       unused   input  active-high
        line  15:      unnamed       unused   input  active-high
gpiochip9 - 16 lines:
        line   0:      unnamed       unused   input  active-high
        line   1:      unnamed       unused   input  active-high
        line   2:      unnamed       unused   input  active-high
        line   3:      unnamed       unused   input  active-high
        line   4:      unnamed         "wp"   input   active-low [used]
        line   5:      unnamed       unused   input  active-high
        line   6:      unnamed       unused   input  active-high
        line   7:      unnamed       unused   input  active-high
        line   8:      unnamed       unused   input  active-high
        line   9:      unnamed       unused   input  active-high
        line  10:      unnamed       unused   input  active-high
        line  11:      unnamed       unused   input  active-high
        line  12:      unnamed       unused   input  active-high
        line  13:      unnamed       unused   input  active-high
        line  14:      unnamed       unused   input  active-high
        line  15:      unnamed       unused   input  active-high

I also tested with sysfs GPIO:

# echo 114 >/sys/class/gpio/export
# echo out >/sys/class/gpio/gpio114/direction
# echo 1 >/sys/class/gpio/gpio114/value
# cat /sys/class/gpio/gpio114/value
1
# echo 0 >/sys/class/gpio/gpio114/value
# cat /sys/class/gpio/gpio114/value
1

# echo 115 >/sys/class/gpio/export
# echo out >/sys/class/gpio/gpio115/direction
# echo 1 >/sys/class/gpio/gpio115/value
# cat /sys/class/gpio/gpio115/value
1
# echo 0 >/sys/class/gpio/gpio115/value
# cat /sys/class/gpio/gpio115/value
1
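
The sysfs numbers used in the session above can be derived from the pin name. The following is a minimal sketch, assuming that each port is one 16-line gpiochip numbered in alphabetical order (as in the gpiodetect listing) and that the global sysfs numbering starts at 0, which is consistent with gpio114 appearing as gpiochip7 line 2. The helper name `pin_to_gpio` is hypothetical.

```python
def pin_to_gpio(pin: str, lines_per_chip: int = 16):
    """Map a pin name such as "PH2" to (gpiochip name, line offset, sysfs number).

    Assumes port A is gpiochip0, port B is gpiochip1, and so on, and that the
    global sysfs numbering starts at 0 (an assumption, not a driver guarantee).
    """
    name = pin.strip().upper()
    if len(name) < 3 or name[0] != "P" or not ("A" <= name[1] <= "Z") or not name[2:].isdigit():
        raise ValueError(f"unrecognised pin name: {pin!r}")
    chip_index = ord(name[1]) - ord("A")
    line = int(name[2:])
    if line >= lines_per_chip:
        raise ValueError(f"line {line} is out of range for a {lines_per_chip}-line chip")
    return f"gpiochip{chip_index}", line, chip_index * lines_per_chip + line


for pin in ("PH2", "PH3", "PG4"):
    print(pin, pin_to_gpio(pin))
```

Under these assumptions gpio114 and gpio115 correspond to PH2 and PH3, while PH0 and PH1 would be gpio112 and gpio113.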

I then checked the gpioinfo output for gpiochip7 again:

gpiochip7 - 16 lines:
        line   0:      unnamed       unused   input  active-high
        line   1:      unnamed       unused   input  active-high
        line   2:      unnamed      "sysfs"  output  active-high [used]
        line   3:      unnamed      "sysfs"  output  active-high [used]
        line   4:      unnamed       unused   input  active-high
        line   5:      unnamed       unused   input  active-high
        line   6:      unnamed       unused   input  active-high
        line   7:      unnamed       unused   input  active-high
        line   8:      unnamed       unused   input  active-high
        line   9:      unnamed       unused   input  active-high
        line  10:      unnamed       unused   input  active-high
        line  11:      unnamed       unused   input  active-high
        line  12:      unnamed       unused   input  active-high
        line  13:      unnamed       unused   input  active-high
        line  14:      unnamed       unused   input  active-high
        line  15:      unnamed       unused   input  active-high
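
The write-then-read check from the sysfs session can also be scripted. This is only a sketch: it assumes the pin has already been exported and set to output, and the `root` parameter is a hypothetical addition so that the function can be pointed at a test directory instead of the real `/sys/class/gpio`.

```python
from pathlib import Path
from typing import List


def stuck_levels(gpio: int, root: Path = Path("/sys/class/gpio")) -> List[int]:
    """Write 0 and then 1 to an exported output GPIO and return the levels
    that did not read back as written."""
    value_file = Path(root) / f"gpio{gpio}" / "value"
    failed = []
    for level in (0, 1):
        # Same effect as: echo <level> > /sys/class/gpio/gpioN/value
        value_file.write_text(f"{level}\n")
        # Same effect as: cat /sys/class/gpio/gpioN/value
        if value_file.read_text().strip() != str(level):
            failed.append(level)
    return failed
```

For a pin that behaves like gpio114 in the session above (reads 1 after 0 is written), `stuck_levels(114)` would return `[0]`; an empty list means both levels read back as written.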

According to the schematic, these pins are not connected to anything else on the board, yet I still cannot use them. Why is that? I would be very grateful for any help with this. Thank you for your time.

ychuang3 (Contributor) commented May 2, 2023

We have confirmed that the GPIO failure was caused by u-boot enabling the display, which changes the multi-function pin (MFP) settings of many GPIOs, including those on ports H and I. To fix this, we have disabled the u-boot display by default and pushed the update to GitHub. Please update your u-boot version.
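
To illustrate why a changed MFP setting leaves a pin unusable as GPIO, here is a rough sketch of how a per-pin function selector could be located and decoded. All layout constants are assumptions (MFP registers starting at offset 0x80 in the system-control block, 8 bytes per port, a 4-bit selector per pin, with 0 selecting GPIO) and should be checked against the MA35D1 technical reference manual; the function names are hypothetical and the code only does arithmetic, it does not touch hardware.

```python
MFP_REG_BASE = 0x80       # assumed offset of the first MFP register (port A, pins 0-7)
MFP_BYTES_PER_PORT = 8    # assumed: one low and one high 32-bit register per port
MFP_BITS_PER_PIN = 4      # assumed: 4-bit function selector per pin
MFP_GPIO = 0              # assumed: selector value 0 means plain GPIO


def mfp_location(port: str, pin: int):
    """Return (register offset, bit shift) of the MFP field for a pin."""
    if not 0 <= pin <= 15:
        raise ValueError("pin must be in 0..15")
    port_index = ord(port.upper()) - ord("A")
    # Pins 0-7 live in the low register, pins 8-15 in the high register.
    offset = MFP_REG_BASE + port_index * MFP_BYTES_PER_PORT + (4 if pin >= 8 else 0)
    shift = (pin % 8) * MFP_BITS_PER_PIN
    return offset, shift


def mfp_function(reg_value: int, pin: int) -> int:
    """Extract the function selector for a pin from its MFP register value."""
    return (reg_value >> ((pin % 8) * MFP_BITS_PER_PIN)) & 0xF


def non_gpio_pins(reg_value: int, first_pin: int = 0):
    """List the pins covered by one MFP register whose selector is not GPIO."""
    return [first_pin + i for i in range(8) if mfp_function(reg_value, i) != MFP_GPIO]
```

With this layout, a pin whose selector was set to a non-zero value by the bootloader would still accept writes from the GPIO driver, but the pad would follow the other peripheral rather than the GPIO output register.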

RSTurgay closed this as completed May 2, 2023
jserv pushed a commit to jserv/linux-ma35d1 that referenced this issue Sep 23, 2024
Without this we get system hangs within a couple of days.
It's also reproducible in minutes with "stress-ng --exec 20".

Example error in dmesg:
INFO: task stress-ng:163916 blocked for more than 120 seconds.
      Not tainted 5.10.168-rt83 OpenNuvoton#2
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:stress-ng       state:D stack:    0 pid:163916 ppid: 72833 flags:0x00004000
Call Trace:
 __schedule+0x2bd/0x940
 preempt_schedule_lock+0x23/0x50
 rt_spin_lock_slowlock_locked+0x117/0x2c0
 rt_spin_lock_slowlock+0x51/0x80
 rt_write_lock+0x1e/0x1c0
 do_exit+0x3ac/0xb20
 do_group_exit+0x39/0xb0
 get_signal+0x145/0x960
 ? wake_up_new_task+0x21f/0x3c0
 arch_do_signal_or_restart+0xf1/0x830
 ? __x64_sys_futex+0x146/0x1d0
 exit_to_user_mode_prepare+0x116/0x1a0
 syscall_exit_to_user_mode+0x28/0x190
 entry_SYSCALL_64_after_hwframe+0x61/0xc6
RIP: 0033:0x7f738d9074a7
RSP: 002b:00007ffdafda3cb0 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
RAX: fffffffffffffe00 RBX: 00000000000000ca RCX: 00007f738d9074a7
RDX: 0000000000028051 RSI: 0000000000000000 RDI: 00007f738be949d0
RBP: 00007ffdafda3d88 R08: 0000000000000000 R09: 00007f738be94700
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000028051
R13: 00007f738be949d0 R14: 00007ffdafda51e0 R15: 00007f738be94700

Fixes: 1ba44dc ("Merge tag 'v5.10.162' into v5.10-rt")
Acked-by: Joe Korty <[email protected]>
Signed-off-by: Steffen Dirkwinkel <[email protected]>
Signed-off-by: Luis Claudio R. Goncalves <[email protected]>
jserv pushed a commit to jserv/linux-ma35d1 that referenced this issue Sep 23, 2024
…voton#2

Split the IRQ-off section while accessing the PCP list from zone->lock
while freeing pages.
Introduce isolate_pcp_pages(), which separates the pages from the PCP
list onto a temporary list and then frees the temporary list via
free_pcppages_bulk().

Signed-off-by: Peter Zijlstra <[email protected]>
Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
jserv pushed a commit to jserv/linux-ma35d1 that referenced this issue Sep 23, 2024
…text

The following trace is triggered when running ltp oom test cases:

BUG: sleeping function called from invalid context at kernel/rtmutex.c:659
in_atomic(): 1, irqs_disabled(): 0, pid: 17188, name: oom03
Preemption disabled at:[<ffffffff8112ba70>] mem_cgroup_reclaim+0x90/0xe0

CPU: 2 PID: 17188 Comm: oom03 Not tainted 3.10.10-rt3 OpenNuvoton#2
Hardware name: Intel Corporation Calpella platform/MATXM-CORE-411-B, BIOS 4.6.3 08/18/2010
ffff88007684d730 ffff880070df9b58 ffffffff8169918d ffff880070df9b70
ffffffff8106db31 ffff88007688b4a0 ffff880070df9b88 ffffffff8169d9c0
ffff88007688b4a0 ffff880070df9bc8 ffffffff81059da1 0000000170df9bb0
Call Trace:
[<ffffffff8169918d>] dump_stack+0x19/0x1b
[<ffffffff8106db31>] __might_sleep+0xf1/0x170
[<ffffffff8169d9c0>] rt_spin_lock+0x20/0x50
[<ffffffff81059da1>] queue_work_on+0x61/0x100
[<ffffffff8112b361>] drain_all_stock+0xe1/0x1c0
[<ffffffff8112ba70>] mem_cgroup_reclaim+0x90/0xe0
[<ffffffff8112beda>] __mem_cgroup_try_charge+0x41a/0xc40
[<ffffffff810f1c91>] ? release_pages+0x1b1/0x1f0
[<ffffffff8106f200>] ? sched_exec+0x40/0xb0
[<ffffffff8112cc87>] mem_cgroup_charge_common+0x37/0x70
[<ffffffff8112e2c6>] mem_cgroup_newpage_charge+0x26/0x30
[<ffffffff8110af68>] handle_pte_fault+0x618/0x840
[<ffffffff8103ecf6>] ? unpin_current_cpu+0x16/0x70
[<ffffffff81070f94>] ? migrate_enable+0xd4/0x200
[<ffffffff8110cde5>] handle_mm_fault+0x145/0x1e0
[<ffffffff810301e1>] __do_page_fault+0x1a1/0x4c0
[<ffffffff8169c9eb>] ? preempt_schedule_irq+0x4b/0x70
[<ffffffff8169e3b7>] ? retint_kernel+0x37/0x40
[<ffffffff8103053e>] do_page_fault+0xe/0x10
[<ffffffff8169e4c2>] page_fault+0x22/0x30

So, to prevent schedule_work_on from being called in a preempt-disabled context,
replace the pair of get/put_cpu() with get/put_cpu_light().

Signed-off-by: Yang Shi <[email protected]>
Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
jserv pushed a commit to jserv/linux-ma35d1 that referenced this issue Sep 23, 2024
Without this we get system hangs within a couple of days.
It's also reproducible in minutes with "stress-ng --exec 20".

Example error in dmesg:
INFO: task stress-ng:163916 blocked for more than 120 seconds.
      Not tainted 5.10.168-rt83 OpenNuvoton#2
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:stress-ng       state:D stack:    0 pid:163916 ppid: 72833 flags:0x00004000
Call Trace:
 __schedule+0x2bd/0x940
 preempt_schedule_lock+0x23/0x50
 rt_spin_lock_slowlock_locked+0x117/0x2c0
 rt_spin_lock_slowlock+0x51/0x80
 rt_write_lock+0x1e/0x1c0
 do_exit+0x3ac/0xb20
 do_group_exit+0x39/0xb0
 get_signal+0x145/0x960
 ? wake_up_new_task+0x21f/0x3c0
 arch_do_signal_or_restart+0xf1/0x830
 ? __x64_sys_futex+0x146/0x1d0
 exit_to_user_mode_prepare+0x116/0x1a0
 syscall_exit_to_user_mode+0x28/0x190
 entry_SYSCALL_64_after_hwframe+0x61/0xc6
RIP: 0033:0x7f738d9074a7
RSP: 002b:00007ffdafda3cb0 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
RAX: fffffffffffffe00 RBX: 00000000000000ca RCX: 00007f738d9074a7
RDX: 0000000000028051 RSI: 0000000000000000 RDI: 00007f738be949d0
RBP: 00007ffdafda3d88 R08: 0000000000000000 R09: 00007f738be94700
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000028051
R13: 00007f738be949d0 R14: 00007ffdafda51e0 R15: 00007f738be94700

Fixes: 1ba44dc ("Merge tag 'v5.10.162' into v5.10-rt")
Acked-by: Joe Korty <[email protected]>
Signed-off-by: Steffen Dirkwinkel <[email protected]>
Signed-off-by: Luis Claudio R. Goncalves <[email protected]>
jserv pushed a commit to jserv/linux-ma35d1 that referenced this issue Sep 23, 2024
…voton#2

Split the IRQ-off section while accessing the PCP list from zone->lock
while freeing pages.
Introcude  isolate_pcp_pages() which separates the pages from the PCP
list onto a temporary list and then free the temporary list via
free_pcppages_bulk().

Signed-off-by: Peter Zijlstra <[email protected]>
Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
jserv pushed a commit to jserv/linux-ma35d1 that referenced this issue Sep 23, 2024
…text

The following trace is triggered when running ltp oom test cases:

BUG: sleeping function called from invalid context at kernel/rtmutex.c:659
in_atomic(): 1, irqs_disabled(): 0, pid: 17188, name: oom03
Preemption disabled at:[<ffffffff8112ba70>] mem_cgroup_reclaim+0x90/0xe0

CPU: 2 PID: 17188 Comm: oom03 Not tainted 3.10.10-rt3 OpenNuvoton#2
Hardware name: Intel Corporation Calpella platform/MATXM-CORE-411-B, BIOS 4.6.3 08/18/2010
ffff88007684d730 ffff880070df9b58 ffffffff8169918d ffff880070df9b70
ffffffff8106db31 ffff88007688b4a0 ffff880070df9b88 ffffffff8169d9c0
ffff88007688b4a0 ffff880070df9bc8 ffffffff81059da1 0000000170df9bb0
Call Trace:
[<ffffffff8169918d>] dump_stack+0x19/0x1b
[<ffffffff8106db31>] __might_sleep+0xf1/0x170
[<ffffffff8169d9c0>] rt_spin_lock+0x20/0x50
[<ffffffff81059da1>] queue_work_on+0x61/0x100
[<ffffffff8112b361>] drain_all_stock+0xe1/0x1c0
[<ffffffff8112ba70>] mem_cgroup_reclaim+0x90/0xe0
[<ffffffff8112beda>] __mem_cgroup_try_charge+0x41a/0xc40
[<ffffffff810f1c91>] ? release_pages+0x1b1/0x1f0
[<ffffffff8106f200>] ? sched_exec+0x40/0xb0
[<ffffffff8112cc87>] mem_cgroup_charge_common+0x37/0x70
[<ffffffff8112e2c6>] mem_cgroup_newpage_charge+0x26/0x30
[<ffffffff8110af68>] handle_pte_fault+0x618/0x840
[<ffffffff8103ecf6>] ? unpin_current_cpu+0x16/0x70
[<ffffffff81070f94>] ? migrate_enable+0xd4/0x200
[<ffffffff8110cde5>] handle_mm_fault+0x145/0x1e0
[<ffffffff810301e1>] __do_page_fault+0x1a1/0x4c0
[<ffffffff8169c9eb>] ? preempt_schedule_irq+0x4b/0x70
[<ffffffff8169e3b7>] ? retint_kernel+0x37/0x40
[<ffffffff8103053e>] do_page_fault+0xe/0x10
[<ffffffff8169e4c2>] page_fault+0x22/0x30

So, to prevent schedule_work_on from being called in preempt disabled context,
replace the pair of get/put_cpu() to get/put_cpu_light().

Signed-off-by: Yang Shi <[email protected]>
Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
jserv pushed a commit to jserv/linux-ma35d1 that referenced this issue Sep 23, 2024
Without this we get system hangs within a couple of days.
It's also reproducible in minutes with "stress-ng --exec 20".

Example error in dmesg:
INFO: task stress-ng:163916 blocked for more than 120 seconds.
      Not tainted 5.10.168-rt83 OpenNuvoton#2
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:stress-ng       state:D stack:    0 pid:163916 ppid: 72833 flags:0x00004000
Call Trace:
 __schedule+0x2bd/0x940
 preempt_schedule_lock+0x23/0x50
 rt_spin_lock_slowlock_locked+0x117/0x2c0
 rt_spin_lock_slowlock+0x51/0x80
 rt_write_lock+0x1e/0x1c0
 do_exit+0x3ac/0xb20
 do_group_exit+0x39/0xb0
 get_signal+0x145/0x960
 ? wake_up_new_task+0x21f/0x3c0
 arch_do_signal_or_restart+0xf1/0x830
 ? __x64_sys_futex+0x146/0x1d0
 exit_to_user_mode_prepare+0x116/0x1a0
 syscall_exit_to_user_mode+0x28/0x190
 entry_SYSCALL_64_after_hwframe+0x61/0xc6
RIP: 0033:0x7f738d9074a7
RSP: 002b:00007ffdafda3cb0 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
RAX: fffffffffffffe00 RBX: 00000000000000ca RCX: 00007f738d9074a7
RDX: 0000000000028051 RSI: 0000000000000000 RDI: 00007f738be949d0
RBP: 00007ffdafda3d88 R08: 0000000000000000 R09: 00007f738be94700
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000028051
R13: 00007f738be949d0 R14: 00007ffdafda51e0 R15: 00007f738be94700

Fixes: 1ba44dc ("Merge tag 'v5.10.162' into v5.10-rt")
Acked-by: Joe Korty <[email protected]>
Signed-off-by: Steffen Dirkwinkel <[email protected]>
Signed-off-by: Luis Claudio R. Goncalves <[email protected]>
jserv pushed a commit to jserv/linux-ma35d1 that referenced this issue Nov 15, 2024
…voton#2

Split the IRQ-off section while accessing the PCP list from zone->lock
while freeing pages.
Introcude  isolate_pcp_pages() which separates the pages from the PCP
list onto a temporary list and then free the temporary list via
free_pcppages_bulk().

Signed-off-by: Peter Zijlstra <[email protected]>
Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
jserv pushed a commit to jserv/linux-ma35d1 that referenced this issue Nov 15, 2024
…text

The following trace is triggered when running ltp oom test cases:

BUG: sleeping function called from invalid context at kernel/rtmutex.c:659
in_atomic(): 1, irqs_disabled(): 0, pid: 17188, name: oom03
Preemption disabled at:[<ffffffff8112ba70>] mem_cgroup_reclaim+0x90/0xe0

CPU: 2 PID: 17188 Comm: oom03 Not tainted 3.10.10-rt3 OpenNuvoton#2
Hardware name: Intel Corporation Calpella platform/MATXM-CORE-411-B, BIOS 4.6.3 08/18/2010
ffff88007684d730 ffff880070df9b58 ffffffff8169918d ffff880070df9b70
ffffffff8106db31 ffff88007688b4a0 ffff880070df9b88 ffffffff8169d9c0
ffff88007688b4a0 ffff880070df9bc8 ffffffff81059da1 0000000170df9bb0
Call Trace:
[<ffffffff8169918d>] dump_stack+0x19/0x1b
[<ffffffff8106db31>] __might_sleep+0xf1/0x170
[<ffffffff8169d9c0>] rt_spin_lock+0x20/0x50
[<ffffffff81059da1>] queue_work_on+0x61/0x100
[<ffffffff8112b361>] drain_all_stock+0xe1/0x1c0
[<ffffffff8112ba70>] mem_cgroup_reclaim+0x90/0xe0
[<ffffffff8112beda>] __mem_cgroup_try_charge+0x41a/0xc40
[<ffffffff810f1c91>] ? release_pages+0x1b1/0x1f0
[<ffffffff8106f200>] ? sched_exec+0x40/0xb0
[<ffffffff8112cc87>] mem_cgroup_charge_common+0x37/0x70
[<ffffffff8112e2c6>] mem_cgroup_newpage_charge+0x26/0x30
[<ffffffff8110af68>] handle_pte_fault+0x618/0x840
[<ffffffff8103ecf6>] ? unpin_current_cpu+0x16/0x70
[<ffffffff81070f94>] ? migrate_enable+0xd4/0x200
[<ffffffff8110cde5>] handle_mm_fault+0x145/0x1e0
[<ffffffff810301e1>] __do_page_fault+0x1a1/0x4c0
[<ffffffff8169c9eb>] ? preempt_schedule_irq+0x4b/0x70
[<ffffffff8169e3b7>] ? retint_kernel+0x37/0x40
[<ffffffff8103053e>] do_page_fault+0xe/0x10
[<ffffffff8169e4c2>] page_fault+0x22/0x30

So, to prevent schedule_work_on from being called in preempt disabled context,
replace the pair of get/put_cpu() to get/put_cpu_light().

Signed-off-by: Yang Shi <[email protected]>
Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
jserv pushed a commit to jserv/linux-ma35d1 that referenced this issue Nov 15, 2024
Without this we get system hangs within a couple of days.
It's also reproducible in minutes with "stress-ng --exec 20".

Example error in dmesg:
INFO: task stress-ng:163916 blocked for more than 120 seconds.
      Not tainted 5.10.168-rt83 OpenNuvoton#2
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:stress-ng       state:D stack:    0 pid:163916 ppid: 72833 flags:0x00004000
Call Trace:
 __schedule+0x2bd/0x940
 preempt_schedule_lock+0x23/0x50
 rt_spin_lock_slowlock_locked+0x117/0x2c0
 rt_spin_lock_slowlock+0x51/0x80
 rt_write_lock+0x1e/0x1c0
 do_exit+0x3ac/0xb20
 do_group_exit+0x39/0xb0
 get_signal+0x145/0x960
 ? wake_up_new_task+0x21f/0x3c0
 arch_do_signal_or_restart+0xf1/0x830
 ? __x64_sys_futex+0x146/0x1d0
 exit_to_user_mode_prepare+0x116/0x1a0
 syscall_exit_to_user_mode+0x28/0x190
 entry_SYSCALL_64_after_hwframe+0x61/0xc6
RIP: 0033:0x7f738d9074a7
RSP: 002b:00007ffdafda3cb0 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
RAX: fffffffffffffe00 RBX: 00000000000000ca RCX: 00007f738d9074a7
RDX: 0000000000028051 RSI: 0000000000000000 RDI: 00007f738be949d0
RBP: 00007ffdafda3d88 R08: 0000000000000000 R09: 00007f738be94700
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000028051
R13: 00007f738be949d0 R14: 00007ffdafda51e0 R15: 00007f738be94700

Fixes: 1ba44dc ("Merge tag 'v5.10.162' into v5.10-rt")
Acked-by: Joe Korty <[email protected]>
Signed-off-by: Steffen Dirkwinkel <[email protected]>
Signed-off-by: Luis Claudio R. Goncalves <[email protected]>
jserv pushed a commit to jserv/linux-ma35d1 that referenced this issue Nov 27, 2024
…voton#2

Split the IRQ-off section while accessing the PCP list from zone->lock
while freeing pages.
Introduce isolate_pcp_pages(), which separates the pages from the PCP
list onto a temporary list, and then free the temporary list via
free_pcppages_bulk().

Signed-off-by: Peter Zijlstra <[email protected]>
Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
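The split described in this commit message (first move entries onto a temporary list, then free that list in a separate step) can be illustrated with a small userspace sketch. This is not the kernel code: `page_node`, `isolate_nodes()` and `drain_nodes()` are hypothetical names, and the sketch takes no locks and disables no interrupts. It only shows the list handling.

```c
#include <stddef.h>

/* Hypothetical stand-in for a page sitting on a per-CPU list. */
struct page_node {
	struct page_node *next;
};

/*
 * Hypothetical analogue of the isolation step: detach up to `count` nodes
 * from the head of *src and return them as a separate, temporary list.
 * Only this step needs to touch the source list.
 */
static struct page_node *isolate_nodes(struct page_node **src, size_t count)
{
	struct page_node *head = NULL;
	struct page_node **tail = &head;

	while (count-- > 0 && *src != NULL) {
		struct page_node *n = *src;

		*src = n->next;	/* unlink from the source list */
		n->next = NULL;
		*tail = n;	/* append to the temporary list */
		tail = &n->next;
	}
	return head;
}

/* Walk the temporary list and count it; stands in for the later bulk free. */
static size_t drain_nodes(struct page_node *list)
{
	size_t freed = 0;

	while (list != NULL) {
		list = list->next;
		freed++;
	}
	return freed;
}
```

Because the temporary list is private to the caller once it has been detached, the second step no longer needs whatever protection the source list required.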
jserv pushed a commit to jserv/linux-ma35d1 that referenced this issue Nov 27, 2024
…text

The following trace is triggered when running ltp oom test cases:

BUG: sleeping function called from invalid context at kernel/rtmutex.c:659
in_atomic(): 1, irqs_disabled(): 0, pid: 17188, name: oom03
Preemption disabled at:[<ffffffff8112ba70>] mem_cgroup_reclaim+0x90/0xe0

CPU: 2 PID: 17188 Comm: oom03 Not tainted 3.10.10-rt3 #2
Hardware name: Intel Corporation Calpella platform/MATXM-CORE-411-B, BIOS 4.6.3 08/18/2010
ffff88007684d730 ffff880070df9b58 ffffffff8169918d ffff880070df9b70
ffffffff8106db31 ffff88007688b4a0 ffff880070df9b88 ffffffff8169d9c0
ffff88007688b4a0 ffff880070df9bc8 ffffffff81059da1 0000000170df9bb0
Call Trace:
[<ffffffff8169918d>] dump_stack+0x19/0x1b
[<ffffffff8106db31>] __might_sleep+0xf1/0x170
[<ffffffff8169d9c0>] rt_spin_lock+0x20/0x50
[<ffffffff81059da1>] queue_work_on+0x61/0x100
[<ffffffff8112b361>] drain_all_stock+0xe1/0x1c0
[<ffffffff8112ba70>] mem_cgroup_reclaim+0x90/0xe0
[<ffffffff8112beda>] __mem_cgroup_try_charge+0x41a/0xc40
[<ffffffff810f1c91>] ? release_pages+0x1b1/0x1f0
[<ffffffff8106f200>] ? sched_exec+0x40/0xb0
[<ffffffff8112cc87>] mem_cgroup_charge_common+0x37/0x70
[<ffffffff8112e2c6>] mem_cgroup_newpage_charge+0x26/0x30
[<ffffffff8110af68>] handle_pte_fault+0x618/0x840
[<ffffffff8103ecf6>] ? unpin_current_cpu+0x16/0x70
[<ffffffff81070f94>] ? migrate_enable+0xd4/0x200
[<ffffffff8110cde5>] handle_mm_fault+0x145/0x1e0
[<ffffffff810301e1>] __do_page_fault+0x1a1/0x4c0
[<ffffffff8169c9eb>] ? preempt_schedule_irq+0x4b/0x70
[<ffffffff8169e3b7>] ? retint_kernel+0x37/0x40
[<ffffffff8103053e>] do_page_fault+0xe/0x10
[<ffffffff8169e4c2>] page_fault+0x22/0x30

So, to prevent schedule_work_on from being called in a preempt-disabled context,
replace the get/put_cpu() pair with get/put_cpu_light().

Signed-off-by: Yang Shi <[email protected]>
Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
jserv pushed a commit to jserv/linux-ma35d1 that referenced this issue Nov 27, 2024
Without this we get system hangs within a couple of days.
It's also reproducible in minutes with "stress-ng --exec 20".

Example error in dmesg:
INFO: task stress-ng:163916 blocked for more than 120 seconds.
      Not tainted 5.10.168-rt83 #2
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:stress-ng       state:D stack:    0 pid:163916 ppid: 72833 flags:0x00004000
Call Trace:
 __schedule+0x2bd/0x940
 preempt_schedule_lock+0x23/0x50
 rt_spin_lock_slowlock_locked+0x117/0x2c0
 rt_spin_lock_slowlock+0x51/0x80
 rt_write_lock+0x1e/0x1c0
 do_exit+0x3ac/0xb20
 do_group_exit+0x39/0xb0
 get_signal+0x145/0x960
 ? wake_up_new_task+0x21f/0x3c0
 arch_do_signal_or_restart+0xf1/0x830
 ? __x64_sys_futex+0x146/0x1d0
 exit_to_user_mode_prepare+0x116/0x1a0
 syscall_exit_to_user_mode+0x28/0x190
 entry_SYSCALL_64_after_hwframe+0x61/0xc6
RIP: 0033:0x7f738d9074a7
RSP: 002b:00007ffdafda3cb0 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
RAX: fffffffffffffe00 RBX: 00000000000000ca RCX: 00007f738d9074a7
RDX: 0000000000028051 RSI: 0000000000000000 RDI: 00007f738be949d0
RBP: 00007ffdafda3d88 R08: 0000000000000000 R09: 00007f738be94700
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000028051
R13: 00007f738be949d0 R14: 00007ffdafda51e0 R15: 00007f738be94700

Fixes: 1ba44dc ("Merge tag 'v5.10.162' into v5.10-rt")
Acked-by: Joe Korty <[email protected]>
Signed-off-by: Steffen Dirkwinkel <[email protected]>
Signed-off-by: Luis Claudio R. Goncalves <[email protected]>
jserv pushed a commit to jserv/linux-ma35d1 that referenced this issue Dec 27, 2024
commit 5a22fbcc10f3f7d94c5d88afbbffa240a3677057 upstream.

When LAN9303 is MDIO-connected two callchains exist into
mdio->bus->write():

1. switch ports 1&2 ("physical" PHYs):

virtual (switch-internal) MDIO bus (lan9303_switch_ops->phy_{read|write})->
  lan9303_mdio_phy_{read|write} -> mdiobus_{read|write}_nested

2. LAN9303 virtual PHY:

virtual MDIO bus (lan9303_phy_{read|write}) ->
  lan9303_virt_phy_reg_{read|write} -> regmap -> lan9303_mdio_{read|write}

If the latter functions just take
mutex_lock(&sw_dev->device->bus->mdio_lock) it triggers a LOCKDEP
false-positive splat. It's false-positive because the first
mdio_lock in the second callchain above belongs to virtual MDIO bus, the
second mdio_lock belongs to physical MDIO bus.

Consequent annotation in lan9303_mdio_{read|write} as nested lock
(similar to lan9303_mdio_phy_{read|write}, it's the same physical MDIO bus)
prevents the following splat:

WARNING: possible circular locking dependency detected
5.15.71 #1 Not tainted
------------------------------------------------------
kworker/u4:3/609 is trying to acquire lock:
ffff000011531c68 (lan9303_mdio:131:(&lan9303_mdio_regmap_config)->lock){+.+.}-{3:3}, at: regmap_lock_mutex
but task is already holding lock:
ffff0000114c44d8 (&bus->mdio_lock){+.+.}-{3:3}, at: mdiobus_read
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #1 (&bus->mdio_lock){+.+.}-{3:3}:
       lock_acquire
       __mutex_lock
       mutex_lock_nested
       lan9303_mdio_read
       _regmap_read
       regmap_read
       lan9303_probe
       lan9303_mdio_probe
       mdio_probe
       really_probe
       __driver_probe_device
       driver_probe_device
       __device_attach_driver
       bus_for_each_drv
       __device_attach
       device_initial_probe
       bus_probe_device
       deferred_probe_work_func
       process_one_work
       worker_thread
       kthread
       ret_from_fork
-> #0 (lan9303_mdio:131:(&lan9303_mdio_regmap_config)->lock){+.+.}-{3:3}:
       __lock_acquire
       lock_acquire.part.0
       lock_acquire
       __mutex_lock
       mutex_lock_nested
       regmap_lock_mutex
       regmap_read
       lan9303_phy_read
       dsa_slave_phy_read
       __mdiobus_read
       mdiobus_read
       get_phy_device
       mdiobus_scan
       __mdiobus_register
       dsa_register_switch
       lan9303_probe
       lan9303_mdio_probe
       mdio_probe
       really_probe
       __driver_probe_device
       driver_probe_device
       __device_attach_driver
       bus_for_each_drv
       __device_attach
       device_initial_probe
       bus_probe_device
       deferred_probe_work_func
       process_one_work
       worker_thread
       kthread
       ret_from_fork
other info that might help us debug this:
 Possible unsafe locking scenario:
       CPU0                    CPU1
       ----                    ----
  lock(&bus->mdio_lock);
                               lock(lan9303_mdio:131:(&lan9303_mdio_regmap_config)->lock);
                               lock(&bus->mdio_lock);
  lock(lan9303_mdio:131:(&lan9303_mdio_regmap_config)->lock);
*** DEADLOCK ***
5 locks held by kworker/u4:3/609:
 #0: ffff000002842938 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work
 #1: ffff80000bacbd60 (deferred_probe_work){+.+.}-{0:0}, at: process_one_work
 #2: ffff000007645178 (&dev->mutex){....}-{3:3}, at: __device_attach
 #3: ffff8000096e6e78 (dsa2_mutex){+.+.}-{3:3}, at: dsa_register_switch
 #4: ffff0000114c44d8 (&bus->mdio_lock){+.+.}-{3:3}, at: mdiobus_read
stack backtrace:
CPU: 1 PID: 609 Comm: kworker/u4:3 Not tainted 5.15.71 #1
Workqueue: events_unbound deferred_probe_work_func
Call trace:
 dump_backtrace
 show_stack
 dump_stack_lvl
 dump_stack
 print_circular_bug
 check_noncircular
 __lock_acquire
 lock_acquire.part.0
 lock_acquire
 __mutex_lock
 mutex_lock_nested
 regmap_lock_mutex
 regmap_read
 lan9303_phy_read
 dsa_slave_phy_read
 __mdiobus_read
 mdiobus_read
 get_phy_device
 mdiobus_scan
 __mdiobus_register
 dsa_register_switch
 lan9303_probe
 lan9303_mdio_probe
...

Cc: [email protected]
Fixes: dc70058 ("net: dsa: LAN9303: add MDIO managed mode support")
Signed-off-by: Alexander Sverdlin <[email protected]>
Reviewed-by: Andrew Lunn <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Paolo Abeni <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
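Why the nested annotation silences the splat can be illustrated with a toy model of an ordering check. This is a deliberately simplified sketch, not how lockdep is implemented: `lock_key`, `dep_graph` and `record_order()` are hypothetical names, and the model only detects a direct two-lock inversion. The assumption it encodes is that a lock taken with a nesting annotation is tracked under a different key (class plus subclass) than the same class taken without one.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_EDGES 16

/* A lock key as this toy sees it: a class plus a nesting subclass. */
struct lock_key {
	int class_id;
	int subclass;
};

/* One recorded "held -> acquired" ordering between two keys. */
struct edge {
	struct lock_key held;
	struct lock_key acquired;
};

struct dep_graph {
	struct edge edges[MAX_EDGES];
	size_t count;
};

static bool key_eq(struct lock_key a, struct lock_key b)
{
	return a.class_id == b.class_id && a.subclass == b.subclass;
}

/*
 * Record that `acquired` was taken while `held` was held. Returns false if
 * the reverse ordering was recorded earlier (a two-lock inversion) or if
 * the edge table is full.
 */
static bool record_order(struct dep_graph *g, struct lock_key held,
			 struct lock_key acquired)
{
	for (size_t i = 0; i < g->count; i++) {
		if (key_eq(g->edges[i].held, acquired) &&
		    key_eq(g->edges[i].acquired, held))
			return false;
	}
	if (g->count == MAX_EDGES)
		return false;
	g->edges[g->count].held = held;
	g->edges[g->count].acquired = acquired;
	g->count++;
	return true;
}
```

In this model, recording "regmap lock, then mdio_lock" followed by "mdio_lock, then regmap lock" is flagged when both mdio_lock acquisitions share one key, even though they belong to different buses. Giving the physical bus acquisition a non-zero subclass makes the two orderings involve different keys, so nothing is flagged.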
jserv pushed a commit to jserv/linux-ma35d1 that referenced this issue Dec 27, 2024
[ Upstream commit 15319a4e8ee4b098118591c6ccbd17237f841613 ]

&card->tx_queue_lock is acquired in softirq context along the
following call chain from solos_bh(), so any acquisition of the same
lock in process context should disable at least bottom halves to avoid
a double lock.

<deadlock #2>
pclose()
--> spin_lock(&card->tx_queue_lock)
<interrupt>
   --> solos_bh()
   --> fpga_tx()
   --> spin_lock(&card->tx_queue_lock)

This flaw was found by an experimental static analysis tool I am
developing for irq-related deadlock.

To prevent the potential deadlock, the patch consistently uses
spin_lock_bh() on &card->tx_queue_lock in process context code.

Fixes: 213e85d ("solos-pci: clean up pclose() function")
Signed-off-by: Chengfeng Ye <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
jserv pushed a commit to jserv/linux-ma35d1 that referenced this issue Dec 27, 2024
[ Upstream commit 3a42709fa909e22b0be4bb1e2795aa04ada732a3 ]

Validate @ioctl_rsp->OutputOffset and @ioctl_rsp->OutputCount so that
their sum does not wrap to a number that is smaller than @reparse_buf
and we end up with a wild pointer as follows:

  BUG: unable to handle page fault for address: ffff88809c5cd45f
  #PF: supervisor read access in kernel mode
  #PF: error_code(0x0000) - not-present page
  PGD 4a01067 P4D 4a01067 PUD 0
  Oops: 0000 [#1] PREEMPT SMP NOPTI
  CPU: 2 PID: 1260 Comm: mount.cifs Not tainted 6.7.0-rc4 #2
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS
  rel-1.16.2-3-gd478f380-rebuilt.opensuse.org 04/01/2014
  RIP: 0010:smb2_query_reparse_point+0x3e0/0x4c0 [cifs]
  Code: ff ff e8 f3 51 fe ff 41 89 c6 58 5a 45 85 f6 0f 85 14 fe ff ff
  49 8b 57 48 8b 42 60 44 8b 42 64 42 8d 0c 00 49 39 4f 50 72 40 <8b>
  04 02 48 8b 9d f0 fe ff ff 49 8b 57 50 89 03 48 8b 9d e8 fe ff
  RSP: 0018:ffffc90000347a90 EFLAGS: 00010212
  RAX: 000000008000001f RBX: ffff88800ae11000 RCX: 00000000000000ec
  RDX: ffff88801c5cd440 RSI: 0000000000000000 RDI: ffffffff82004aa4
  RBP: ffffc90000347bb0 R08: 00000000800000cd R09: 0000000000000001
  R10: 0000000000000000 R11: 0000000000000024 R12: ffff8880114d4100
  R13: ffff8880114d4198 R14: 0000000000000000 R15: ffff8880114d4000
  FS: 00007f02c07babc0(0000) GS:ffff88806ba00000(0000)
  knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: ffff88809c5cd45f CR3: 0000000011750000 CR4: 0000000000750ef0
  PKRU: 55555554
  Call Trace:
   <TASK>
   ? __die+0x23/0x70
   ? page_fault_oops+0x181/0x480
   ? search_module_extables+0x19/0x60
   ? srso_alias_return_thunk+0x5/0xfbef5
   ? exc_page_fault+0x1b6/0x1c0
   ? asm_exc_page_fault+0x26/0x30
   ? _raw_spin_unlock_irqrestore+0x44/0x60
   ? smb2_query_reparse_point+0x3e0/0x4c0 [cifs]
   cifs_get_fattr+0x16e/0xa50 [cifs]
   ? srso_alias_return_thunk+0x5/0xfbef5
   ? lock_acquire+0xbf/0x2b0
   cifs_root_iget+0x163/0x5f0 [cifs]
   cifs_smb3_do_mount+0x5bd/0x780 [cifs]
   smb3_get_tree+0xd9/0x290 [cifs]
   vfs_get_tree+0x2c/0x100
   ? capable+0x37/0x70
   path_mount+0x2d7/0xb80
   ? srso_alias_return_thunk+0x5/0xfbef5
   ? _raw_spin_unlock_irqrestore+0x44/0x60
   __x64_sys_mount+0x11a/0x150
   do_syscall_64+0x47/0xf0
   entry_SYSCALL_64_after_hwframe+0x6f/0x77
  RIP: 0033:0x7f02c08d5b1e

Fixes: 2e4564b ("smb3: add support for stat of WSL reparse points for special file types")
Cc: [email protected]
Reported-by: Robert Morris <[email protected]>
Signed-off-by: Paulo Alcantara (SUSE) <[email protected]>
Signed-off-by: Steve French <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
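The kind of validation this commit message describes can be sketched in userspace C. This is a minimal illustration, not the actual patch: `reparse_range_ok()` and `reparse_range_ok_naive()` are hypothetical helpers, and `offset`, `count` and `rsp_len` stand in for OutputOffset, OutputCount and the size of the response buffer.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not the kernel patch): check that the byte range
 * [offset, offset + count) lies inside a response buffer of rsp_len bytes.
 * Both values come from the server and cannot be trusted.
 */
static bool reparse_range_ok(uint32_t offset, uint32_t count, size_t rsp_len)
{
	/* Widen before adding so the sum cannot wrap around in 32 bits. */
	uint64_t end = (uint64_t)offset + (uint64_t)count;

	return end <= (uint64_t)rsp_len;
}

/* Naive 32-bit check, shown only to illustrate the wraparound it misses. */
static bool reparse_range_ok_naive(uint32_t offset, uint32_t count,
				   size_t rsp_len)
{
	uint32_t end = offset + count;	/* may wrap to a small value */

	return end <= rsp_len;
}
```

With `offset = 0xfffffff0` and `count = 0x20`, the 32-bit sum wraps to `0x10`, so the naive check accepts a range that lies far outside a 128-byte buffer, while the widened check rejects it.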
jserv pushed a commit to jserv/linux-ma35d1 that referenced this issue Dec 27, 2024
[ Upstream commit 14694179e561b5f2f7e56a0f590e2cb49a9cc7ab ]

Trying to suspend to RAM on SAMA5D27 EVK leads to the following lockdep
warning:

 ============================================
 WARNING: possible recursive locking detected
 6.7.0-rc5-wt+ #532 Not tainted
 --------------------------------------------
 sh/92 is trying to acquire lock:
 c3cf306c (&irq_desc_lock_class){-.-.}-{2:2}, at: __irq_get_desc_lock+0xe8/0x100

 but task is already holding lock:
 c3d7c46c (&irq_desc_lock_class){-.-.}-{2:2}, at: __irq_get_desc_lock+0xe8/0x100

 other info that might help us debug this:
  Possible unsafe locking scenario:

        CPU0
        ----
   lock(&irq_desc_lock_class);
   lock(&irq_desc_lock_class);

  *** DEADLOCK ***

  May be due to missing lock nesting notation

 6 locks held by sh/92:
  #0: c3aa0258 (sb_writers#6){.+.+}-{0:0}, at: ksys_write+0xd8/0x178
  #1: c4c2df44 (&of->mutex){+.+.}-{3:3}, at: kernfs_fop_write_iter+0x138/0x284
  #2: c32684a0 (kn->active){.+.+}-{0:0}, at: kernfs_fop_write_iter+0x148/0x284
  #3: c232b6d4 (system_transition_mutex){+.+.}-{3:3}, at: pm_suspend+0x13c/0x4e8
  #4: c387b088 (&dev->mutex){....}-{3:3}, at: __device_suspend+0x1e8/0x91c
  #5: c3d7c46c (&irq_desc_lock_class){-.-.}-{2:2}, at: __irq_get_desc_lock+0xe8/0x100

 stack backtrace:
 CPU: 0 PID: 92 Comm: sh Not tainted 6.7.0-rc5-wt+ #532
 Hardware name: Atmel SAMA5
  unwind_backtrace from show_stack+0x18/0x1c
  show_stack from dump_stack_lvl+0x34/0x48
  dump_stack_lvl from __lock_acquire+0x19ec/0x3a0c
  __lock_acquire from lock_acquire.part.0+0x124/0x2d0
  lock_acquire.part.0 from _raw_spin_lock_irqsave+0x5c/0x78
  _raw_spin_lock_irqsave from __irq_get_desc_lock+0xe8/0x100
  __irq_get_desc_lock from irq_set_irq_wake+0xa8/0x204
  irq_set_irq_wake from atmel_gpio_irq_set_wake+0x58/0xb4
  atmel_gpio_irq_set_wake from irq_set_irq_wake+0x100/0x204
  irq_set_irq_wake from gpio_keys_suspend+0xec/0x2b8
  gpio_keys_suspend from dpm_run_callback+0xe4/0x248
  dpm_run_callback from __device_suspend+0x234/0x91c
  __device_suspend from dpm_suspend+0x224/0x43c
  dpm_suspend from dpm_suspend_start+0x9c/0xa8
  dpm_suspend_start from suspend_devices_and_enter+0x1e0/0xa84
  suspend_devices_and_enter from pm_suspend+0x460/0x4e8
  pm_suspend from state_store+0x78/0xe4
  state_store from kernfs_fop_write_iter+0x1a0/0x284
  kernfs_fop_write_iter from vfs_write+0x38c/0x6f4
  vfs_write from ksys_write+0xd8/0x178
  ksys_write from ret_fast_syscall+0x0/0x1c
 Exception stack(0xc52b3fa8 to 0xc52b3ff0)
 3fa0:                   00000004 005a0ae8 00000001 005a0ae8 00000004 00000001
 3fc0: 00000004 005a0ae8 00000001 00000004 00000004 b6c616c0 00000020 0059d190
 3fe0: 00000004 b6c61678 aec5a041 aebf1a26

This warning is raised because pinctrl-at91-pio4 uses chained IRQ. Whenever
a wake up source configures an IRQ through irq_set_irq_wake, it will
lock the corresponding IRQ desc, and then call irq_set_irq_wake on "parent"
IRQ which will do the same on its own IRQ desc, but since those two locks
share the same class, lockdep reports this as an issue.

Fix the lockdep false positive by setting different classes for the parent
and child IRQs.

Fixes: 7761808 ("pinctrl: introduce driver for Atmel PIO4 controller")
Signed-off-by: Alexis Lothoré <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Linus Walleij <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
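The false positive explained above comes from checking lock classes rather than lock instances. The following toy model illustrates that idea in userspace C. It is a simplified sketch, not lockdep itself: `toy_lock`, `held_locks` and `toy_acquire()` are hypothetical names, and the class IDs are arbitrary.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_HELD 8

/* Toy lock: only the class identity matters for this check. */
struct toy_lock {
	int class_id;
};

/* Locks currently held by the single simulated task. */
struct held_locks {
	const struct toy_lock *stack[MAX_HELD];
	size_t depth;
};

/*
 * Try to acquire `lock`. Returns false, without acquiring, when a lock of
 * the same class is already held (the condition reported as "possible
 * recursive locking") or when the held-lock table is full. The two locks
 * may be different objects; only their class is compared.
 */
static bool toy_acquire(struct held_locks *held, const struct toy_lock *lock)
{
	for (size_t i = 0; i < held->depth; i++) {
		if (held->stack[i]->class_id == lock->class_id)
			return false;
	}
	if (held->depth == MAX_HELD)
		return false;
	held->stack[held->depth++] = lock;
	return true;
}
```

In this model, taking a parent IRQ descriptor lock while holding a child lock of the same class is rejected even though the objects differ. Once the parent has its own class, the same sequence is accepted, which mirrors the fix described in the commit message.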
jserv pushed a commit to jserv/linux-ma35d1 that referenced this issue Dec 27, 2024
[ Upstream commit 90d025c2e953c11974e76637977c473200593a46 ]

If the server replies to SMB2_NEGOTIATE with a zero SecurityBufferOffset,
smb2_get_data_area() sets @len to non-zero but returns NULL, so
decode_negTokenInit() ends up being called with a NULL @security_blob:

  BUG: kernel NULL pointer dereference, address: 0000000000000000
  #PF: supervisor read access in kernel mode
  #PF: error_code(0x0000) - not-present page
  PGD 0 P4D 0
  Oops: 0000 [#1] PREEMPT SMP NOPTI
  CPU: 2 PID: 871 Comm: mount.cifs Not tainted 6.7.0-rc4 #2
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.2-3-gd478f380-rebuilt.opensuse.org 04/01/2014
  RIP: 0010:asn1_ber_decoder+0x173/0xc80
  Code: 01 4c 39 2c 24 75 09 45 84 c9 0f 85 2f 03 00 00 48 8b 14 24 4c 29 ea 48 83 fa 01 0f 86 1e 07 00 00 48 8b 74 24 28 4d 8d 5d 01 <42> 0f b6 3c 2e 89 fa 40 88 7c 24 5c f7 d2 83 e2 1f 0f 84 3d 07 00
  RSP: 0018:ffffc9000063f950 EFLAGS: 00010202
  RAX: 0000000000000002 RBX: 0000000000000000 RCX: 000000000000004a
  RDX: 000000000000004a RSI: 0000000000000000 RDI: 0000000000000000
  RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
  R10: 0000000000000002 R11: 0000000000000001 R12: 0000000000000000
  R13: 0000000000000000 R14: 000000000000004d R15: 0000000000000000
  FS:  00007fce52b0fbc0(0000) GS:ffff88806ba00000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 0000000000000000 CR3: 000000001ae64000 CR4: 0000000000750ef0
  PKRU: 55555554
  Call Trace:
   <TASK>
   ? __die+0x23/0x70
   ? page_fault_oops+0x181/0x480
   ? __stack_depot_save+0x1e6/0x480
   ? exc_page_fault+0x6f/0x1c0
   ? asm_exc_page_fault+0x26/0x30
   ? asn1_ber_decoder+0x173/0xc80
   ? check_object+0x40/0x340
   decode_negTokenInit+0x1e/0x30 [cifs]
   SMB2_negotiate+0xc99/0x17c0 [cifs]
   ? smb2_negotiate+0x46/0x60 [cifs]
   ? srso_alias_return_thunk+0x5/0xfbef5
   smb2_negotiate+0x46/0x60 [cifs]
   cifs_negotiate_protocol+0xae/0x130 [cifs]
   cifs_get_smb_ses+0x517/0x1040 [cifs]
   ? srso_alias_return_thunk+0x5/0xfbef5
   ? srso_alias_return_thunk+0x5/0xfbef5
   ? queue_delayed_work_on+0x5d/0x90
   cifs_mount_get_session+0x78/0x200 [cifs]
   dfs_mount_share+0x13a/0x9f0 [cifs]
   ? srso_alias_return_thunk+0x5/0xfbef5
   ? lock_acquire+0xbf/0x2b0
   ? find_nls+0x16/0x80
   ? srso_alias_return_thunk+0x5/0xfbef5
   cifs_mount+0x7e/0x350 [cifs]
   cifs_smb3_do_mount+0x128/0x780 [cifs]
   smb3_get_tree+0xd9/0x290 [cifs]
   vfs_get_tree+0x2c/0x100
   ? capable+0x37/0x70
   path_mount+0x2d7/0xb80
   ? srso_alias_return_thunk+0x5/0xfbef5
   ? _raw_spin_unlock_irqrestore+0x44/0x60
   __x64_sys_mount+0x11a/0x150
   do_syscall_64+0x47/0xf0
   entry_SYSCALL_64_after_hwframe+0x6f/0x77
  RIP: 0033:0x7fce52c2ab1e

Fix this by setting @len to zero when @off == 0 so callers won't
attempt to dereference non-existing data areas.

Reported-by: Robert Morris <[email protected]>
Cc: [email protected]
Signed-off-by: Paulo Alcantara (SUSE) <[email protected]>
Signed-off-by: Steve French <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
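The fix described above keeps the returned pointer and the reported length consistent. A minimal userspace sketch of that contract follows. It is not the kernel function: `get_data_area()` and `first_blob_byte()` are hypothetical, and `claimed_len` stands in for the length field taken from the server's reply.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the helper named in the commit message: return
 * a pointer to the data area at offset `off` inside `buf` and report its
 * length through `len`. An offset of zero means "no data area".
 */
static const uint8_t *get_data_area(const uint8_t *buf, uint32_t off,
				    uint32_t claimed_len, uint32_t *len)
{
	if (off == 0) {
		/* Keep length and pointer consistent: no area, no bytes. */
		*len = 0;
		return NULL;
	}
	*len = claimed_len;
	return buf + off;
}

/* A caller that relies on `len` to decide whether to read the blob. */
static int first_blob_byte(const uint8_t *buf, uint32_t off,
			   uint32_t claimed_len)
{
	uint32_t len;
	const uint8_t *blob = get_data_area(buf, off, claimed_len, &len);

	if (len == 0)
		return -1;	/* nothing to decode */
	return blob[0];
}
```

A caller that only checks the length never dereferences the NULL pointer, because a zero offset now always yields a zero length regardless of what the reply claims.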