-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Driver for BL808 Watchdog. #8
base: linux-next/mboxic
Are you sure you want to change the base?
Driver for BL808 Watchdog. #8
Conversation
a888032
to
926f00d
Compare
|
||
wdt: wdt@2000a500 { | ||
compatible = "bflb,bflb808-wdt"; | ||
reg = <0x2000a500 0x100>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This MMIO range is for the TIMER0 device, not just for the watchdog portion, so it's not really correct to label the whole device with a "wdt" compatible string.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is my first device driver. Happy to make any change but I'm not sure what to change. I'm also not sure what happens if/when the other timers are implemented in a different driver module.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The call to devm_ioremap_resource
(or devm_platform_ioremap_resource
) claims/reserves the MMIO range for use by the driver. You can see these claimed MMIO ranges in /proc/iomem
. So if both a timer driver and a watchdog driver declare the same MMIO range, the second-loaded driver will fail to probe.
Supporting multiple subsystems with overlapping MMIO ranges requires cooperation between the drivers. There are multiple ways of doing this, and it's not obvious which is best in this situation:
- Use a single driver for both subsystems
- Use a MFD
- Use a syscon
- Use the auxiliary device bus
So I'm also not sure what to change here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately the TCCR clock source register is shared between both general purpose timers and the WDT, so the drivers also need to consider coordinating sharing the same register, not just sets of register in the same MMIO space.
drivers/watchdog/bflb_wdt.c
Outdated
static inline int bflb_unlock_watchdog(struct bflb_watchdog_device *bflb_wdd) | ||
{ | ||
writew(BFLB_VAL_WFAR, bflb_wdd->regs + BFLB_REG_WFAR); | ||
writew(BFLB_VAL_WSAR, bflb_wdd->regs + BFLB_REG_WSAR); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The peripheral exposes 32-bit registers, so you should always use writel
, even if some bits are unused.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was mimicking the raw metal bl_mcu_sdk example here, assuming there was an intentional reason to do things this way. I'll change it and test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In Google internal bug 265639009 we've received an (as yet) unreproducible crash report from an aarch64 GKI 5.10.149-android13 running device. AFAICT the source code is at: https://android.googlesource.com/kernel/common/+/refs/tags/ASB-2022-12-05_13-5.10 The call stack is: ncm_close() -> ncm_notify() -> ncm_do_notify() with the crash at: ncm_do_notify+0x98/0x270 Code: 79000d0b b9000a6c f940012a f9400269 (b9405d4b) Which I believe disassembles to (I don't know ARM assembly, but it looks sane enough to me...): // halfword (16-bit) store presumably to event->wLength (at offset 6 of struct usb_cdc_notification) 0B 0D 00 79 strh w11, [x8, #6] // word (32-bit) store presumably to req->Length (at offset 8 of struct usb_request) 6C 0A 00 B9 str w12, [x19, #8] // x10 (NULL) was read here from offset 0 of valid pointer x9 // IMHO we're reading 'cdev->gadget' and getting NULL // gadget is indeed at offset 0 of struct usb_composite_dev 2A 01 40 F9 ldr x10, [x9] // loading req->buf pointer, which is at offset 0 of struct usb_request 69 02 40 F9 ldr x9, [x19] // x10 is null, crash, appears to be attempt to read cdev->gadget->max_speed 4B 5D 40 B9 ldr w11, [x10, #0x5c] which seems to line up with ncm_do_notify() case NCM_NOTIFY_SPEED code fragment: event->wLength = cpu_to_le16(8); req->length = NCM_STATUS_BYTECOUNT; /* SPEED_CHANGE data is up/down speeds in bits/sec */ data = req->buf + sizeof *event; data[0] = cpu_to_le32(ncm_bitrate(cdev->gadget)); My analysis of registers and NULL ptr deref crash offset (Unable to handle kernel NULL pointer dereference at virtual address 000000000000005c) heavily suggests that the crash is due to 'cdev->gadget' being NULL when executing: data[0] = cpu_to_le32(ncm_bitrate(cdev->gadget)); which calls: ncm_bitrate(NULL) which then calls: gadget_is_superspeed(NULL) which reads ((struct usb_gadget *)NULL)->max_speed and hits a panic. AFAICT, if I'm counting right, the offset of max_speed is indeed 0x5C. (remember there's a GKI KABI reservation of 16 bytes in struct work_struct) It's not at all clear to me how this is all supposed to work... but returning 0 seems much better than panic-ing... Cc: Felipe Balbi <[email protected]> Cc: Lorenzo Colitti <[email protected]> Cc: Carlos Llamas <[email protected]> Cc: [email protected] Signed-off-by: Maciej Żenczykowski <[email protected]> Cc: stable <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
The driver shutdown callback (which sends EDL_SOC_RESET to the device over serdev) should not be invoked when HCI device is not open (e.g. if hci_dev_open_sync() failed), because the serdev and its TTY are not open either. Also skip this step if device is powered off (qca_power_shutdown()). The shutdown callback causes use-after-free during system reboot with Qualcomm Atheros Bluetooth: Unable to handle kernel paging request at virtual address 0072662f67726fd7 ... CPU: 6 PID: 1 Comm: systemd-shutdow Tainted: G W 6.1.0-rt5-00325-g8a5f56bcfcca #8 Hardware name: Qualcomm Technologies, Inc. Robotics RB5 (DT) Call trace: tty_driver_flush_buffer+0x4/0x30 serdev_device_write_flush+0x24/0x34 qca_serdev_shutdown+0x80/0x130 [hci_uart] device_shutdown+0x15c/0x260 kernel_restart+0x48/0xac KASAN report: BUG: KASAN: use-after-free in tty_driver_flush_buffer+0x1c/0x50 Read of size 8 at addr ffff16270c2e0018 by task systemd-shutdow/1 CPU: 7 PID: 1 Comm: systemd-shutdow Not tainted 6.1.0-next-20221220-00014-gb85aaf97fb01-dirty torvalds#28 Hardware name: Qualcomm Technologies, Inc. Robotics RB5 (DT) Call trace: dump_backtrace.part.0+0xdc/0xf0 show_stack+0x18/0x30 dump_stack_lvl+0x68/0x84 print_report+0x188/0x488 kasan_report+0xa4/0xf0 __asan_load8+0x80/0xac tty_driver_flush_buffer+0x1c/0x50 ttyport_write_flush+0x34/0x44 serdev_device_write_flush+0x48/0x60 qca_serdev_shutdown+0x124/0x274 device_shutdown+0x1e8/0x350 kernel_restart+0x48/0xb0 __do_sys_reboot+0x244/0x2d0 __arm64_sys_reboot+0x54/0x70 invoke_syscall+0x60/0x190 el0_svc_common.constprop.0+0x7c/0x160 do_el0_svc+0x44/0xf0 el0_svc+0x2c/0x6c el0t_64_sync_handler+0xbc/0x140 el0t_64_sync+0x190/0x194 Fixes: 7e7bbdd ("Bluetooth: hci_qca: Fix qca6390 enable failure after warm reboot") Cc: <[email protected]> Signed-off-by: Krzysztof Kozlowski <[email protected]> Signed-off-by: Luiz Augusto von Dentz <[email protected]>
The inline assembly for arm64's cmpxchg_double*() implementations use a +Q constraint to hazard against other accesses to the memory location being exchanged. However, the pointer passed to the constraint is a pointer to unsigned long, and thus the hazard only applies to the first 8 bytes of the location. GCC can take advantage of this, assuming that other portions of the location are unchanged, leading to a number of potential problems. This is similar to what we fixed back in commit: fee960b ("arm64: xchg: hazard against entire exchange variable") ... but we forgot to adjust cmpxchg_double*() similarly at the same time. The same problem applies, as demonstrated with the following test: | struct big { | u64 lo, hi; | } __aligned(128); | | unsigned long foo(struct big *b) | { | u64 hi_old, hi_new; | | hi_old = b->hi; | cmpxchg_double_local(&b->lo, &b->hi, 0x12, 0x34, 0x56, 0x78); | hi_new = b->hi; | | return hi_old ^ hi_new; | } ... which GCC 12.1.0 compiles as: | 0000000000000000 <foo>: | 0: d503233f paciasp | 4: aa0003e4 mov x4, x0 | 8: 1400000e b 40 <foo+0x40> | c: d2800240 mov x0, #0x12 // torvalds#18 | 10: d2800681 mov x1, #0x34 // #52 | 14: aa0003e5 mov x5, x0 | 18: aa0103e6 mov x6, x1 | 1c: d2800ac2 mov x2, #0x56 // torvalds#86 | 20: d2800f03 mov x3, #0x78 // torvalds#120 | 24: 48207c82 casp x0, x1, x2, x3, [x4] | 28: ca050000 eor x0, x0, x5 | 2c: ca060021 eor x1, x1, x6 | 30: aa010000 orr x0, x0, x1 | 34: d2800000 mov x0, #0x0 // #0 <--- BANG | 38: d50323bf autiasp | 3c: d65f03c0 ret | 40: d2800240 mov x0, #0x12 // torvalds#18 | 44: d2800681 mov x1, #0x34 // #52 | 48: d2800ac2 mov x2, #0x56 // torvalds#86 | 4c: d2800f03 mov x3, #0x78 // torvalds#120 | 50: f9800091 prfm pstl1strm, [x4] | 54: c87f1885 ldxp x5, x6, [x4] | 58: ca0000a5 eor x5, x5, x0 | 5c: ca0100c6 eor x6, x6, x1 | 60: aa0600a6 orr x6, x5, x6 | 64: b5000066 cbnz x6, 70 <foo+0x70> | 68: c8250c82 stxp w5, x2, x3, [x4] | 6c: 35ffff45 cbnz w5, 54 <foo+0x54> | 70: d2800000 mov x0, #0x0 // #0 <--- BANG | 74: d50323bf autiasp | 78: d65f03c0 ret Notice that at the lines with "BANG" comments, GCC has assumed that the higher 8 bytes are unchanged by the cmpxchg_double() call, and that `hi_old ^ hi_new` can be reduced to a constant zero, for both LSE and LL/SC versions of cmpxchg_double(). This patch fixes the issue by passing a pointer to __uint128_t into the +Q constraint, ensuring that the compiler hazards against the entire 16 bytes being modified. With this change, GCC 12.1.0 compiles the above test as: | 0000000000000000 <foo>: | 0: f9400407 ldr x7, [x0, arm000#8] | 4: d503233f paciasp | 8: aa0003e4 mov x4, x0 | c: 1400000f b 48 <foo+0x48> | 10: d2800240 mov x0, #0x12 // torvalds#18 | 14: d2800681 mov x1, #0x34 // #52 | 18: aa0003e5 mov x5, x0 | 1c: aa0103e6 mov x6, x1 | 20: d2800ac2 mov x2, #0x56 // torvalds#86 | 24: d2800f03 mov x3, #0x78 // torvalds#120 | 28: 48207c82 casp x0, x1, x2, x3, [x4] | 2c: ca050000 eor x0, x0, x5 | 30: ca060021 eor x1, x1, x6 | 34: aa010000 orr x0, x0, x1 | 38: f9400480 ldr x0, [x4, arm000#8] | 3c: d50323bf autiasp | 40: ca0000e0 eor x0, x7, x0 | 44: d65f03c0 ret | 48: d2800240 mov x0, #0x12 // torvalds#18 | 4c: d2800681 mov x1, #0x34 // #52 | 50: d2800ac2 mov x2, #0x56 // torvalds#86 | 54: d2800f03 mov x3, #0x78 // torvalds#120 | 58: f9800091 prfm pstl1strm, [x4] | 5c: c87f1885 ldxp x5, x6, [x4] | 60: ca0000a5 eor x5, x5, x0 | 64: ca0100c6 eor x6, x6, x1 | 68: aa0600a6 orr x6, x5, x6 | 6c: b5000066 cbnz x6, 78 <foo+0x78> | 70: c8250c82 stxp w5, x2, x3, [x4] | 74: 35ffff45 cbnz w5, 5c <foo+0x5c> | 78: f9400480 ldr x0, [x4, arm000#8] | 7c: d50323bf autiasp | 80: ca0000e0 eor x0, x7, x0 | 84: d65f03c0 ret ... sampling the high 8 bytes before and after the cmpxchg, and performing an EOR, as we'd expect. For backporting, I've tested this atop linux-4.9.y with GCC 5.5.0. Note that linux-4.9.y is oldest currently supported stable release, and mandates GCC 5.1+. Unfortunately I couldn't get a GCC 5.1 binary to run on my machines due to library incompatibilities. I've also used a standalone test to check that we can use a __uint128_t pointer in a +Q constraint at least as far back as GCC 4.8.5 and LLVM 3.9.1. Fixes: 5284e1b ("arm64: xchg: Implement cmpxchg_double") Fixes: e9a4b79 ("arm64: cmpxchg_dbl: patch in lse instructions when supported by the CPU") Reported-by: Boqun Feng <[email protected]> Link: https://lore.kernel.org/lkml/Y6DEfQXymYVgL3oJ@boqun-archlinux/ Reported-by: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/lkml/[email protected]/ Signed-off-by: Mark Rutland <[email protected]> Cc: [email protected] Cc: Arnd Bergmann <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Steve Capper <[email protected]> Cc: Will Deacon <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Will Deacon <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall looking pretty good, just a few minor comments. If you can squash the patches into the following it would help for review / maintainance:
patch 1) device tree changes
patch 2) add driver/Kconfig/Makefile
patch 3) enable driver in bl808_defconfig
9b9c8fd
to
2b0b282
Compare
Commits squashed. |
This adds support for the BL808 watchdog. The implementation has a resolution of 1 second and the timeout range can be set from 1 - 2^14 seconds. To test with busybox's watchdog:
watchdog -T 10 -t 1 /dev/watchdog -F
This should run just fine.watchdog -T 1 -t 10 /dev/watchdog -F
.