4.9-1.0.x-imx stable 4.9.67 merge #21

agners · 2017-12-13T20:04:16Z

This merges the Linux 4.9.67 stable release into 4.9-1.0.x-imx.

[ Upstream commit 666b5d1 ] Remove spinlock and use the "rtc->ops_lock" from RTC subsystem instead. spin_lock_irqsave() is not needed here because we do not have hard IRQs. This patch fixes the following issue: root@GE004097290448 b850v3:~# hwclock --systohc root@GE004097290448 b850v3:~# hwclock --systohc root@GE004097290448 b850v3:~# hwclock --systohc root@GE004097290448 b850v3:~# hwclock --systohc root@GE004097290448 b850v3:~# hwclock --systohc [ 82.108175] BUG: spinlock wrong CPU on CPU#0, hwclock/855 [ 82.113660] lock: 0xedb4899c, .magic: dead4ead, .owner: hwclock/855, .owner_cpu: 1 [ 82.121329] CPU: 0 PID: 855 Comm: hwclock Not tainted 4.8.0-00042-g09d5410-dirty Freescale#20 [ 82.129078] Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree) [ 82.135609] Backtrace: [ 82.138090] [<8010d378>] (dump_backtrace) from [<8010d5c0>] (show_stack+0x20/0x24) [ 82.145664] r7:ec936000 r6:600a0013 r5:00000000 r4:81031680 [ 82.151402] [<8010d5a0>] (show_stack) from [<80401518>] (dump_stack+0xb4/0xe8) [ 82.158636] [<80401464>] (dump_stack) from [<8017b8b0>] (spin_dump+0x84/0xcc) [ 82.165775] r10:00000000 r9:ec936000 r8:81056090 r7:600a0013 r6:edb4899c r5:edb4899c [ 82.173691] r4:e5033e00 r3:00000000 [ 82.177308] [<8017b82c>] (spin_dump) from [<8017bcb0>] (do_raw_spin_unlock+0x108/0x130) [ 82.185314] r5:edb4899c r4:edb4899c [ 82.188938] [<8017bba8>] (do_raw_spin_unlock) from [<8094b93c>] (_raw_spin_unlock_irqrestore+0x34/0x54) [ 82.198333] r5:edb4899c r4:600a0013 [ 82.201953] [<8094b908>] (_raw_spin_unlock_irqrestore) from [<8065b090>] (rx8010_set_time+0x14c/0x188) [ 82.211261] r5:00000020 r4:edb48990 [ 82.214882] [<8065af44>] (rx8010_set_time) from [<80653fe4>] (rtc_set_time+0x70/0x104) [ 82.222801] r7:00000051 r6:edb39da0 r5:edb39c00 r4:ec937e8c [ 82.228535] [<80653f74>] (rtc_set_time) from [<80655774>] (rtc_dev_ioctl+0x3c4/0x674) [ 82.236368] r7:00000051 r6:7ecf1b74 r5:00000000 r4:edb39c00 [ 82.242106] [<806553b0>] (rtc_dev_ioctl) from [<80284034>] (do_vfs_ioctl+0xa4/0xa6c) [ 82.249851] r8:00000003 r7:80284a40 r6:ed1e9c80 r5:edb44e60 r4:7ecf1b74 [ 82.256642] [<80283f90>] (do_vfs_ioctl) from [<80284a40>] (SyS_ioctl+0x44/0x6c) [ 82.263953] r10:00000000 r9:ec936000 r8:7ecf1b74 r7:4024700a r6:ed1e9c80 r5:00000003 [ 82.271869] r4:ed1e9c80 [ 82.274432] [<802849fc>] (SyS_ioctl) from [<80108520>] (ret_fast_syscall+0x0/0x1c) [ 82.282005] r9:ec936000 r8:801086c4 r7:00000036 r6:00000000 r5:00000003 r4:0008e1bc root@GE004097290448 b850v3:~# Message from syslogd@GE004097290448 at Dec 3 11:17:08 ... kernel:[ 82.108175] BUG: spinlock wrong CPU on CPU#0, hwclock/855 Message from syslogd@GE004097290448 at Dec 3 11:17:08 ... kernel:[ 82.113660] lock: 0xedb4899c, .magic: dead4ead, .owner: hwclock/855, .owner_cpu: 1 hwclock --systohc root@GE004097290448 b850v3:~# Signed-off-by: Fabien Lahoudere <[email protected]> Signed-off-by: Alexandre Belloni <[email protected]> Signed-off-by: Sasha Levin <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>

… time [ Upstream commit 4949fc5 ] In order for the MSB -> LSB latching to work correctly we must read the 2 8 bit registers of a 15 bit value in one consecutive read. This fixes charge_full reporting 3498768 on some reads and 3354624 one other reads on my tablet (for the 3354624 value the raw LSB is 0x00). Signed-off-by: Hans de Goede <[email protected]> Signed-off-by: Sebastian Reichel <[email protected]> Signed-off-by: Sasha Levin <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>

… time [ Upstream commit 248efcf ] In order for the MSB -> LSB latching to work correctly we must read the 2 8 bit registers of a 12 bit value in one consecutive read. This fixes voltage_ocv reporting inconsistent values on my tablet. Signed-off-by: Hans de Goede <[email protected]> Signed-off-by: Sebastian Reichel <[email protected]> Signed-off-by: Sasha Levin <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>

[ Upstream commit ed3c177 ] The update of stream costs significantly, and we should avoid it unless the stream really has started. Check pipe->running flag instead of pipe->prepared. Signed-off-by: Takashi Iwai <[email protected]> Signed-off-by: Sasha Levin <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>

[ Upstream commit 874e1f6 ] The pseudo DMA transfer codes in VX222 and VX-pocket driver have a slight bug where they check the buffer boundary wrongly, and may overflow. Also, the zero sample count might be handled badly for the playback (although it shouldn't happen in theory). This patch addresses these issues. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=141541 Signed-off-by: Takashi Iwai <[email protected]> Signed-off-by: Sasha Levin <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>

[ Upstream commit 42f7f3c ] Add module alias for Sony ACX565AKM LCD panel. This makes it probe on Nokia N900 when panel driver is built as a module. Signed-off-by: Jarkko Nikula <[email protected]> Signed-off-by: Tomi Valkeinen <[email protected]> Signed-off-by: Sasha Levin <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>

[ Upstream commit cc21942 ] Once device_register is called for a device its attributes might be accessed. As the callbacks of a lcd device's attributes make use of the lcd_ops, the respective member must be setup before calling device_register. Signed-off-by: Uwe Kleine-König <[email protected]> Signed-off-by: Lee Jones <[email protected]> Signed-off-by: Sasha Levin <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>

[ Upstream commit 0eb3fba ] If adp5520_bl_setup() fails, sysfs group left unremoved. By the way, fix overcomplicated assignement of error code. Found by Linux Driver Verification project (linuxtesting.org). Signed-off-by: Alexey Khoroshilov <[email protected]> Acked-by: Michael Hennerich <[email protected]> Signed-off-by: Lee Jones <[email protected]> Signed-off-by: Sasha Levin <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>

[ Upstream commit 4b0ea93 ] Here, pci_iomap can fail, handle this case and return -ENOMEM. Signed-off-by: Arvind Yadav <[email protected]> Signed-off-by: Daniel Vetter <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected] Signed-off-by: Sasha Levin <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>

[ Upstream commit 28f1f9b ] ALC299 was similar as ALC225. Add headset support for ALC299. ALC3271 was for Dell rename. Signed-off-by: Kailang Yang <[email protected]> Signed-off-by: Takashi Iwai <[email protected]> Signed-off-by: Sasha Levin <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>

[ Upstream commit 0cc878d ] Nitro firmware is loaded into memory by the bootloader at a specific location. Set this memory range aside to prevent the kernel from using it. Signed-off-by: Jon Mason <[email protected]> Signed-off-by: Florian Fainelli <[email protected]> Signed-off-by: Sasha Levin <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>

[ Upstream commit 54f6d4c ] This patch ensures that the advertised link speeds are configured for X553 KR/KX backplane. Without this patch the link remains at 1G when resuming from low power after being downshifted by LPLU. Signed-off-by: Don Skidmore <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]> Signed-off-by: Sasha Levin <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>

[ Upstream commit 126db13 ] Make sure that we free the IRQs in ixgbe_io_error_detected() when responding to an PCIe AER error and also restore them when the interface recovers from it. Previously it was possible to trigger BUG_ON() check in free_msix_irqs() in the case where we call ixgbe_remove() after a failed recovery from AER error because the interrupts were not freed. Signed-off-by: Emil Tantilov <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]> Signed-off-by: Sasha Levin <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>

[ Upstream commit f7f37e7 ] When an interface is part of a namespace it is possible that ixgbe_close() may be called while __ixgbe_shutdown() is running which ends up in a double free WARN and/or a BUG in free_msi_irqs(). To handle this situation we extend the rtnl_lock() to protect the call to netif_device_detach() and ixgbe_clear_interrupt_scheme() in __ixgbe_shutdown() and check for netif_device_present() to avoid clearing the interrupts second time in ixgbe_close(); Also extend the rtnl lock in ixgbe_resume() to netif_device_attach(). Signed-off-by: Emil Tantilov <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]> Signed-off-by: Sasha Levin <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>

[ Upstream commit f215266 ] BaseT adapters that are capable of supporting 100Mb are not reporting this capability. This patch corrects the reporting so that 100Mb is shown as supported on those adapters. Signed-off-by: Tony Nguyen <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]> Signed-off-by: Sasha Levin <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>

[ Upstream commit 3f0d646 ] A retry count of 10 is likely to run into problems on X550 devices that have to detect and reset unresponsive CS4227 devices. So, reduce the I2C retry count to 3 for X550 and above. This should avoid any possible regressions in existing devices. Signed-off-by: Tony Nguyen <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]> Signed-off-by: Sasha Levin <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>

[ Upstream commit 2bf1a87 ] The indirection table was reported incorrectly for X550 and newer where we can support up to 64 RSS queues. Reported-by Krishneil Singh <[email protected]> Signed-off-by: Emil Tantilov <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]> Signed-off-by: Sasha Levin <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>

[ Upstream commit 1fe954b ] FEC is configured by the NVM and the driver should not be overriding it. Signed-off-by: Emil Tantilov <[email protected]> Tested-by: Krishneil Singh <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]> Signed-off-by: Sasha Levin <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>

[ Upstream commit 502c807 ] Fixed a sparse warning. Using function le16_to_cpus() to avoid double assignment. Signed-off-by: Jannik Becher <[email protected]> Tested-by: Larry Finger <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Sasha Levin <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>

[ Upstream commit 94500d5 ] drivers/staging/wilc1000/linux_wlan.c:995:18: warning: restricted __be16 degrades to integer Signed-off-by: Mike Kofron <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Sasha Levin <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>

[ Upstream commit f05a88a ] Add sanity checks for cport_quiesce and cport_clear before invoking the callbacks as these function pointers are not required during the host device registration. This follows the logic implemented elsewhere for various other function pointers. Signed-off-by: Jason Hrycay <[email protected]> Reviewed-by: Bryan O'Donoghue <[email protected]> Acked-by: Johan Hovold <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Sasha Levin <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>

[ Upstream commit 0888958 ] When building a kernel targeting a microMIPS ISA, recent GNU linkers will fail the link if they cannot determine that the target of a branch or jump is microMIPS code, with errors such as the following: mips-img-linux-gnu-ld: arch/mips/built-in.o: .text+0x542c: Unsupported jump between ISA modes; consider recompiling with interlinking enabled. mips-img-linux-gnu-ld: final link failed: Bad value or: ./arch/mips/include/asm/uaccess.h:1017: warning: JALX to a non-word-aligned address Placing anything other than an instruction at the start of a function written in assembly appears to trigger such errors. In order to prepare for allowing us to follow function prologue macros with an EXPORT_SYMBOL invocation, end the prologue macros (LEAD, NESTED & FEXPORT) with a .insn directive. This ensures that the start of the function is marked as code, which always makes sense for functions & safely prevents us from hitting the link errors described above. Signed-off-by: Paul Burton <[email protected]> Reviewed-by: Maciej W. Rozycki <[email protected]> Cc: [email protected] Patchwork: https://patchwork.linux-mips.org/patch/14508/ Signed-off-by: Ralf Baechle <[email protected]> Signed-off-by: Sasha Levin <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>

[ Upstream commit d9b5b65 ] Current init code initialises bootmem allocator with all of the low memory that it assumes is available, but does not check for reserved memory block, which can lead to corruption of data that may be stored there. Move bootmem's allocation map to a location that does not cross any reserved regions Signed-off-by: Marcin Nowakowski <[email protected]> Cc: [email protected] Patchwork: https://patchwork.linux-mips.org/patch/14609/ Signed-off-by: Ralf Baechle <[email protected]> Signed-off-by: Sasha Levin <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>

[ Upstream commit e89ef66 ] Memories managed through boot_mem_map are generally expected to define non-crossing areas. However, if part of a larger memory block is marked as reserved, it would still be added to bootmem allocator as an available block and could end up being overwritten by the allocator. Prevent this by explicitly marking the memory as reserved it if exists in the range used by bootmem allocator. Signed-off-by: Marcin Nowakowski <[email protected]> Cc: [email protected] Patchwork: https://patchwork.linux-mips.org/patch/14608/ Signed-off-by: Ralf Baechle <[email protected]> Signed-off-by: Sasha Levin <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>

[ Upstream commit 35e6de3 ] On systems with CM3, we must ensure that the L1 & L2 ECC enables are set to the same value. This is presumed by the hardware & cache corruption can occur when it is not the case. Support enabling & disabling the L2 ECC checking on CM3 systems where this is controlled via a GCR, and ensure that it matches the state of L1 ECC checking. Remove I6400 from the switch statement it will no longer hit, and which was incorrect since the L2 ECC enable bit isn't in the CP0 ErrCtl register. Signed-off-by: Paul Burton <[email protected]> Cc: [email protected] Patchwork: https://patchwork.linux-mips.org/patch/14413/ Signed-off-by: Ralf Baechle <[email protected]> Signed-off-by: Sasha Levin <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>

[ Upstream commit 9799270 ] Code in arch/mips/netlogic/common/irq.c which handles the XLP PIC fails to build in XLR configurations due to cpu_is_xlp9xx not being defined, leading to the following build failure: arch/mips/netlogic/common/irq.c: In function ‘xlp_of_pic_init’: arch/mips/netlogic/common/irq.c:298:2: error: implicit declaration of function ‘cpu_is_xlp9xx’ [-Werror=implicit-function-declaration] if (cpu_is_xlp9xx()) { ^ Although the code was conditional upon CONFIG_OF which is indirectly selected by CONFIG_NLM_XLP_BOARD but not CONFIG_NLM_XLR_BOARD, the failing XLR with CONFIG_OF configuration can be configured manually or by randconfig. Fix the build failure by making the affected XLP PIC code conditional upon CONFIG_CPU_XLP which is used to guard the inclusion of asm/netlogic/xlp-hal/xlp.h that provides the required cpu_is_xlp9xx function. [[email protected]: Fixed up as per Jayachandran's suggestion.] Signed-off-by: Paul Burton <[email protected]> Cc: Jayachandran C <[email protected]> Cc: [email protected] Patchwork: https://patchwork.linux-mips.org/patch/14524/ Signed-off-by: Ralf Baechle <[email protected]> Signed-off-by: Sasha Levin <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>

This reverts commit 6145171. The commit fixes a bug that was only introduced in 4.10, thus is irrelevant for <=4.9. Signed-off-by: Sasha Levin <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>

This reverts commit ad50561. There was a mixup with the commit message for two upstream commit that have the same subject line. This revert will be followed by the two commits with proper commit messages. Signed-off-by: Sasha Levin <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>

[ Upstream commit 1786dbf ] On the kernel side, sockaddr_storage is #define'd to __kernel_sockaddr_storage. Replacing struct sockaddr_storage with struct __kernel_sockaddr_storage defined by <linux/socket.h> fixes the following linux/rds.h userspace compilation error: /usr/include/linux/rds.h:226:26: error: field 'dest_addr' has incomplete type struct sockaddr_storage dest_addr; Signed-off-by: Dmitry V. Levin <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Sasha Levin <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>

[ Upstream commit feb0869 ] Consistently use types from linux/types.h to fix the following linux/rds.h userspace compilation errors: /usr/include/linux/rds.h:106:2: error: unknown type name 'uint8_t' uint8_t name[32]; /usr/include/linux/rds.h:107:2: error: unknown type name 'uint64_t' uint64_t value; /usr/include/linux/rds.h:117:2: error: unknown type name 'uint64_t' uint64_t next_tx_seq; /usr/include/linux/rds.h:118:2: error: unknown type name 'uint64_t' uint64_t next_rx_seq; /usr/include/linux/rds.h:121:2: error: unknown type name 'uint8_t' uint8_t transport[TRANSNAMSIZ]; /* null term ascii */ /usr/include/linux/rds.h:122:2: error: unknown type name 'uint8_t' uint8_t flags; /usr/include/linux/rds.h:129:2: error: unknown type name 'uint64_t' uint64_t seq; /usr/include/linux/rds.h:130:2: error: unknown type name 'uint32_t' uint32_t len; /usr/include/linux/rds.h:135:2: error: unknown type name 'uint8_t' uint8_t flags; /usr/include/linux/rds.h:139:2: error: unknown type name 'uint32_t' uint32_t sndbuf; /usr/include/linux/rds.h:144:2: error: unknown type name 'uint32_t' uint32_t rcvbuf; /usr/include/linux/rds.h:145:2: error: unknown type name 'uint64_t' uint64_t inum; /usr/include/linux/rds.h:153:2: error: unknown type name 'uint64_t' uint64_t hdr_rem; /usr/include/linux/rds.h:154:2: error: unknown type name 'uint64_t' uint64_t data_rem; /usr/include/linux/rds.h:155:2: error: unknown type name 'uint32_t' uint32_t last_sent_nxt; /usr/include/linux/rds.h:156:2: error: unknown type name 'uint32_t' uint32_t last_expected_una; /usr/include/linux/rds.h:157:2: error: unknown type name 'uint32_t' uint32_t last_seen_una; /usr/include/linux/rds.h:164:2: error: unknown type name 'uint8_t' uint8_t src_gid[RDS_IB_GID_LEN]; /usr/include/linux/rds.h:165:2: error: unknown type name 'uint8_t' uint8_t dst_gid[RDS_IB_GID_LEN]; /usr/include/linux/rds.h:167:2: error: unknown type name 'uint32_t' uint32_t max_send_wr; /usr/include/linux/rds.h:168:2: error: unknown type name 'uint32_t' uint32_t max_recv_wr; /usr/include/linux/rds.h:169:2: error: unknown type name 'uint32_t' uint32_t max_send_sge; /usr/include/linux/rds.h:170:2: error: unknown type name 'uint32_t' uint32_t rdma_mr_max; /usr/include/linux/rds.h:171:2: error: unknown type name 'uint32_t' uint32_t rdma_mr_size; /usr/include/linux/rds.h:212:9: error: unknown type name 'uint64_t' typedef uint64_t rds_rdma_cookie_t; /usr/include/linux/rds.h:215:2: error: unknown type name 'uint64_t' uint64_t addr; /usr/include/linux/rds.h:216:2: error: unknown type name 'uint64_t' uint64_t bytes; /usr/include/linux/rds.h:221:2: error: unknown type name 'uint64_t' uint64_t cookie_addr; /usr/include/linux/rds.h:222:2: error: unknown type name 'uint64_t' uint64_t flags; /usr/include/linux/rds.h:228:2: error: unknown type name 'uint64_t' uint64_t cookie_addr; /usr/include/linux/rds.h:229:2: error: unknown type name 'uint64_t' uint64_t flags; /usr/include/linux/rds.h:234:2: error: unknown type name 'uint64_t' uint64_t flags; /usr/include/linux/rds.h:240:2: error: unknown type name 'uint64_t' uint64_t local_vec_addr; /usr/include/linux/rds.h:241:2: error: unknown type name 'uint64_t' uint64_t nr_local; /usr/include/linux/rds.h:242:2: error: unknown type name 'uint64_t' uint64_t flags; /usr/include/linux/rds.h:243:2: error: unknown type name 'uint64_t' uint64_t user_token; /usr/include/linux/rds.h:248:2: error: unknown type name 'uint64_t' uint64_t local_addr; /usr/include/linux/rds.h:249:2: error: unknown type name 'uint64_t' uint64_t remote_addr; /usr/include/linux/rds.h:252:4: error: unknown type name 'uint64_t' uint64_t compare; /usr/include/linux/rds.h:253:4: error: unknown type name 'uint64_t' uint64_t swap; /usr/include/linux/rds.h:256:4: error: unknown type name 'uint64_t' uint64_t add; /usr/include/linux/rds.h:259:4: error: unknown type name 'uint64_t' uint64_t compare; /usr/include/linux/rds.h:260:4: error: unknown type name 'uint64_t' uint64_t swap; /usr/include/linux/rds.h:261:4: error: unknown type name 'uint64_t' uint64_t compare_mask; /usr/include/linux/rds.h:262:4: error: unknown type name 'uint64_t' uint64_t swap_mask; /usr/include/linux/rds.h:265:4: error: unknown type name 'uint64_t' uint64_t add; /usr/include/linux/rds.h:266:4: error: unknown type name 'uint64_t' uint64_t nocarry_mask; /usr/include/linux/rds.h:269:2: error: unknown type name 'uint64_t' uint64_t flags; /usr/include/linux/rds.h:270:2: error: unknown type name 'uint64_t' uint64_t user_token; /usr/include/linux/rds.h:274:2: error: unknown type name 'uint64_t' uint64_t user_token; /usr/include/linux/rds.h:275:2: error: unknown type name 'int32_t' int32_t status; Signed-off-by: Dmitry V. Levin <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Sasha Levin <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>

[ Upstream commit 0fbcc5b ] xt_rateest_net_exit() was added to check whether rules are flushed successfully. but ->net_exit() callback is called earlier than ->destroy() callback. So that ->net_exit() callback can't check that. test commands: %ip netns add vm1 %ip netns exec vm1 iptables -t mangle -I PREROUTING -p udp \ --dport 1111 -j RATEEST --rateest-name ap \ --rateest-interval 250ms --rateest-ewma 0.5s %ip netns del vm1 splat looks like: [ 668.813518] WARNING: CPU: 0 PID: 87 at net/netfilter/xt_RATEEST.c:210 xt_rateest_net_exit+0x210/0x340 [xt_RATEEST] [ 668.813518] Modules linked in: xt_RATEEST xt_tcpudp iptable_mangle bpfilter ip_tables x_tables [ 668.813518] CPU: 0 PID: 87 Comm: kworker/u4:2 Not tainted 4.19.0-rc7+ Freescale#21 [ 668.813518] Workqueue: netns cleanup_net [ 668.813518] RIP: 0010:xt_rateest_net_exit+0x210/0x340 [xt_RATEEST] [ 668.813518] Code: 00 48 8b 85 30 ff ff ff 4c 8b 23 80 38 00 0f 85 24 01 00 00 48 8b 85 30 ff ff ff 4d 85 e4 4c 89 a5 58 ff ff ff c6 00 f8 74 b2 <0f> 0b 48 83 c3 08 4c 39 f3 75 b0 48 b8 00 00 00 00 00 fc ff df 49 [ 668.813518] RSP: 0018:ffff8801156c73f8 EFLAGS: 00010282 [ 668.813518] RAX: ffffed0022ad8e85 RBX: ffff880118928e98 RCX: 5db8012a00000000 [ 668.813518] RDX: ffff8801156c7428 RSI: 00000000cb1d185f RDI: ffff880115663b74 [ 668.813518] RBP: ffff8801156c74d0 R08: ffff8801156633c0 R09: 1ffff100236440be [ 668.813518] R10: 0000000000000001 R11: ffffed002367d852 R12: ffff880115142b08 [ 668.813518] R13: 1ffff10022ad8e81 R14: ffff880118928ea8 R15: dffffc0000000000 [ 668.813518] FS: 0000000000000000(0000) GS:ffff88011b200000(0000) knlGS:0000000000000000 [ 668.813518] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 668.813518] CR2: 0000563aa69f4f28 CR3: 0000000105a16000 CR4: 00000000001006f0 [ 668.813518] Call Trace: [ 668.813518] ? unregister_netdevice_many+0xe0/0xe0 [ 668.813518] ? xt_rateest_net_init+0x2c0/0x2c0 [xt_RATEEST] [ 668.813518] ? default_device_exit+0x1ca/0x270 [ 668.813518] ? remove_proc_entry+0x1cd/0x390 [ 668.813518] ? dev_change_net_namespace+0xd00/0xd00 [ 668.813518] ? __init_waitqueue_head+0x130/0x130 [ 668.813518] ops_exit_list.isra.10+0x94/0x140 [ 668.813518] cleanup_net+0x45b/0x900 [ 668.813518] ? net_drop_ns+0x110/0x110 [ 668.813518] ? swapgs_restore_regs_and_return_to_usermode+0x3c/0x80 [ 668.813518] ? save_trace+0x300/0x300 [ 668.813518] ? lock_acquire+0x196/0x470 [ 668.813518] ? lock_acquire+0x196/0x470 [ 668.813518] ? process_one_work+0xb60/0x1de0 [ 668.813518] ? _raw_spin_unlock_irq+0x29/0x40 [ 668.813518] ? _raw_spin_unlock_irq+0x29/0x40 [ 668.813518] ? __lock_acquire+0x4500/0x4500 [ 668.813518] ? __lock_is_held+0xb4/0x140 [ 668.813518] process_one_work+0xc13/0x1de0 [ 668.813518] ? pwq_dec_nr_in_flight+0x3c0/0x3c0 [ 668.813518] ? set_load_weight+0x270/0x270 [ ... ] Fixes: 3427b2a ("netfilter: make xt_rateest hash table per net") Signed-off-by: Taehee Yoo <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]> Signed-off-by: Sasha Levin <[email protected]>

commit 36eb8ff upstream. Crash dump shows following instructions crash> bt PID: 0 TASK: ffffffffbe412480 CPU: 0 COMMAND: "swapper/0" #0 [ffff891ee0003868] machine_kexec at ffffffffbd063ef1 Freescale#1 [ffff891ee00038c8] __crash_kexec at ffffffffbd12b6f2 Freescale#2 [ffff891ee0003998] crash_kexec at ffffffffbd12c84c Freescale#3 [ffff891ee00039b8] oops_end at ffffffffbd030f0a Freescale#4 [ffff891ee00039e0] no_context at ffffffffbd074643 Freescale#5 [ffff891ee0003a40] __bad_area_nosemaphore at ffffffffbd07496e Freescale#6 [ffff891ee0003a90] bad_area_nosemaphore at ffffffffbd074a64 Freescale#7 [ffff891ee0003aa0] __do_page_fault at ffffffffbd074b0a Freescale#8 [ffff891ee0003b18] do_page_fault at ffffffffbd074fc8 Freescale#9 [ffff891ee0003b50] page_fault at ffffffffbda01925 [exception RIP: qlt_schedule_sess_for_deletion+15] RIP: ffffffffc02e526f RSP: ffff891ee0003c08 RFLAGS: 00010046 RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffffc0307847 RDX: 00000000000020e6 RSI: ffff891edbc377c8 RDI: 0000000000000000 RBP: ffff891ee0003c18 R8: ffffffffc02f0b20 R9: 0000000000000250 R10: 0000000000000258 R11: 000000000000b780 R12: ffff891ed9b43000 R13: 00000000000000f0 R14: 0000000000000006 R15: ffff891edbc377c8 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 Freescale#10 [ffff891ee0003c20] qla2x00_fcport_event_handler at ffffffffc02853d3 [qla2xxx] Freescale#11 [ffff891ee0003cf0] __dta_qla24xx_async_gnl_sp_done_333 at ffffffffc0285a1d [qla2xxx] Freescale#12 [ffff891ee0003de8] qla24xx_process_response_queue at ffffffffc02a2eb5 [qla2xxx] Freescale#13 [ffff891ee0003e88] qla24xx_msix_rsp_q at ffffffffc02a5403 [qla2xxx] Freescale#14 [ffff891ee0003ec0] __handle_irq_event_percpu at ffffffffbd0f4c59 Freescale#15 [ffff891ee0003f10] handle_irq_event_percpu at ffffffffbd0f4e02 Freescale#16 [ffff891ee0003f40] handle_irq_event at ffffffffbd0f4e90 Freescale#17 [ffff891ee0003f68] handle_edge_irq at ffffffffbd0f8984 Freescale#18 [ffff891ee0003f88] handle_irq at ffffffffbd0305d5 Freescale#19 [ffff891ee0003fb8] do_IRQ at ffffffffbda02a18 --- <IRQ stack> --- Freescale#20 [ffffffffbe403d30] ret_from_intr at ffffffffbda0094e [exception RIP: unknown or invalid address] RIP: 000000000000001f RSP: 0000000000000000 RFLAGS: fff3b8c2091ebb3f RAX: ffffbba5a0000200 RBX: 0000be8cdfa8f9fa RCX: 0000000000000018 RDX: 0000000000000101 RSI: 000000000000015d RDI: 0000000000000193 RBP: 0000000000000083 R8: ffffffffbe403e38 R9: 0000000000000002 R10: 0000000000000000 R11: ffffffffbe56b820 R12: ffff891ee001cf00 R13: ffffffffbd11c0a4 R14: ffffffffbe403d60 R15: 0000000000000001 ORIG_RAX: ffff891ee0022ac0 CS: 0000 SS: ffffffffffffffb9 bt: WARNING: possibly bogus exception frame Freescale#21 [ffffffffbe403dd8] cpuidle_enter_state at ffffffffbd67c6fd Freescale#22 [ffffffffbe403e40] cpuidle_enter at ffffffffbd67c907 Freescale#23 [ffffffffbe403e50] call_cpuidle at ffffffffbd0d98f3 Freescale#24 [ffffffffbe403e60] do_idle at ffffffffbd0d9b42 Freescale#25 [ffffffffbe403e98] cpu_startup_entry at ffffffffbd0d9da3 Freescale#26 [ffffffffbe403ec0] rest_init at ffffffffbd81d4aa Freescale#27 [ffffffffbe403ed0] start_kernel at ffffffffbe67d2ca Freescale#28 [ffffffffbe403f28] x86_64_start_reservations at ffffffffbe67c675 Freescale#29 [ffffffffbe403f38] x86_64_start_kernel at ffffffffbe67c6eb Freescale#30 [ffffffffbe403f50] secondary_startup_64 at ffffffffbd0000d5 Fixes: 040036b ("scsi: qla2xxx: Delay loop id allocation at login") Cc: <[email protected]> # v4.17+ Signed-off-by: Chuck Anderson <[email protected]> Signed-off-by: Himanshu Madhani <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 3998e6b upstream. When the final cifsFileInfo_put() is called from cifsiod and an oplock break work is queued, lockdep complains loudly: ============================================= [ INFO: possible recursive locking detected ] 4.11.0+ Freescale#21 Not tainted --------------------------------------------- kworker/0:2/78 is trying to acquire lock: ("cifsiod"){++++.+}, at: flush_work+0x215/0x350 but task is already holding lock: ("cifsiod"){++++.+}, at: process_one_work+0x255/0x8e0 other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock("cifsiod"); lock("cifsiod"); *** DEADLOCK *** May be due to missing lock nesting notation 2 locks held by kworker/0:2/78: #0: ("cifsiod"){++++.+}, at: process_one_work+0x255/0x8e0 Freescale#1: ((&wdata->work)){+.+...}, at: process_one_work+0x255/0x8e0 stack backtrace: CPU: 0 PID: 78 Comm: kworker/0:2 Not tainted 4.11.0+ Freescale#21 Workqueue: cifsiod cifs_writev_complete Call Trace: dump_stack+0x85/0xc2 __lock_acquire+0x17dd/0x2260 ? match_held_lock+0x20/0x2b0 ? trace_hardirqs_off_caller+0x86/0x130 ? mark_lock+0xa6/0x920 lock_acquire+0xcc/0x260 ? lock_acquire+0xcc/0x260 ? flush_work+0x215/0x350 flush_work+0x236/0x350 ? flush_work+0x215/0x350 ? destroy_worker+0x170/0x170 __cancel_work_timer+0x17d/0x210 ? ___preempt_schedule+0x16/0x18 cancel_work_sync+0x10/0x20 cifsFileInfo_put+0x338/0x7f0 cifs_writedata_release+0x2a/0x40 ? cifs_writedata_release+0x2a/0x40 cifs_writev_complete+0x29d/0x850 ? preempt_count_sub+0x18/0xd0 process_one_work+0x304/0x8e0 worker_thread+0x9b/0x6a0 kthread+0x1b2/0x200 ? process_one_work+0x8e0/0x8e0 ? kthread_create_on_node+0x40/0x40 ret_from_fork+0x31/0x40 This is a real warning. Since the oplock is queued on the same workqueue this can deadlock if there is only one worker thread active for the workqueue (which will be the case during memory pressure when the rescuer thread is handling it). Furthermore, there is at least one other kind of hang possible due to the oplock break handling if there is only worker. (This can be reproduced without introducing memory pressure by having passing 1 for the max_active parameter of cifsiod.) cifs_oplock_break() can wait indefintely in the filemap_fdatawait() while the cifs_writev_complete() work is blocked: sysrq: SysRq : Show Blocked State task PC stack pid father kworker/0:1 D 0 16 2 0x00000000 Workqueue: cifsiod cifs_oplock_break Call Trace: __schedule+0x562/0xf40 ? mark_held_locks+0x4a/0xb0 schedule+0x57/0xe0 io_schedule+0x21/0x50 wait_on_page_bit+0x143/0x190 ? add_to_page_cache_lru+0x150/0x150 __filemap_fdatawait_range+0x134/0x190 ? do_writepages+0x51/0x70 filemap_fdatawait_range+0x14/0x30 filemap_fdatawait+0x3b/0x40 cifs_oplock_break+0x651/0x710 ? preempt_count_sub+0x18/0xd0 process_one_work+0x304/0x8e0 worker_thread+0x9b/0x6a0 kthread+0x1b2/0x200 ? process_one_work+0x8e0/0x8e0 ? kthread_create_on_node+0x40/0x40 ret_from_fork+0x31/0x40 dd D 0 683 171 0x00000000 Call Trace: __schedule+0x562/0xf40 ? mark_held_locks+0x29/0xb0 schedule+0x57/0xe0 io_schedule+0x21/0x50 wait_on_page_bit+0x143/0x190 ? add_to_page_cache_lru+0x150/0x150 __filemap_fdatawait_range+0x134/0x190 ? do_writepages+0x51/0x70 filemap_fdatawait_range+0x14/0x30 filemap_fdatawait+0x3b/0x40 filemap_write_and_wait+0x4e/0x70 cifs_flush+0x6a/0xb0 filp_close+0x52/0xa0 __close_fd+0xdc/0x150 SyS_close+0x33/0x60 entry_SYSCALL_64_fastpath+0x1f/0xbe Showing all locks held in the system: 2 locks held by kworker/0:1/16: #0: ("cifsiod"){.+.+.+}, at: process_one_work+0x255/0x8e0 Freescale#1: ((&cfile->oplock_break)){+.+.+.}, at: process_one_work+0x255/0x8e0 Showing busy workqueues and worker pools: workqueue cifsiod: flags=0xc pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=1/1 in-flight: 16:cifs_oplock_break delayed: cifs_writev_complete, cifs_echo_request pool 0: cpus=0 node=0 flags=0x0 nice=0 hung=0s workers=3 idle: 750 3 Fix these problems by creating a a new workqueue (with a rescuer) for the oplock break work. Signed-off-by: Rabin Vincent <[email protected]> Signed-off-by: Steve French <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>

[ Upstream commit 45caeaa ] As Eric Dumazet pointed out this also needs to be fixed in IPv6. v2: Contains the IPv6 tcp/Ipv6 dccp patches as well. We have seen a few incidents lately where a dst_enty has been freed with a dangling TCP socket reference (sk->sk_dst_cache) pointing to that dst_entry. If the conditions/timings are right a crash then ensues when the freed dst_entry is referenced later on. A Common crashing back trace is: Freescale#8 [] page_fault at ffffffff8163e648 [exception RIP: __tcp_ack_snd_check+74] . . Freescale#9 [] tcp_rcv_established at ffffffff81580b64 Freescale#10 [] tcp_v4_do_rcv at ffffffff8158b54a Freescale#11 [] tcp_v4_rcv at ffffffff8158cd02 Freescale#12 [] ip_local_deliver_finish at ffffffff815668f4 Freescale#13 [] ip_local_deliver at ffffffff81566bd9 Freescale#14 [] ip_rcv_finish at ffffffff8156656d Freescale#15 [] ip_rcv at ffffffff81566f06 Freescale#16 [] __netif_receive_skb_core at ffffffff8152b3a2 Freescale#17 [] __netif_receive_skb at ffffffff8152b608 Freescale#18 [] netif_receive_skb at ffffffff8152b690 Freescale#19 [] vmxnet3_rq_rx_complete at ffffffffa015eeaf [vmxnet3] Freescale#20 [] vmxnet3_poll_rx_only at ffffffffa015f32a [vmxnet3] Freescale#21 [] net_rx_action at ffffffff8152bac2 Freescale#22 [] __do_softirq at ffffffff81084b4f Freescale#23 [] call_softirq at ffffffff8164845c Freescale#24 [] do_softirq at ffffffff81016fc5 Freescale#25 [] irq_exit at ffffffff81084ee5 Freescale#26 [] do_IRQ at ffffffff81648ff8 Of course it may happen with other NIC drivers as well. It's found the freed dst_entry here: 224 static bool tcp_in_quickack_mode(struct sock *sk)↩ 225 {↩ 226 ▹ const struct inet_connection_sock *icsk = inet_csk(sk);↩ 227 ▹ const struct dst_entry *dst = __sk_dst_get(sk);↩ 228 ↩ 229 ▹ return (dst && dst_metric(dst, RTAX_QUICKACK)) ||↩ 230 ▹ ▹ (icsk->icsk_ack.quick && !icsk->icsk_ack.pingpong);↩ 231 }↩ But there are other backtraces attributed to the same freed dst_entry in netfilter code as well. All the vmcores showed 2 significant clues: - Remote hosts behind the default gateway had always been redirected to a different gateway. A rtable/dst_entry will be added for that host. Making more dst_entrys with lower reference counts. Making this more probable. - All vmcores showed a postitive LockDroppedIcmps value, e.g: LockDroppedIcmps 267 A closer look at the tcp_v4_err() handler revealed that do_redirect() will run regardless of whether user space has the socket locked. This can result in a race condition where the same dst_entry cached in sk->sk_dst_entry can be decremented twice for the same socket via: do_redirect()->__sk_dst_check()-> dst_release(). Which leads to the dst_entry being prematurely freed with another socket pointing to it via sk->sk_dst_cache and a subsequent crash. To fix this skip do_redirect() if usespace has the socket locked. Instead let the redirect take place later when user space does not have the socket locked. The dccp/IPv6 code is very similar in this respect, so fixing it there too. As Eric Garver pointed out the following commit now invalidates routes. Which can set the dst->obsolete flag so that ipv4_dst_check() returns null and triggers the dst_release(). Fixes: ceb3320 ("ipv4: Kill routes during PMTU/redirect updates.") Cc: Eric Garver <[email protected]> Cc: Hannes Sowa <[email protected]> Signed-off-by: Jon Maxwell <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>

[ Upstream commit 85744e9 ] A bad job is the one triggered TDR(In the current amdgpu's implementation, actually all the jobs in the current joq-queue will be treated as bad jobs). In the recovery process, its fence will be fake signaled and as a result, the work behind will be scheduled to delete it from the mirror list, but if the TDR process is invoked before the work's execution, then this bad job might be processed again and the call dma_fence_set_error to its fence in TDR process will lead to kernel warning trace: [ 143.033605] WARNING: CPU: 2 PID: 53 at ./include/linux/dma-fence.h:437 amddrm_sched_job_recovery+0x1af/0x1c0 [amd_sched] kernel: [ 143.033606] Modules linked in: amdgpu(OE) amdchash(OE) amdttm(OE) amd_sched(OE) amdkcl(OE) amd_iommu_v2 drm_kms_helper drm i2c_algo_bit fb_sys_fops syscopyarea sysfillrect sysimgblt kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel aes_x86_64 snd_hda_codec_generic crypto_simd glue_helper cryptd snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_pcm snd_seq_midi snd_seq_midi_event snd_rawmidi snd_seq joydev snd_seq_device snd_timer snd soundcore binfmt_misc input_leds mac_hid serio_raw nfsd auth_rpcgss nfs_acl lockd grace sunrpc sch_fq_codel parport_pc ppdev lp parport ip_tables x_tables autofs4 8139too floppy psmouse 8139cp mii i2c_piix4 pata_acpi [ 143.033649] CPU: 2 PID: 53 Comm: kworker/2:1 Tainted: G OE 4.15.0-20-generic #21-Ubuntu [ 143.033650] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014 [ 143.033653] Workqueue: events drm_sched_job_timedout [amd_sched] [ 143.033656] RIP: 0010:amddrm_sched_job_recovery+0x1af/0x1c0 [amd_sched] [ 143.033657] RSP: 0018:ffffa9f880fe7d48 EFLAGS: 00010202 [ 143.033659] RAX: 0000000000000007 RBX: ffff9b98f2b24c00 RCX: ffff9b98efef4f08 [ 143.033660] RDX: ffff9b98f2b27400 RSI: ffff9b98f2b24c50 RDI: ffff9b98efef4f18 [ 143.033660] RBP: ffffa9f880fe7d98 R08: 0000000000000001 R09: 00000000000002b6 [ 143.033661] R10: 0000000000000000 R11: 0000000000000000 R12: ffff9b98efef3430 [ 143.033662] R13: ffff9b98efef4d80 R14: ffff9b98efef4e98 R15: ffff9b98eaf91c00 [ 143.033663] FS: 0000000000000000(0000) GS:ffff9b98ffd00000(0000) knlGS:0000000000000000 [ 143.033664] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 143.033665] CR2: 00007fc49c96d470 CR3: 000000001400a005 CR4: 00000000003606e0 [ 143.033669] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 143.033669] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 143.033670] Call Trace: [ 143.033744] amdgpu_device_gpu_recover+0x144/0x820 [amdgpu] [ 143.033788] amdgpu_job_timedout+0x9b/0xa0 [amdgpu] [ 143.033791] drm_sched_job_timedout+0xcc/0x150 [amd_sched] [ 143.033795] process_one_work+0x1de/0x410 [ 143.033797] worker_thread+0x32/0x410 [ 143.033799] kthread+0x121/0x140 [ 143.033801] ? process_one_work+0x410/0x410 [ 143.033803] ? kthread_create_worker_on_cpu+0x70/0x70 [ 143.033806] ret_from_fork+0x35/0x40 So just delete the bad job from mirror list directly Changes in v3: - Add a helper function to delete the bad jobs from mirror list and call it directly *before* the job's fence is signaled Changes in v2: - delete the useless list node check - also delete bad jobs in drm_sched_main because: kthread_unpark(ring->sched.thread) will be invoked very early before amdgpu_device_gpu_recover's return, then drm_sched_main will have chance to pick up a new job from the job queue. This new job will be added into the mirror list and processed by amdgpu_job_run, but may not be deleted from the mirror list on time due to the same reason. And finally re-processed by drm_sched_job_recovery Signed-off-by: Trigger Huang <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Signed-off-by: Sasha Levin <[email protected]>

[ Upstream commit b058809 ] A bug is present in GDB which causes early string termination when parsing variables. This has been reported [0], but we should ensure that we can support at least basic printing of the core kernel strings. For current gdb version (has been tested with 7.3 and 8.1), 'lx-version' only prints one character. (gdb) lx-version L(gdb) This can be fixed by casting 'linux_banner' as (char *). (gdb) lx-version Linux version 4.19.0-rc1+ (changbin@acer) (gcc version 7.3.0 (Ubuntu 7.3.0-16ubuntu3)) #21 SMP Sat Sep 1 21:43:30 CST 2018 [0] https://sourceware.org/bugzilla/show_bug.cgi?id=20077 [[email protected]: add detail to commit message] Link: http://lkml.kernel.org/r/[email protected] Fixes: 2d061d9 ("scripts/gdb: add version command") Signed-off-by: Du Changbin <[email protected]> Signed-off-by: Kieran Bingham <[email protected]> Acked-by: Jan Kiszka <[email protected]> Cc: Jan Kiszka <[email protected]> Cc: Jason Wessel <[email protected]> Cc: Daniel Thompson <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: Sasha Levin <[email protected]>

[ Upstream commit 2542604 ] When a matchall classifier is added, there is a small time interval in which tp->root is NULL. If we receive a packet in this small time slice a NULL pointer dereference will happen, leading to a kernel panic: # tc qdisc replace dev eth0 ingress # tc filter add dev eth0 parent ffff: matchall action gact drop Unable to handle kernel NULL pointer dereference at virtual address 0000000000000034 Mem abort info: ESR = 0x96000005 Exception class = DABT (current EL), IL = 32 bits SET = 0, FnV = 0 EA = 0, S1PTW = 0 Data abort info: ISV = 0, ISS = 0x00000005 CM = 0, WnR = 0 user pgtable: 4k pages, 39-bit VAs, pgdp = 00000000a623d530 [0000000000000034] pgd=0000000000000000, pud=0000000000000000 Internal error: Oops: 96000005 [#1] SMP Modules linked in: cls_matchall sch_ingress nls_iso8859_1 nls_cp437 vfat fat m25p80 spi_nor mtd xhci_plat_hcd xhci_hcd phy_generic sfp mdio_i2c usbcore i2c_mv64xxx marvell10g mvpp2 usb_common spi_orion mvmdio i2c_core sbsa_gwdt phylink ip_tables x_tables autofs4 Process ksoftirqd/0 (pid: 9, stack limit = 0x0000000009de7d62) CPU: 0 PID: 9 Comm: ksoftirqd/0 Not tainted 5.1.0-rc6 #21 Hardware name: Marvell 8040 MACCHIATOBin Double-shot (DT) pstate: 40000005 (nZcv daif -PAN -UAO) pc : mall_classify+0x28/0x78 [cls_matchall] lr : tcf_classify+0x78/0x138 sp : ffffff80109db9d0 x29: ffffff80109db9d0 x28: ffffffc426058800 x27: 0000000000000000 x26: ffffffc425b0dd00 x25: 0000000020000000 x24: 0000000000000000 x23: ffffff80109dbac0 x22: 0000000000000001 x21: ffffffc428ab5100 x20: ffffffc425b0dd00 x19: ffffff80109dbac0 x18: 0000000000000000 x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000 x14: 0000000000000000 x13: ffffffbf108ad288 x12: dead000000000200 x11: 00000000f0000000 x10: 0000000000000001 x9 : ffffffbf1089a220 x8 : 0000000000000001 x7 : ffffffbebffaa950 x6 : 0000000000000000 x5 : 000000442d6ba000 x4 : 0000000000000000 x3 : ffffff8008735ad8 x2 : ffffff80109dbac0 x1 : ffffffc425b0dd00 x0 : ffffff8010592078 Call trace: mall_classify+0x28/0x78 [cls_matchall] tcf_classify+0x78/0x138 __netif_receive_skb_core+0x29c/0xa20 __netif_receive_skb_one_core+0x34/0x60 __netif_receive_skb+0x28/0x78 netif_receive_skb_internal+0x2c/0xc0 napi_gro_receive+0x1a0/0x1d8 mvpp2_poll+0x928/0xb18 [mvpp2] net_rx_action+0x108/0x378 __do_softirq+0x128/0x320 run_ksoftirqd+0x44/0x60 smpboot_thread_fn+0x168/0x1b0 kthread+0x12c/0x130 ret_from_fork+0x10/0x1c Code: aa0203f3 aa1e03e0 d503201f f9400684 (b9403480) ---[ end trace fc71e2ef7b8ab5a5 ]--- Kernel panic - not syncing: Fatal exception in interrupt SMP: stopping secondary CPUs Kernel Offset: disabled CPU features: 0x002,00002000 Memory Limit: none Rebooting in 1 seconds.. Fix this by adding a NULL check in mall_classify(). Fixes: ed76f5e ("net: sched: protect filter_chain list with filter_chain_lock mutex") Signed-off-by: Matteo Croce <[email protected]> Acked-by: Cong Wang <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>

[ Upstream commit 1bc7896 ] When experimenting with bpf_send_signal() helper in our production environment (5.2 based), we experienced a deadlock in NMI mode: Freescale#5 [ffffc9002219f770] queued_spin_lock_slowpath at ffffffff8110be24 Freescale#6 [ffffc9002219f770] _raw_spin_lock_irqsave at ffffffff81a43012 Freescale#7 [ffffc9002219f780] try_to_wake_up at ffffffff810e7ecd Freescale#8 [ffffc9002219f7e0] signal_wake_up_state at ffffffff810c7b55 Freescale#9 [ffffc9002219f7f0] __send_signal at ffffffff810c8602 Freescale#10 [ffffc9002219f830] do_send_sig_info at ffffffff810ca31a Freescale#11 [ffffc9002219f868] bpf_send_signal at ffffffff8119d227 Freescale#12 [ffffc9002219f988] bpf_overflow_handler at ffffffff811d4140 Freescale#13 [ffffc9002219f9e0] __perf_event_overflow at ffffffff811d68cf Freescale#14 [ffffc9002219fa10] perf_swevent_overflow at ffffffff811d6a09 Freescale#15 [ffffc9002219fa38] ___perf_sw_event at ffffffff811e0f47 Freescale#16 [ffffc9002219fc30] __schedule at ffffffff81a3e04d Freescale#17 [ffffc9002219fc90] schedule at ffffffff81a3e219 Freescale#18 [ffffc9002219fca0] futex_wait_queue_me at ffffffff8113d1b9 Freescale#19 [ffffc9002219fcd8] futex_wait at ffffffff8113e529 Freescale#20 [ffffc9002219fdf0] do_futex at ffffffff8113ffbc Freescale#21 [ffffc9002219fec0] __x64_sys_futex at ffffffff81140d1c Freescale#22 [ffffc9002219ff38] do_syscall_64 at ffffffff81002602 Freescale#23 [ffffc9002219ff50] entry_SYSCALL_64_after_hwframe at ffffffff81c00068 The above call stack is actually very similar to an issue reported by Commit eac9153 ("bpf/stackmap: Fix deadlock with rq_lock in bpf_get_stack()") by Song Liu. The only difference is bpf_send_signal() helper instead of bpf_get_stack() helper. The above deadlock is triggered with a perf_sw_event. Similar to Commit eac9153, the below almost identical reproducer used tracepoint point sched/sched_switch so the issue can be easily caught. /* stress_test.c */ #include <stdio.h> #include <stdlib.h> #include <sys/mman.h> #include <pthread.h> #include <sys/types.h> #include <sys/stat.h> #include <fcntl.h> #define THREAD_COUNT 1000 char *filename; void *worker(void *p) { void *ptr; int fd; char *pptr; fd = open(filename, O_RDONLY); if (fd < 0) return NULL; while (1) { struct timespec ts = {0, 1000 + rand() % 2000}; ptr = mmap(NULL, 4096 * 64, PROT_READ, MAP_PRIVATE, fd, 0); usleep(1); if (ptr == MAP_FAILED) { printf("failed to mmap\n"); break; } munmap(ptr, 4096 * 64); usleep(1); pptr = malloc(1); usleep(1); pptr[0] = 1; usleep(1); free(pptr); usleep(1); nanosleep(&ts, NULL); } close(fd); return NULL; } int main(int argc, char *argv[]) { void *ptr; int i; pthread_t threads[THREAD_COUNT]; if (argc < 2) return 0; filename = argv[1]; for (i = 0; i < THREAD_COUNT; i++) { if (pthread_create(threads + i, NULL, worker, NULL)) { fprintf(stderr, "Error creating thread\n"); return 0; } } for (i = 0; i < THREAD_COUNT; i++) pthread_join(threads[i], NULL); return 0; } and the following command: 1. run `stress_test /bin/ls` in one windown 2. hack bcc trace.py with the following change: # --- a/tools/trace.py # +++ b/tools/trace.py @@ -513,6 +513,7 @@ BPF_PERF_OUTPUT(%s); __data.tgid = __tgid; __data.pid = __pid; bpf_get_current_comm(&__data.comm, sizeof(__data.comm)); + bpf_send_signal(10); %s %s %s.perf_submit(%s, &__data, sizeof(__data)); 3. in a different window run ./trace.py -p $(pidof stress_test) t:sched:sched_switch The deadlock can be reproduced in our production system. Similar to Song's fix, the fix is to delay sending signal if irqs is disabled to avoid deadlocks involving with rq_lock. With this change, my above stress-test in our production system won't cause deadlock any more. I also implemented a scale-down version of reproducer in the selftest (a subsequent commit). With latest bpf-next, it complains for the following potential deadlock. [ 32.832450] -> Freescale#1 (&p->pi_lock){-.-.}: [ 32.833100] _raw_spin_lock_irqsave+0x44/0x80 [ 32.833696] task_rq_lock+0x2c/0xa0 [ 32.834182] task_sched_runtime+0x59/0xd0 [ 32.834721] thread_group_cputime+0x250/0x270 [ 32.835304] thread_group_cputime_adjusted+0x2e/0x70 [ 32.835959] do_task_stat+0x8a7/0xb80 [ 32.836461] proc_single_show+0x51/0xb0 ... [ 32.839512] -> #0 (&(&sighand->siglock)->rlock){....}: [ 32.840275] __lock_acquire+0x1358/0x1a20 [ 32.840826] lock_acquire+0xc7/0x1d0 [ 32.841309] _raw_spin_lock_irqsave+0x44/0x80 [ 32.841916] __lock_task_sighand+0x79/0x160 [ 32.842465] do_send_sig_info+0x35/0x90 [ 32.842977] bpf_send_signal+0xa/0x10 [ 32.843464] bpf_prog_bc13ed9e4d3163e3_send_signal_tp_sched+0x465/0x1000 [ 32.844301] trace_call_bpf+0x115/0x270 [ 32.844809] perf_trace_run_bpf_submit+0x4a/0xc0 [ 32.845411] perf_trace_sched_switch+0x10f/0x180 [ 32.846014] __schedule+0x45d/0x880 [ 32.846483] schedule+0x5f/0xd0 ... [ 32.853148] Chain exists of: [ 32.853148] &(&sighand->siglock)->rlock --> &p->pi_lock --> &rq->lock [ 32.853148] [ 32.854451] Possible unsafe locking scenario: [ 32.854451] [ 32.855173] CPU0 CPU1 [ 32.855745] ---- ---- [ 32.856278] lock(&rq->lock); [ 32.856671] lock(&p->pi_lock); [ 32.857332] lock(&rq->lock); [ 32.857999] lock(&(&sighand->siglock)->rlock); Deadlock happens on CPU0 when it tries to acquire &sighand->siglock but it has been held by CPU1 and CPU1 tries to grab &rq->lock and cannot get it. This is not exactly the callstack in our production environment, but sympotom is similar and both locks are using spin_lock_irqsave() to acquire the lock, and both involves rq_lock. The fix to delay sending signal when irq is disabled also fixed this issue. Signed-off-by: Yonghong Song <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Cc: Song Liu <[email protected]> Link: https://lore.kernel.org/bpf/[email protected] Signed-off-by: Sasha Levin <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>

[ Upstream commit e24c644 ] I compiled with AddressSanitizer and I had these memory leaks while I was using the tep_parse_format function: Direct leak of 28 byte(s) in 4 object(s) allocated from: #0 0x7fb07db49ffe in __interceptor_realloc (/lib/x86_64-linux-gnu/libasan.so.5+0x10dffe) Freescale#1 0x7fb07a724228 in extend_token /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:985 Freescale#2 0x7fb07a724c21 in __read_token /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:1140 Freescale#3 0x7fb07a724f78 in read_token /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:1206 Freescale#4 0x7fb07a725191 in __read_expect_type /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:1291 Freescale#5 0x7fb07a7251df in read_expect_type /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:1299 Freescale#6 0x7fb07a72e6c8 in process_dynamic_array_len /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:2849 Freescale#7 0x7fb07a7304b8 in process_function /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:3161 Freescale#8 0x7fb07a730900 in process_arg_token /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:3207 Freescale#9 0x7fb07a727c0b in process_arg /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:1786 Freescale#10 0x7fb07a731080 in event_read_print_args /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:3285 Freescale#11 0x7fb07a731722 in event_read_print /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:3369 Freescale#12 0x7fb07a740054 in __tep_parse_format /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:6335 Freescale#13 0x7fb07a74047a in __parse_event /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:6389 Freescale#14 0x7fb07a740536 in tep_parse_format /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:6431 Freescale#15 0x7fb07a785acf in parse_event ../../../src/fs-src/fs.c:251 Freescale#16 0x7fb07a785ccd in parse_systems ../../../src/fs-src/fs.c:284 Freescale#17 0x7fb07a786fb3 in read_metadata ../../../src/fs-src/fs.c:593 Freescale#18 0x7fb07a78760e in ftrace_fs_source_init ../../../src/fs-src/fs.c:727 Freescale#19 0x7fb07d90c19c in add_component_with_init_method_data ../../../../src/lib/graph/graph.c:1048 Freescale#20 0x7fb07d90c87b in add_source_component_with_initialize_method_data ../../../../src/lib/graph/graph.c:1127 Freescale#21 0x7fb07d90c92a in bt_graph_add_source_component ../../../../src/lib/graph/graph.c:1152 Freescale#22 0x55db11aa632e in cmd_run_ctx_create_components_from_config_components ../../../src/cli/babeltrace2.c:2252 Freescale#23 0x55db11aa6fda in cmd_run_ctx_create_components ../../../src/cli/babeltrace2.c:2347 Freescale#24 0x55db11aa780c in cmd_run ../../../src/cli/babeltrace2.c:2461 Freescale#25 0x55db11aa8a7d in main ../../../src/cli/babeltrace2.c:2673 Freescale#26 0x7fb07d5460b2 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x270b2) The token variable in the process_dynamic_array_len function is allocated in the read_expect_type function, but is not freed before calling the read_token function. Free the token variable before calling read_token in order to plug the leak. Signed-off-by: Philippe Duplessis-Guindon <[email protected]> Reviewed-by: Steven Rostedt (VMware) <[email protected]> Link: https://lore.kernel.org/linux-trace-devel/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Sasha Levin <[email protected]>

commit b058809 upstream A bug is present in GDB which causes early string termination when parsing variables. This has been reported [0], but we should ensure that we can support at least basic printing of the core kernel strings. For current gdb version (has been tested with 7.3 and 8.1), 'lx-version' only prints one character. (gdb) lx-version L(gdb) This can be fixed by casting 'linux_banner' as (char *). (gdb) lx-version Linux version 4.19.0-rc1+ (changbin@acer) (gcc version 7.3.0 (Ubuntu 7.3.0-16ubuntu3)) Freescale#21 SMP Sat Sep 1 21:43:30 CST 2018 [0] https://sourceware.org/bugzilla/show_bug.cgi?id=20077 [[email protected]: add detail to commit message] Link: http://lkml.kernel.org/r/[email protected] Fixes: 2d061d9 ("scripts/gdb: add version command") Signed-off-by: Du Changbin <[email protected]> Signed-off-by: Kieran Bingham <[email protected]> Acked-by: Jan Kiszka <[email protected]> Cc: Jan Kiszka <[email protected]> Cc: Jason Wessel <[email protected]> Cc: Daniel Thompson <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: Florian Fainelli <[email protected]> Signed-off-by: Sasha Levin <[email protected]>

[ Upstream commit 43e8f76 ] When using kprobe on powerpc booke series processor, Oops happens as show bellow: / # echo "p:myprobe do_nanosleep" > /sys/kernel/debug/tracing/kprobe_events / # echo 1 > /sys/kernel/debug/tracing/events/kprobes/myprobe/enable / # sleep 1 [ 50.076730] Oops: Exception in kernel mode, sig: 5 [Freescale#1] [ 50.077017] BE PAGE_SIZE=4K SMP NR_CPUS=24 QEMU e500 [ 50.077221] Modules linked in: [ 50.077462] CPU: 0 PID: 77 Comm: sleep Not tainted 5.14.0-rc4-00022-g251a1524293d Freescale#21 [ 50.077887] NIP: c0b9c4e0 LR: c00ebecc CTR: 00000000 [ 50.078067] REGS: c3883de0 TRAP: 0700 Not tainted (5.14.0-rc4-00022-g251a1524293d) [ 50.078349] MSR: 00029000 <CE,EE,ME> CR: 24000228 XER: 20000000 [ 50.078675] [ 50.078675] GPR00: c00ebdf0 c3883e90 c313e300 c3883ea0 00000001 00000000 c3883ecc 00000001 [ 50.078675] GPR08: c100598c c00ea250 00000004 00000000 24000222 102490c2 bff4180c 101e60d4 [ 50.078675] GPR16: 00000000 102454ac 00000040 10240000 10241100 102410f 10240000 00500000 [ 50.078675] GPR24: 00000002 00000000 c3883ea0 00000001 00000000 0000c350 3b9b8d50 00000000 [ 50.080151] NIP [c0b9c4e0] do_nanosleep+0x0/0x190 [ 50.080352] LR [c00ebecc] hrtimer_nanosleep+0x14c/0x1e0 [ 50.080638] Call Trace: [ 50.080801] [c3883e90] [c00ebdf0] hrtimer_nanosleep+0x70/0x1e0 (unreliable) [ 50.081110] [c3883f00] [c00ec004] sys_nanosleep_time32+0xa4/0x110 [ 50.081336] [c3883f40] [c001509c] ret_from_syscall+0x0/0x28 [ 50.081541] --- interrupt: c00 at 0x100a4d08 [ 50.081749] NIP: 100a4d08 LR: 101b5234 CTR: 00000003 [ 50.081931] REGS: c3883f50 TRAP: 0c00 Not tainted (5.14.0-rc4-00022-g251a1524293d) [ 50.082183] MSR: 0002f902 <CE,EE,PR,FP,ME> CR: 24000222 XER: 00000000 [ 50.082457] [ 50.082457] GPR00: 000000a2 bf980040 1024b4d0 bf980084 bf980084 64000000 00555345 fefefeff [ 50.082457] GPR08: 7f7f7f7f 101e0000 00000069 00000003 28000422 102490c2 bff4180c 101e60d4 [ 50.082457] GPR16: 00000000 102454ac 00000040 10240000 10241100 102410f 10240000 00500000 [ 50.082457] GPR24: 00000002 bf9803f4 10240000 00000000 00000000 100039e0 00000000 102444e8 [ 50.083789] NIP [100a4d08] 0x100a4d08 [ 50.083917] LR [101b5234] 0x101b5234 [ 50.084042] --- interrupt: c00 [ 50.084238] Instruction dump: [ 50.084483] 4bfffc40 60000000 60000000 60000000 9421fff0 39400402 914200c0 38210010 [ 50.084841] 4bfffc20 00000000 00000000 00000000 <7fe00008> 7c0802a6 7c892378 93c10048 [ 50.085487] ---[ end trace f6fffe98e2fa8f3e ]--- [ 50.085678] Trace/breakpoint trap There is no real mode for booke arch and the MMU translation is always on. The corresponding MSR_IS/MSR_DS bit in booke is used to switch the address space, but not for real mode judgment. Fixes: 21f8b2f ("powerpc/kprobes: Ignore traps that happened in real mode") Signed-off-by: Pu Lehui <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sasha Levin <[email protected]>

[ Upstream commit a06c2e5 ] The id of slv_cnoc_mnoc_cfg node is mistakenly coded as id of slv_blsp_1. It causes the following warning on slv_blsp_1 node adding. Correct the id of slv_cnoc_mnoc_cfg node. [ 1.948180] ------------[ cut here ]------------ [ 1.954122] WARNING: CPU: 2 PID: 7 at drivers/interconnect/core.c:962 icc_node_add+0xe4/0xf8 [ 1.958994] Modules linked in: [ 1.967399] CPU: 2 PID: 7 Comm: kworker/u16:0 Not tainted 5.14.0-rc6-next-20210818 Freescale#21 [ 1.970275] Hardware name: Xiaomi Redmi Note 7 (DT) [ 1.978169] Workqueue: events_unbound deferred_probe_work_func [ 1.982945] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 1.988849] pc : icc_node_add+0xe4/0xf8 [ 1.995699] lr : qnoc_probe+0x350/0x438 [ 1.999519] sp : ffff80001008bb10 [ 2.003337] x29: ffff80001008bb10 x28: 000000000000001a x27: ffffb83ddc61ee28 [ 2.006818] x26: ffff2fe341d44080 x25: ffff2fe340f3aa80 x24: ffffb83ddc98f0e8 [ 2.013938] x23: 0000000000000024 x22: ffff2fe3408b7400 x21: 0000000000000000 [ 2.021054] x20: ffff2fe3408b7410 x19: ffff2fe341d44080 x18: 0000000000000010 [ 2.028173] x17: ffff2fe3bdd0aac0 x16: 0000000000000281 x15: ffff2fe3400f5528 [ 2.035290] x14: 000000000000013f x13: ffff2fe3400f5528 x12: 00000000ffffffea [ 2.042410] x11: ffffb83ddc9109d0 x10: ffffb83ddc8f8990 x9 : ffffb83ddc8f89e8 [ 2.049527] x8 : 0000000000017fe8 x7 : c0000000ffffefff x6 : 0000000000000001 [ 2.056645] x5 : 0000000000057fa8 x4 : 0000000000000000 x3 : ffffb83ddc9903b0 [ 2.063764] x2 : 1a1f6fde34d45500 x1 : ffff2fe340f3a880 x0 : ffff2fe340f3a880 [ 2.070882] Call trace: [ 2.077989] icc_node_add+0xe4/0xf8 [ 2.080247] qnoc_probe+0x350/0x438 [ 2.083718] platform_probe+0x68/0xd8 [ 2.087191] really_probe+0xb8/0x300 [ 2.091011] __driver_probe_device+0x78/0xe0 [ 2.094659] driver_probe_device+0x80/0x110 [ 2.098911] __device_attach_driver+0x90/0xe0 [ 2.102818] bus_for_each_drv+0x78/0xc8 [ 2.107331] __device_attach+0xf0/0x150 [ 2.110977] device_initial_probe+0x14/0x20 [ 2.114796] bus_probe_device+0x9c/0xa8 [ 2.118963] deferred_probe_work_func+0x88/0xc0 [ 2.122784] process_one_work+0x1a4/0x338 [ 2.127296] worker_thread+0x1f8/0x420 [ 2.131464] kthread+0x150/0x160 [ 2.135107] ret_from_fork+0x10/0x20 [ 2.138495] ---[ end trace 5eea8768cb620e87 ]--- Signed-off-by: Shawn Guo <[email protected]> Reviewed-by: Bjorn Andersson <[email protected]> Fixes: f80a1d4 ("interconnect: qcom: Add SDM660 interconnect provider driver") Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Georgi Djakov <[email protected]> Signed-off-by: Sasha Levin <[email protected]>

[ Upstream commit 4224cfd ] When bringing down the netdevice or system shutdown, a panic can be triggered while accessing the sysfs path because the device is already removed. [ 755.549084] mlx5_core 0000:12:00.1: Shutdown was called [ 756.404455] mlx5_core 0000:12:00.0: Shutdown was called ... [ 757.937260] BUG: unable to handle kernel NULL pointer dereference at (null) [ 758.031397] IP: [<ffffffff8ee11acb>] dma_pool_alloc+0x1ab/0x280 crash> bt ... PID: 12649 TASK: ffff8924108f2100 CPU: 1 COMMAND: "amsd" ... Freescale#9 [ffff89240e1a38b0] page_fault at ffffffff8f38c778 [exception RIP: dma_pool_alloc+0x1ab] RIP: ffffffff8ee11acb RSP: ffff89240e1a3968 RFLAGS: 00010046 RAX: 0000000000000246 RBX: ffff89243d874100 RCX: 0000000000001000 RDX: 0000000000000000 RSI: 0000000000000246 RDI: ffff89243d874090 RBP: ffff89240e1a39c0 R8: 000000000001f080 R9: ffff8905ffc03c00 R10: ffffffffc04680d4 R11: ffffffff8edde9fd R12: 00000000000080d0 R13: ffff89243d874090 R14: ffff89243d874080 R15: 0000000000000000 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 Freescale#10 [ffff89240e1a39c8] mlx5_alloc_cmd_msg at ffffffffc04680f3 [mlx5_core] Freescale#11 [ffff89240e1a3a18] cmd_exec at ffffffffc046ad62 [mlx5_core] Freescale#12 [ffff89240e1a3ab8] mlx5_cmd_exec at ffffffffc046b4fb [mlx5_core] Freescale#13 [ffff89240e1a3ae8] mlx5_core_access_reg at ffffffffc0475434 [mlx5_core] Freescale#14 [ffff89240e1a3b40] mlx5e_get_fec_caps at ffffffffc04a7348 [mlx5_core] Freescale#15 [ffff89240e1a3bb0] get_fec_supported_advertised at ffffffffc04992bf [mlx5_core] Freescale#16 [ffff89240e1a3c08] mlx5e_get_link_ksettings at ffffffffc049ab36 [mlx5_core] Freescale#17 [ffff89240e1a3ce8] __ethtool_get_link_ksettings at ffffffff8f25db46 Freescale#18 [ffff89240e1a3d48] speed_show at ffffffff8f277208 Freescale#19 [ffff89240e1a3dd8] dev_attr_show at ffffffff8f0b70e3 Freescale#20 [ffff89240e1a3df8] sysfs_kf_seq_show at ffffffff8eedbedf Freescale#21 [ffff89240e1a3e18] kernfs_seq_show at ffffffff8eeda596 Freescale#22 [ffff89240e1a3e28] seq_read at ffffffff8ee76d10 Freescale#23 [ffff89240e1a3e98] kernfs_fop_read at ffffffff8eedaef5 Freescale#24 [ffff89240e1a3ed8] vfs_read at ffffffff8ee4e3ff Freescale#25 [ffff89240e1a3f08] sys_read at ffffffff8ee4f27f Freescale#26 [ffff89240e1a3f50] system_call_fastpath at ffffffff8f395f92 crash> net_device.state ffff89443b0c0000 state = 0x5 (__LINK_STATE_START| __LINK_STATE_NOCARRIER) To prevent this scenario, we also make sure that the netdevice is present. Signed-off-by: suresh kumar <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Sasha Levin <[email protected]>

[ Upstream commit 9de255c ] Commit b743512 ("uio: fix a sleep-in-atomic-context bug in uio_dmem_genirq_irqcontrol()") started calling disable_irq() without holding the spinlock because it can sleep. However, that fix introduced another bug: if interrupt is already disabled and a new disable request comes in, then the spinlock is not unlocked: root@localhost:~# printf '\x00\x00\x00\x00' > /dev/uio0 root@localhost:~# printf '\x00\x00\x00\x00' > /dev/uio0 root@localhost:~# [ 14.851538] BUG: scheduling while atomic: bash/223/0x00000002 [ 14.851991] Modules linked in: uio_dmem_genirq uio myfpga(OE) bochs drm_vram_helper drm_ttm_helper ttm drm_kms_helper drm snd_pcm ppdev joydev psmouse snd_timer snd e1000fb_sys_fops syscopyarea parport sysfillrect soundcore sysimgblt input_leds pcspkr i2c_piix4 serio_raw floppy evbug qemu_fw_cfg mac_hid pata_acpi ip_tables x_tables autofs4 [last unloaded: parport_pc] [ 14.854206] CPU: 0 PID: 223 Comm: bash Tainted: G OE 6.0.0-rc7 #21 [ 14.854786] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014 [ 14.855664] Call Trace: [ 14.855861] <TASK> [ 14.856025] dump_stack_lvl+0x4d/0x67 [ 14.856325] dump_stack+0x14/0x1a [ 14.856583] __schedule_bug.cold+0x4b/0x5c [ 14.856915] __schedule+0xe81/0x13d0 [ 14.857199] ? idr_find+0x13/0x20 [ 14.857456] ? get_work_pool+0x2d/0x50 [ 14.857756] ? __flush_work+0x233/0x280 [ 14.858068] ? __schedule+0xa95/0x13d0 [ 14.858307] ? idr_find+0x13/0x20 [ 14.858519] ? get_work_pool+0x2d/0x50 [ 14.858798] schedule+0x6c/0x100 [ 14.859009] schedule_hrtimeout_range_clock+0xff/0x110 [ 14.859335] ? tty_write_room+0x1f/0x30 [ 14.859598] ? n_tty_poll+0x1ec/0x220 [ 14.859830] ? tty_ldisc_deref+0x1a/0x20 [ 14.860090] schedule_hrtimeout_range+0x17/0x20 [ 14.860373] do_select+0x596/0x840 [ 14.860627] ? __kernel_text_address+0x16/0x50 [ 14.860954] ? poll_freewait+0xb0/0xb0 [ 14.861235] ? poll_freewait+0xb0/0xb0 [ 14.861517] ? rpm_resume+0x49d/0x780 [ 14.861798] ? common_interrupt+0x59/0xa0 [ 14.862127] ? asm_common_interrupt+0x2b/0x40 [ 14.862511] ? __uart_start.isra.0+0x61/0x70 [ 14.862902] ? __check_object_size+0x61/0x280 [ 14.863255] core_sys_select+0x1c6/0x400 [ 14.863575] ? vfs_write+0x1c9/0x3d0 [ 14.863853] ? vfs_write+0x1c9/0x3d0 [ 14.864121] ? _copy_from_user+0x45/0x70 [ 14.864526] do_pselect.constprop.0+0xb3/0xf0 [ 14.864893] ? do_syscall_64+0x6d/0x90 [ 14.865228] ? do_syscall_64+0x6d/0x90 [ 14.865556] __x64_sys_pselect6+0x76/0xa0 [ 14.865906] do_syscall_64+0x60/0x90 [ 14.866214] ? syscall_exit_to_user_mode+0x2a/0x50 [ 14.866640] ? do_syscall_64+0x6d/0x90 [ 14.866972] ? do_syscall_64+0x6d/0x90 [ 14.867286] ? do_syscall_64+0x6d/0x90 [ 14.867626] entry_SYSCALL_64_after_hwframe+0x63/0xcd [...] stripped [ 14.872959] </TASK> ('myfpga' is a simple 'uio_dmem_genirq' driver I wrote to test this) The implementation of "uio_dmem_genirq" was based on "uio_pdrv_genirq" and it is used in a similar manner to the "uio_pdrv_genirq" driver with respect to interrupt configuration and handling. At the time "uio_dmem_genirq" was introduced, both had the same implementation of the 'uio_info' handlers irqcontrol() and handler(). Then commit 34cb275 ("UIO: Fix concurrency issue"), which was only applied to "uio_pdrv_genirq", ended up making them a little different. That commit, among other things, changed disable_irq() to disable_irq_nosync() in the implementation of irqcontrol(). The motivation there was to avoid a deadlock between irqcontrol() and handler(), since it added a spinlock in the irq handler, and disable_irq() waits for the completion of the irq handler. By changing disable_irq() to disable_irq_nosync() in irqcontrol(), we also avoid the sleeping-while-atomic bug that commit b743512 ("uio: fix a sleep-in-atomic-context bug in uio_dmem_genirq_irqcontrol()") was trying to fix. Thus, this fixes the missing unlock in irqcontrol() by importing the implementation of irqcontrol() handler from the "uio_pdrv_genirq" driver. In the end, it reverts commit b743512 ("uio: fix a sleep-in-atomic-context bug in uio_dmem_genirq_irqcontrol()") and change disable_irq() to disable_irq_nosync(). It is worth noting that this still does not address the concurrency issue fixed by commit 34cb275 ("UIO: Fix concurrency issue"). It will be addressed separately in the next commits. Split out from commit 34cb275 ("UIO: Fix concurrency issue"). Fixes: b743512 ("uio: fix a sleep-in-atomic-context bug in uio_dmem_genirq_irqcontrol()") Signed-off-by: Rafael Mendonca <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Sasha Levin <[email protected]>

[ Upstream commit 9de255c ] Commit b743512 ("uio: fix a sleep-in-atomic-context bug in uio_dmem_genirq_irqcontrol()") started calling disable_irq() without holding the spinlock because it can sleep. However, that fix introduced another bug: if interrupt is already disabled and a new disable request comes in, then the spinlock is not unlocked: root@localhost:~# printf '\x00\x00\x00\x00' > /dev/uio0 root@localhost:~# printf '\x00\x00\x00\x00' > /dev/uio0 root@localhost:~# [ 14.851538] BUG: scheduling while atomic: bash/223/0x00000002 [ 14.851991] Modules linked in: uio_dmem_genirq uio myfpga(OE) bochs drm_vram_helper drm_ttm_helper ttm drm_kms_helper drm snd_pcm ppdev joydev psmouse snd_timer snd e1000fb_sys_fops syscopyarea parport sysfillrect soundcore sysimgblt input_leds pcspkr i2c_piix4 serio_raw floppy evbug qemu_fw_cfg mac_hid pata_acpi ip_tables x_tables autofs4 [last unloaded: parport_pc] [ 14.854206] CPU: 0 PID: 223 Comm: bash Tainted: G OE 6.0.0-rc7 Freescale#21 [ 14.854786] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014 [ 14.855664] Call Trace: [ 14.855861] <TASK> [ 14.856025] dump_stack_lvl+0x4d/0x67 [ 14.856325] dump_stack+0x14/0x1a [ 14.856583] __schedule_bug.cold+0x4b/0x5c [ 14.856915] __schedule+0xe81/0x13d0 [ 14.857199] ? idr_find+0x13/0x20 [ 14.857456] ? get_work_pool+0x2d/0x50 [ 14.857756] ? __flush_work+0x233/0x280 [ 14.858068] ? __schedule+0xa95/0x13d0 [ 14.858307] ? idr_find+0x13/0x20 [ 14.858519] ? get_work_pool+0x2d/0x50 [ 14.858798] schedule+0x6c/0x100 [ 14.859009] schedule_hrtimeout_range_clock+0xff/0x110 [ 14.859335] ? tty_write_room+0x1f/0x30 [ 14.859598] ? n_tty_poll+0x1ec/0x220 [ 14.859830] ? tty_ldisc_deref+0x1a/0x20 [ 14.860090] schedule_hrtimeout_range+0x17/0x20 [ 14.860373] do_select+0x596/0x840 [ 14.860627] ? __kernel_text_address+0x16/0x50 [ 14.860954] ? poll_freewait+0xb0/0xb0 [ 14.861235] ? poll_freewait+0xb0/0xb0 [ 14.861517] ? rpm_resume+0x49d/0x780 [ 14.861798] ? common_interrupt+0x59/0xa0 [ 14.862127] ? asm_common_interrupt+0x2b/0x40 [ 14.862511] ? __uart_start.isra.0+0x61/0x70 [ 14.862902] ? __check_object_size+0x61/0x280 [ 14.863255] core_sys_select+0x1c6/0x400 [ 14.863575] ? vfs_write+0x1c9/0x3d0 [ 14.863853] ? vfs_write+0x1c9/0x3d0 [ 14.864121] ? _copy_from_user+0x45/0x70 [ 14.864526] do_pselect.constprop.0+0xb3/0xf0 [ 14.864893] ? do_syscall_64+0x6d/0x90 [ 14.865228] ? do_syscall_64+0x6d/0x90 [ 14.865556] __x64_sys_pselect6+0x76/0xa0 [ 14.865906] do_syscall_64+0x60/0x90 [ 14.866214] ? syscall_exit_to_user_mode+0x2a/0x50 [ 14.866640] ? do_syscall_64+0x6d/0x90 [ 14.866972] ? do_syscall_64+0x6d/0x90 [ 14.867286] ? do_syscall_64+0x6d/0x90 [ 14.867626] entry_SYSCALL_64_after_hwframe+0x63/0xcd [...] stripped [ 14.872959] </TASK> ('myfpga' is a simple 'uio_dmem_genirq' driver I wrote to test this) The implementation of "uio_dmem_genirq" was based on "uio_pdrv_genirq" and it is used in a similar manner to the "uio_pdrv_genirq" driver with respect to interrupt configuration and handling. At the time "uio_dmem_genirq" was introduced, both had the same implementation of the 'uio_info' handlers irqcontrol() and handler(). Then commit 34cb275 ("UIO: Fix concurrency issue"), which was only applied to "uio_pdrv_genirq", ended up making them a little different. That commit, among other things, changed disable_irq() to disable_irq_nosync() in the implementation of irqcontrol(). The motivation there was to avoid a deadlock between irqcontrol() and handler(), since it added a spinlock in the irq handler, and disable_irq() waits for the completion of the irq handler. By changing disable_irq() to disable_irq_nosync() in irqcontrol(), we also avoid the sleeping-while-atomic bug that commit b743512 ("uio: fix a sleep-in-atomic-context bug in uio_dmem_genirq_irqcontrol()") was trying to fix. Thus, this fixes the missing unlock in irqcontrol() by importing the implementation of irqcontrol() handler from the "uio_pdrv_genirq" driver. In the end, it reverts commit b743512 ("uio: fix a sleep-in-atomic-context bug in uio_dmem_genirq_irqcontrol()") and change disable_irq() to disable_irq_nosync(). It is worth noting that this still does not address the concurrency issue fixed by commit 34cb275 ("UIO: Fix concurrency issue"). It will be addressed separately in the next commits. Split out from commit 34cb275 ("UIO: Fix concurrency issue"). Fixes: b743512 ("uio: fix a sleep-in-atomic-context bug in uio_dmem_genirq_irqcontrol()") Signed-off-by: Rafael Mendonca <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Sasha Levin <[email protected]>

[ Upstream commit 4e264be ] When a system with E810 with existing VFs gets rebooted the following hang may be observed. Pid 1 is hung in iavf_remove(), part of a network driver: PID: 1 TASK: ffff965400e5a340 CPU: 24 COMMAND: "systemd-shutdow" #0 [ffffaad04005fa50] __schedule at ffffffff8b3239cb #1 [ffffaad04005fae8] schedule at ffffffff8b323e2d #2 [ffffaad04005fb00] schedule_hrtimeout_range_clock at ffffffff8b32cebc #3 [ffffaad04005fb80] usleep_range_state at ffffffff8b32c930 #4 [ffffaad04005fbb0] iavf_remove at ffffffffc12b9b4c [iavf] #5 [ffffaad04005fbf0] pci_device_remove at ffffffff8add7513 #6 [ffffaad04005fc10] device_release_driver_internal at ffffffff8af08baa #7 [ffffaad04005fc40] pci_stop_bus_device at ffffffff8adcc5fc #8 [ffffaad04005fc60] pci_stop_and_remove_bus_device at ffffffff8adcc81e #9 [ffffaad04005fc70] pci_iov_remove_virtfn at ffffffff8adf9429 #10 [ffffaad04005fca8] sriov_disable at ffffffff8adf98e4 #11 [ffffaad04005fcc8] ice_free_vfs at ffffffffc04bb2c8 [ice] #12 [ffffaad04005fd10] ice_remove at ffffffffc04778fe [ice] #13 [ffffaad04005fd38] ice_shutdown at ffffffffc0477946 [ice] #14 [ffffaad04005fd50] pci_device_shutdown at ffffffff8add58f1 #15 [ffffaad04005fd70] device_shutdown at ffffffff8af05386 #16 [ffffaad04005fd98] kernel_restart at ffffffff8a92a870 #17 [ffffaad04005fda8] __do_sys_reboot at ffffffff8a92abd6 #18 [ffffaad04005fee0] do_syscall_64 at ffffffff8b317159 #19 [ffffaad04005ff08] __context_tracking_enter at ffffffff8b31b6fc #20 [ffffaad04005ff18] syscall_exit_to_user_mode at ffffffff8b31b50d #21 [ffffaad04005ff28] do_syscall_64 at ffffffff8b317169 #22 [ffffaad04005ff50] entry_SYSCALL_64_after_hwframe at ffffffff8b40009b RIP: 00007f1baa5c13d7 RSP: 00007fffbcc55a98 RFLAGS: 00000202 RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f1baa5c13d7 RDX: 0000000001234567 RSI: 0000000028121969 RDI: 00000000fee1dead RBP: 00007fffbcc55ca0 R8: 0000000000000000 R9: 00007fffbcc54e90 R10: 00007fffbcc55050 R11: 0000000000000202 R12: 0000000000000005 R13: 0000000000000000 R14: 00007fffbcc55af0 R15: 0000000000000000 ORIG_RAX: 00000000000000a9 CS: 0033 SS: 002b During reboot all drivers PM shutdown callbacks are invoked. In iavf_shutdown() the adapter state is changed to __IAVF_REMOVE. In ice_shutdown() the call chain above is executed, which at some point calls iavf_remove(). However iavf_remove() expects the VF to be in one of the states __IAVF_RUNNING, __IAVF_DOWN or __IAVF_INIT_FAILED. If that's not the case it sleeps forever. So if iavf_shutdown() gets invoked before iavf_remove() the system will hang indefinitely because the adapter is already in state __IAVF_REMOVE. Fix this by returning from iavf_remove() if the state is __IAVF_REMOVE, as we already went through iavf_shutdown(). Fixes: 9745780 ("iavf: Add waiting so the port is initialized in remove") Fixes: a841733 ("iavf: Fix race condition between iavf_shutdown and iavf_remove") Reported-by: Marius Cornea <[email protected]> Signed-off-by: Stefan Assmann <[email protected]> Reviewed-by: Michal Kubiak <[email protected]> Tested-by: Rafal Romanowski <[email protected]> Signed-off-by: Tony Nguyen <[email protected]> Signed-off-by: Sasha Levin <[email protected]>

[ Upstream commit 4e264be ] When a system with E810 with existing VFs gets rebooted the following hang may be observed. Pid 1 is hung in iavf_remove(), part of a network driver: PID: 1 TASK: ffff965400e5a340 CPU: 24 COMMAND: "systemd-shutdow" #0 [ffffaad04005fa50] __schedule at ffffffff8b3239cb Freescale#1 [ffffaad04005fae8] schedule at ffffffff8b323e2d Freescale#2 [ffffaad04005fb00] schedule_hrtimeout_range_clock at ffffffff8b32cebc Freescale#3 [ffffaad04005fb80] usleep_range_state at ffffffff8b32c930 Freescale#4 [ffffaad04005fbb0] iavf_remove at ffffffffc12b9b4c [iavf] Freescale#5 [ffffaad04005fbf0] pci_device_remove at ffffffff8add7513 Freescale#6 [ffffaad04005fc10] device_release_driver_internal at ffffffff8af08baa Freescale#7 [ffffaad04005fc40] pci_stop_bus_device at ffffffff8adcc5fc Freescale#8 [ffffaad04005fc60] pci_stop_and_remove_bus_device at ffffffff8adcc81e Freescale#9 [ffffaad04005fc70] pci_iov_remove_virtfn at ffffffff8adf9429 Freescale#10 [ffffaad04005fca8] sriov_disable at ffffffff8adf98e4 Freescale#11 [ffffaad04005fcc8] ice_free_vfs at ffffffffc04bb2c8 [ice] Freescale#12 [ffffaad04005fd10] ice_remove at ffffffffc04778fe [ice] Freescale#13 [ffffaad04005fd38] ice_shutdown at ffffffffc0477946 [ice] Freescale#14 [ffffaad04005fd50] pci_device_shutdown at ffffffff8add58f1 Freescale#15 [ffffaad04005fd70] device_shutdown at ffffffff8af05386 Freescale#16 [ffffaad04005fd98] kernel_restart at ffffffff8a92a870 Freescale#17 [ffffaad04005fda8] __do_sys_reboot at ffffffff8a92abd6 Freescale#18 [ffffaad04005fee0] do_syscall_64 at ffffffff8b317159 Freescale#19 [ffffaad04005ff08] __context_tracking_enter at ffffffff8b31b6fc Freescale#20 [ffffaad04005ff18] syscall_exit_to_user_mode at ffffffff8b31b50d Freescale#21 [ffffaad04005ff28] do_syscall_64 at ffffffff8b317169 Freescale#22 [ffffaad04005ff50] entry_SYSCALL_64_after_hwframe at ffffffff8b40009b RIP: 00007f1baa5c13d7 RSP: 00007fffbcc55a98 RFLAGS: 00000202 RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f1baa5c13d7 RDX: 0000000001234567 RSI: 0000000028121969 RDI: 00000000fee1dead RBP: 00007fffbcc55ca0 R8: 0000000000000000 R9: 00007fffbcc54e90 R10: 00007fffbcc55050 R11: 0000000000000202 R12: 0000000000000005 R13: 0000000000000000 R14: 00007fffbcc55af0 R15: 0000000000000000 ORIG_RAX: 00000000000000a9 CS: 0033 SS: 002b During reboot all drivers PM shutdown callbacks are invoked. In iavf_shutdown() the adapter state is changed to __IAVF_REMOVE. In ice_shutdown() the call chain above is executed, which at some point calls iavf_remove(). However iavf_remove() expects the VF to be in one of the states __IAVF_RUNNING, __IAVF_DOWN or __IAVF_INIT_FAILED. If that's not the case it sleeps forever. So if iavf_shutdown() gets invoked before iavf_remove() the system will hang indefinitely because the adapter is already in state __IAVF_REMOVE. Fix this by returning from iavf_remove() if the state is __IAVF_REMOVE, as we already went through iavf_shutdown(). Fixes: 9745780 ("iavf: Add waiting so the port is initialized in remove") Fixes: a841733 ("iavf: Fix race condition between iavf_shutdown and iavf_remove") Reported-by: Marius Cornea <[email protected]> Signed-off-by: Stefan Assmann <[email protected]> Reviewed-by: Michal Kubiak <[email protected]> Tested-by: Rafal Romanowski <[email protected]> Signed-off-by: Tony Nguyen <[email protected]> Signed-off-by: Sasha Levin <[email protected]>

commit 0b0747d upstream. The following processes run into a deadlock. CPU 41 was waiting for CPU 29 to handle a CSD request while holding spinlock "crashdump_lock", but CPU 29 was hung by that spinlock with IRQs disabled. PID: 17360 TASK: ffff95c1090c5c40 CPU: 41 COMMAND: "mrdiagd" !# 0 [ffffb80edbf37b58] __read_once_size at ffffffff9b871a40 include/linux/compiler.h:185:0 !# 1 [ffffb80edbf37b58] atomic_read at ffffffff9b871a40 arch/x86/include/asm/atomic.h:27:0 !# 2 [ffffb80edbf37b58] dump_stack at ffffffff9b871a40 lib/dump_stack.c:54:0 # 3 [ffffb80edbf37b78] csd_lock_wait_toolong at ffffffff9b131ad5 kernel/smp.c:364:0 # 4 [ffffb80edbf37b78] __csd_lock_wait at ffffffff9b131ad5 kernel/smp.c:384:0 # 5 [ffffb80edbf37bf8] csd_lock_wait at ffffffff9b13267a kernel/smp.c:394:0 # 6 [ffffb80edbf37bf8] smp_call_function_many at ffffffff9b13267a kernel/smp.c:843:0 # 7 [ffffb80edbf37c50] smp_call_function at ffffffff9b13279d kernel/smp.c:867:0 # 8 [ffffb80edbf37c50] on_each_cpu at ffffffff9b13279d kernel/smp.c:976:0 # 9 [ffffb80edbf37c78] flush_tlb_kernel_range at ffffffff9b085c4b arch/x86/mm/tlb.c:742:0 Freescale#10 [ffffb80edbf37cb8] __purge_vmap_area_lazy at ffffffff9b23a1e0 mm/vmalloc.c:701:0 Freescale#11 [ffffb80edbf37ce0] try_purge_vmap_area_lazy at ffffffff9b23a2cc mm/vmalloc.c:722:0 Freescale#12 [ffffb80edbf37ce0] free_vmap_area_noflush at ffffffff9b23a2cc mm/vmalloc.c:754:0 Freescale#13 [ffffb80edbf37cf8] free_unmap_vmap_area at ffffffff9b23bb3b mm/vmalloc.c:764:0 Freescale#14 [ffffb80edbf37cf8] remove_vm_area at ffffffff9b23bb3b mm/vmalloc.c:1509:0 Freescale#15 [ffffb80edbf37d18] __vunmap at ffffffff9b23bb8a mm/vmalloc.c:1537:0 Freescale#16 [ffffb80edbf37d40] vfree at ffffffff9b23bc85 mm/vmalloc.c:1612:0 Freescale#17 [ffffb80edbf37d58] megasas_free_host_crash_buffer [megaraid_sas] at ffffffffc020b7f2 drivers/scsi/megaraid/megaraid_sas_fusion.c:3932:0 Freescale#18 [ffffb80edbf37d80] fw_crash_state_store [megaraid_sas] at ffffffffc01f804d drivers/scsi/megaraid/megaraid_sas_base.c:3291:0 Freescale#19 [ffffb80edbf37dc0] dev_attr_store at ffffffff9b56dd7b drivers/base/core.c:758:0 Freescale#20 [ffffb80edbf37dd0] sysfs_kf_write at ffffffff9b326acf fs/sysfs/file.c:144:0 Freescale#21 [ffffb80edbf37de0] kernfs_fop_write at ffffffff9b325fd4 fs/kernfs/file.c:316:0 Freescale#22 [ffffb80edbf37e20] __vfs_write at ffffffff9b29418a fs/read_write.c:480:0 Freescale#23 [ffffb80edbf37ea8] vfs_write at ffffffff9b294462 fs/read_write.c:544:0 Freescale#24 [ffffb80edbf37ee8] SYSC_write at ffffffff9b2946ec fs/read_write.c:590:0 Freescale#25 [ffffb80edbf37ee8] SyS_write at ffffffff9b2946ec fs/read_write.c:582:0 Freescale#26 [ffffb80edbf37f30] do_syscall_64 at ffffffff9b003ca9 arch/x86/entry/common.c:298:0 Freescale#27 [ffffb80edbf37f58] entry_SYSCALL_64 at ffffffff9ba001b1 arch/x86/entry/entry_64.S:238:0 PID: 17355 TASK: ffff95c1090c3d80 CPU: 29 COMMAND: "mrdiagd" !# 0 [ffffb80f2d3c7d30] __read_once_size at ffffffff9b0f2ab0 include/linux/compiler.h:185:0 !# 1 [ffffb80f2d3c7d30] native_queued_spin_lock_slowpath at ffffffff9b0f2ab0 kernel/locking/qspinlock.c:368:0 # 2 [ffffb80f2d3c7d58] pv_queued_spin_lock_slowpath at ffffffff9b0f244b arch/x86/include/asm/paravirt.h:674:0 # 3 [ffffb80f2d3c7d58] queued_spin_lock_slowpath at ffffffff9b0f244b arch/x86/include/asm/qspinlock.h:53:0 # 4 [ffffb80f2d3c7d68] queued_spin_lock at ffffffff9b8961a6 include/asm-generic/qspinlock.h:90:0 # 5 [ffffb80f2d3c7d68] do_raw_spin_lock_flags at ffffffff9b8961a6 include/linux/spinlock.h:173:0 # 6 [ffffb80f2d3c7d68] __raw_spin_lock_irqsave at ffffffff9b8961a6 include/linux/spinlock_api_smp.h:122:0 # 7 [ffffb80f2d3c7d68] _raw_spin_lock_irqsave at ffffffff9b8961a6 kernel/locking/spinlock.c:160:0 # 8 [ffffb80f2d3c7d88] fw_crash_buffer_store [megaraid_sas] at ffffffffc01f8129 drivers/scsi/megaraid/megaraid_sas_base.c:3205:0 # 9 [ffffb80f2d3c7dc0] dev_attr_store at ffffffff9b56dd7b drivers/base/core.c:758:0 Freescale#10 [ffffb80f2d3c7dd0] sysfs_kf_write at ffffffff9b326acf fs/sysfs/file.c:144:0 Freescale#11 [ffffb80f2d3c7de0] kernfs_fop_write at ffffffff9b325fd4 fs/kernfs/file.c:316:0 Freescale#12 [ffffb80f2d3c7e20] __vfs_write at ffffffff9b29418a fs/read_write.c:480:0 Freescale#13 [ffffb80f2d3c7ea8] vfs_write at ffffffff9b294462 fs/read_write.c:544:0 Freescale#14 [ffffb80f2d3c7ee8] SYSC_write at ffffffff9b2946ec fs/read_write.c:590:0 Freescale#15 [ffffb80f2d3c7ee8] SyS_write at ffffffff9b2946ec fs/read_write.c:582:0 Freescale#16 [ffffb80f2d3c7f30] do_syscall_64 at ffffffff9b003ca9 arch/x86/entry/common.c:298:0 Freescale#17 [ffffb80f2d3c7f58] entry_SYSCALL_64 at ffffffff9ba001b1 arch/x86/entry/entry_64.S:238:0 The lock is used to synchronize different sysfs operations, it doesn't protect any resource that will be touched by an interrupt. Consequently it's not required to disable IRQs. Replace the spinlock with a mutex to fix the deadlock. Signed-off-by: Junxiao Bi <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Mike Christie <[email protected]> Cc: [email protected] Signed-off-by: Martin K. Petersen <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>

[ Upstream commit e3e82fc ] When creating ceq_0 during probing irdma, cqp.sc_cqp will be sent as a cqp_request to cqp->sc_cqp.sq_ring. If the request is pending when removing the irdma driver or unplugging its aux device, cqp.sc_cqp will be dereferenced as wrong struct in irdma_free_pending_cqp_request(). PID: 3669 TASK: ffff88aef892c000 CPU: 28 COMMAND: "kworker/28:0" #0 [fffffe0000549e38] crash_nmi_callback at ffffffff810e3a34 Freescale#1 [fffffe0000549e40] nmi_handle at ffffffff810788b2 Freescale#2 [fffffe0000549ea0] default_do_nmi at ffffffff8107938f Freescale#3 [fffffe0000549eb8] do_nmi at ffffffff81079582 Freescale#4 [fffffe0000549ef0] end_repeat_nmi at ffffffff82e016b4 [exception RIP: native_queued_spin_lock_slowpath+1291] RIP: ffffffff8127e72b RSP: ffff88aa841ef778 RFLAGS: 00000046 RAX: 0000000000000000 RBX: ffff88b01f849700 RCX: ffffffff8127e47e RDX: 0000000000000000 RSI: 0000000000000004 RDI: ffffffff83857ec0 RBP: ffff88afe3e4efc8 R8: ffffed15fc7c9dfa R9: ffffed15fc7c9dfa R10: 0000000000000001 R11: ffffed15fc7c9df9 R12: 0000000000740000 R13: ffff88b01f849708 R14: 0000000000000003 R15: ffffed1603f092e1 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0000 -- <NMI exception stack> -- Freescale#5 [ffff88aa841ef778] native_queued_spin_lock_slowpath at ffffffff8127e72b Freescale#6 [ffff88aa841ef7b0] _raw_spin_lock_irqsave at ffffffff82c22aa4 Freescale#7 [ffff88aa841ef7c8] __wake_up_common_lock at ffffffff81257363 Freescale#8 [ffff88aa841ef888] irdma_free_pending_cqp_request at ffffffffa0ba12cc [irdma] Freescale#9 [ffff88aa841ef958] irdma_cleanup_pending_cqp_op at ffffffffa0ba1469 [irdma] Freescale#10 [ffff88aa841ef9c0] irdma_ctrl_deinit_hw at ffffffffa0b2989f [irdma] Freescale#11 [ffff88aa841efa28] irdma_remove at ffffffffa0b252df [irdma] Freescale#12 [ffff88aa841efae8] auxiliary_bus_remove at ffffffff8219afdb Freescale#13 [ffff88aa841efb00] device_release_driver_internal at ffffffff821882e6 Freescale#14 [ffff88aa841efb38] bus_remove_device at ffffffff82184278 Freescale#15 [ffff88aa841efb88] device_del at ffffffff82179d23 Freescale#16 [ffff88aa841efc48] ice_unplug_aux_dev at ffffffffa0eb1c14 [ice] Freescale#17 [ffff88aa841efc68] ice_service_task at ffffffffa0d88201 [ice] Freescale#18 [ffff88aa841efde8] process_one_work at ffffffff811c589a Freescale#19 [ffff88aa841efe60] worker_thread at ffffffff811c71ff Freescale#20 [ffff88aa841eff10] kthread at ffffffff811d87a0 Freescale#21 [ffff88aa841eff50] ret_from_fork at ffffffff82e0022f Fixes: 44d9e52 ("RDMA/irdma: Implement device initialization definitions") Link: https://lore.kernel.org/r/[email protected] Suggested-by: "Ismail, Mustafa" <[email protected]> Signed-off-by: Shifeng Li <[email protected]> Reviewed-by: Shiraz Saleem <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]> Signed-off-by: Sasha Levin <[email protected]>

[ Upstream commit 769e6a1 ] ui_browser__show() is capturing the input title that is stack allocated memory in hist_browser__run(). Avoid a use after return by strdup-ing the string. Committer notes: Further explanation from Ian Rogers: My command line using tui is: $ sudo bash -c 'rm /tmp/asan.log*; export ASAN_OPTIONS="log_path=/tmp/asan.log"; /tmp/perf/perf mem record -a sleep 1; /tmp/perf/perf mem report' I then go to the perf annotate view and quit. This triggers the asan error (from the log file): ``` ==1254591==ERROR: AddressSanitizer: stack-use-after-return on address 0x7f2813331920 at pc 0x7f28180 65991 bp 0x7fff0a21c750 sp 0x7fff0a21bf10 READ of size 80 at 0x7f2813331920 thread T0 #0 0x7f2818065990 in __interceptor_strlen ../../../../src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:461 #1 0x7f2817698251 in SLsmg_write_wrapped_string (/lib/x86_64-linux-gnu/libslang.so.2+0x98251) #2 0x7f28176984b9 in SLsmg_write_nstring (/lib/x86_64-linux-gnu/libslang.so.2+0x984b9) #3 0x55c94045b365 in ui_browser__write_nstring ui/browser.c:60 #4 0x55c94045c558 in __ui_browser__show_title ui/browser.c:266 #5 0x55c94045c776 in ui_browser__show ui/browser.c:288 #6 0x55c94045c06d in ui_browser__handle_resize ui/browser.c:206 #7 0x55c94047979b in do_annotate ui/browsers/hists.c:2458 #8 0x55c94047fb17 in evsel__hists_browse ui/browsers/hists.c:3412 #9 0x55c940480a0c in perf_evsel_menu__run ui/browsers/hists.c:3527 #10 0x55c940481108 in __evlist__tui_browse_hists ui/browsers/hists.c:3613 #11 0x55c9404813f7 in evlist__tui_browse_hists ui/browsers/hists.c:3661 #12 0x55c93ffa253f in report__browse_hists tools/perf/builtin-report.c:671 #13 0x55c93ffa58ca in __cmd_report tools/perf/builtin-report.c:1141 #14 0x55c93ffaf159 in cmd_report tools/perf/builtin-report.c:1805 #15 0x55c94000c05c in report_events tools/perf/builtin-mem.c:374 #16 0x55c94000d96d in cmd_mem tools/perf/builtin-mem.c:516 #17 0x55c9400e44ee in run_builtin tools/perf/perf.c:350 #18 0x55c9400e4a5a in handle_internal_command tools/perf/perf.c:403 #19 0x55c9400e4e22 in run_argv tools/perf/perf.c:447 #20 0x55c9400e53ad in main tools/perf/perf.c:561 #21 0x7f28170456c9 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58 #22 0x7f2817045784 in __libc_start_main_impl ../csu/libc-start.c:360 #23 0x55c93ff544c0 in _start (/tmp/perf/perf+0x19a4c0) (BuildId: 84899b0e8c7d3a3eaa67b2eb35e3d8b2f8cd4c93) Address 0x7f2813331920 is located in stack of thread T0 at offset 32 in frame #0 0x55c94046e85e in hist_browser__run ui/browsers/hists.c:746 This frame has 1 object(s): [32, 192) 'title' (line 747) <== Memory access at offset 32 is inside this variable HINT: this may be a false positive if your program uses some custom stack unwind mechanism, swapcontext or vfork ``` hist_browser__run isn't on the stack so the asan error looks legit. There's no clean init/exit on struct ui_browser so I may be trading a use-after-return for a memory leak, but that seems look a good trade anyway. Fixes: 05e8b08 ("perf ui browser: Stop using 'self'") Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Athira Rajeev <[email protected]> Cc: Ben Gainey <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: K Prateek Nayak <[email protected]> Cc: Li Dong <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Oliver Upton <[email protected]> Cc: Paran Lee <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Sun Haiyong <[email protected]> Cc: Tim Chen <[email protected]> Cc: Yanteng Si <[email protected]> Cc: Yicong Yang <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Sasha Levin <[email protected]>

commit be346c1 upstream. The code in ocfs2_dio_end_io_write() estimates number of necessary transaction credits using ocfs2_calc_extend_credits(). This however does not take into account that the IO could be arbitrarily large and can contain arbitrary number of extents. Extent tree manipulations do often extend the current transaction but not in all of the cases. For example if we have only single block extents in the tree, ocfs2_mark_extent_written() will end up calling ocfs2_replace_extent_rec() all the time and we will never extend the current transaction and eventually exhaust all the transaction credits if the IO contains many single block extents. Once that happens a WARN_ON(jbd2_handle_buffer_credits(handle) <= 0) is triggered in jbd2_journal_dirty_metadata() and subsequently OCFS2 aborts in response to this error. This was actually triggered by one of our customers on a heavily fragmented OCFS2 filesystem. To fix the issue make sure the transaction always has enough credits for one extent insert before each call of ocfs2_mark_extent_written(). Heming Zhao said: ------ PANIC: "Kernel panic - not syncing: OCFS2: (device dm-1): panic forced after error" PID: xxx TASK: xxxx CPU: 5 COMMAND: "SubmitThread-CA" #0 machine_kexec at ffffffff8c069932 Freescale#1 __crash_kexec at ffffffff8c1338fa Freescale#2 panic at ffffffff8c1d69b9 Freescale#3 ocfs2_handle_error at ffffffffc0c86c0c [ocfs2] Freescale#4 __ocfs2_abort at ffffffffc0c88387 [ocfs2] Freescale#5 ocfs2_journal_dirty at ffffffffc0c51e98 [ocfs2] Freescale#6 ocfs2_split_extent at ffffffffc0c27ea3 [ocfs2] Freescale#7 ocfs2_change_extent_flag at ffffffffc0c28053 [ocfs2] Freescale#8 ocfs2_mark_extent_written at ffffffffc0c28347 [ocfs2] Freescale#9 ocfs2_dio_end_io_write at ffffffffc0c2bef9 [ocfs2] Freescale#10 ocfs2_dio_end_io at ffffffffc0c2c0f5 [ocfs2] Freescale#11 dio_complete at ffffffff8c2b9fa7 Freescale#12 do_blockdev_direct_IO at ffffffff8c2bc09f Freescale#13 ocfs2_direct_IO at ffffffffc0c2b653 [ocfs2] Freescale#14 generic_file_direct_write at ffffffff8c1dcf14 Freescale#15 __generic_file_write_iter at ffffffff8c1dd07b Freescale#16 ocfs2_file_write_iter at ffffffffc0c49f1f [ocfs2] Freescale#17 aio_write at ffffffff8c2cc72e Freescale#18 kmem_cache_alloc at ffffffff8c248dde Freescale#19 do_io_submit at ffffffff8c2ccada Freescale#20 do_syscall_64 at ffffffff8c004984 Freescale#21 entry_SYSCALL_64_after_hwframe at ffffffff8c8000ba Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Fixes: c15471f ("ocfs2: fix sparse file & data ordering issue in direct io") Signed-off-by: Jan Kara <[email protected]> Reviewed-by: Joseph Qi <[email protected]> Reviewed-by: Heming Zhao <[email protected]> Cc: Mark Fasheh <[email protected]> Cc: Joel Becker <[email protected]> Cc: Junxiao Bi <[email protected]> Cc: Changwei Ge <[email protected]> Cc: Gang He <[email protected]> Cc: Jun Piao <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>

…s_lock [ Upstream commit be26ba9 ] For storing a value to a queue attribute, the queue_attr_store function first freezes the queue (->q_usage_counter(io)) and then acquire ->sysfs_lock. This seems not correct as the usual ordering should be to acquire ->sysfs_lock before freezing the queue. This incorrect ordering causes the following lockdep splat which we are able to reproduce always simply by accessing /sys/kernel/debug file using ls command: [ 57.597146] WARNING: possible circular locking dependency detected [ 57.597154] 6.12.0-10553-gb86545e02e8c #20 Tainted: G W [ 57.597162] ------------------------------------------------------ [ 57.597168] ls/4605 is trying to acquire lock: [ 57.597176] c00000003eb56710 (&mm->mmap_lock){++++}-{4:4}, at: __might_fault+0x58/0xc0 [ 57.597200] but task is already holding lock: [ 57.597207] c0000018e27c6810 (&sb->s_type->i_mutex_key#3){++++}-{4:4}, at: iterate_dir+0x94/0x1d4 [ 57.597226] which lock already depends on the new lock. [ 57.597233] the existing dependency chain (in reverse order) is: [ 57.597241] -> #5 (&sb->s_type->i_mutex_key#3){++++}-{4:4}: [ 57.597255] down_write+0x6c/0x18c [ 57.597264] start_creating+0xb4/0x24c [ 57.597274] debugfs_create_dir+0x2c/0x1e8 [ 57.597283] blk_register_queue+0xec/0x294 [ 57.597292] add_disk_fwnode+0x2e4/0x548 [ 57.597302] brd_alloc+0x2c8/0x338 [ 57.597309] brd_init+0x100/0x178 [ 57.597317] do_one_initcall+0x88/0x3e4 [ 57.597326] kernel_init_freeable+0x3cc/0x6e0 [ 57.597334] kernel_init+0x34/0x1cc [ 57.597342] ret_from_kernel_user_thread+0x14/0x1c [ 57.597350] -> #4 (&q->debugfs_mutex){+.+.}-{4:4}: [ 57.597362] __mutex_lock+0xfc/0x12a0 [ 57.597370] blk_register_queue+0xd4/0x294 [ 57.597379] add_disk_fwnode+0x2e4/0x548 [ 57.597388] brd_alloc+0x2c8/0x338 [ 57.597395] brd_init+0x100/0x178 [ 57.597402] do_one_initcall+0x88/0x3e4 [ 57.597410] kernel_init_freeable+0x3cc/0x6e0 [ 57.597418] kernel_init+0x34/0x1cc [ 57.597426] ret_from_kernel_user_thread+0x14/0x1c [ 57.597434] -> #3 (&q->sysfs_lock){+.+.}-{4:4}: [ 57.597446] __mutex_lock+0xfc/0x12a0 [ 57.597454] queue_attr_store+0x9c/0x110 [ 57.597462] sysfs_kf_write+0x70/0xb0 [ 57.597471] kernfs_fop_write_iter+0x1b0/0x2ac [ 57.597480] vfs_write+0x3dc/0x6e8 [ 57.597488] ksys_write+0x84/0x140 [ 57.597495] system_call_exception+0x130/0x360 [ 57.597504] system_call_common+0x160/0x2c4 [ 57.597516] -> #2 (&q->q_usage_counter(io)#21){++++}-{0:0}: [ 57.597530] __submit_bio+0x5ec/0x828 [ 57.597538] submit_bio_noacct_nocheck+0x1e4/0x4f0 [ 57.597547] iomap_readahead+0x2a0/0x448 [ 57.597556] xfs_vm_readahead+0x28/0x3c [ 57.597564] read_pages+0x88/0x41c [ 57.597571] page_cache_ra_unbounded+0x1ac/0x2d8 [ 57.597580] filemap_get_pages+0x188/0x984 [ 57.597588] filemap_read+0x13c/0x4bc [ 57.597596] xfs_file_buffered_read+0x88/0x17c [ 57.597605] xfs_file_read_iter+0xac/0x158 [ 57.597614] vfs_read+0x2d4/0x3b4 [ 57.597622] ksys_read+0x84/0x144 [ 57.597629] system_call_exception+0x130/0x360 [ 57.597637] system_call_common+0x160/0x2c4 [ 57.597647] -> #1 (mapping.invalidate_lock#2){++++}-{4:4}: [ 57.597661] down_read+0x6c/0x220 [ 57.597669] filemap_fault+0x870/0x100c [ 57.597677] xfs_filemap_fault+0xc4/0x18c [ 57.597684] __do_fault+0x64/0x164 [ 57.597693] __handle_mm_fault+0x1274/0x1dac [ 57.597702] handle_mm_fault+0x248/0x484 [ 57.597711] ___do_page_fault+0x428/0xc0c [ 57.597719] hash__do_page_fault+0x30/0x68 [ 57.597727] do_hash_fault+0x90/0x35c [ 57.597736] data_access_common_virt+0x210/0x220 [ 57.597745] _copy_from_user+0xf8/0x19c [ 57.597754] sel_write_load+0x178/0xd54 [ 57.597762] vfs_write+0x108/0x6e8 [ 57.597769] ksys_write+0x84/0x140 [ 57.597777] system_call_exception+0x130/0x360 [ 57.597785] system_call_common+0x160/0x2c4 [ 57.597794] -> #0 (&mm->mmap_lock){++++}-{4:4}: [ 57.597806] __lock_acquire+0x17cc/0x2330 [ 57.597814] lock_acquire+0x138/0x400 [ 57.597822] __might_fault+0x7c/0xc0 [ 57.597830] filldir64+0xe8/0x390 [ 57.597839] dcache_readdir+0x80/0x2d4 [ 57.597846] iterate_dir+0xd8/0x1d4 [ 57.597855] sys_getdents64+0x88/0x2d4 [ 57.597864] system_call_exception+0x130/0x360 [ 57.597872] system_call_common+0x160/0x2c4 [ 57.597881] other info that might help us debug this: [ 57.597888] Chain exists of: &mm->mmap_lock --> &q->debugfs_mutex --> &sb->s_type->i_mutex_key#3 [ 57.597905] Possible unsafe locking scenario: [ 57.597911] CPU0 CPU1 [ 57.597917] ---- ---- [ 57.597922] rlock(&sb->s_type->i_mutex_key#3); [ 57.597932] lock(&q->debugfs_mutex); [ 57.597940] lock(&sb->s_type->i_mutex_key#3); [ 57.597950] rlock(&mm->mmap_lock); [ 57.597958] *** DEADLOCK *** [ 57.597965] 2 locks held by ls/4605: [ 57.597971] #0: c0000000137c12f8 (&f->f_pos_lock){+.+.}-{4:4}, at: fdget_pos+0xcc/0x154 [ 57.597989] #1: c0000018e27c6810 (&sb->s_type->i_mutex_key#3){++++}-{4:4}, at: iterate_dir+0x94/0x1d4 Prevent the above lockdep warning by acquiring ->sysfs_lock before freezing the queue while storing a queue attribute in queue_attr_store function. Later, we also found[1] another function __blk_mq_update_nr_ hw_queues where we first freeze queue and then acquire the ->sysfs_lock. So we've also updated lock ordering in __blk_mq_update_nr_hw_queues function and ensured that in all code paths we follow the correct lock ordering i.e. acquire ->sysfs_lock before freezing the queue. [1] https://lore.kernel.org/all/CAFj5m9Ke8+EHKQBs_Nk6hqd=LGXtk4mUxZUN5==ZcCjnZSBwHw@mail.gmail.com/ Reported-by: [email protected] Fixes: af28141 ("block: freeze the queue in queue_attr_store") Tested-by: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Signed-off-by: Nilay Shroff <[email protected]> Reviewed-by: Ming Lei <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]> Signed-off-by: Sasha Levin <[email protected]>

Fabien Lahoudere and others added 30 commits November 21, 2017 09:23

Revert "crypto: xts - Add ECB dependency"

d8ce2b0

This reverts commit 6145171. The commit fixes a bug that was only introduced in 4.10, thus is irrelevant for <=4.9. Signed-off-by: Sasha Levin <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

4.9-1.0.x-imx stable 4.9.67 merge #21

4.9-1.0.x-imx stable 4.9.67 merge #21

agners commented Dec 13, 2017

4.9-1.0.x-imx stable 4.9.67 merge #21

4.9-1.0.x-imx stable 4.9.67 merge #21

Conversation

agners commented Dec 13, 2017