perf menuconfig #1

mitake · 2012-06-30T14:47:16Z

menuconfig for perf tools

commit 5d299f3 (net: ipv6: fix TCP early demux) added a regression for ipv6_mapped case. [ 67.422369] SELinux: initialized (dev autofs, type autofs), uses genfs_contexts [ 67.449678] SELinux: initialized (dev autofs, type autofs), uses genfs_contexts [ 92.631060] BUG: unable to handle kernel NULL pointer dereference at (null) [ 92.631435] IP: [< (null)>] (null) [ 92.631645] PGD 0 [ 92.631846] Oops: 0010 [#1] SMP [ 92.632095] Modules linked in: autofs4 sunrpc ipv6 dm_mirror dm_region_hash dm_log dm_multipath dm_mod video sbs sbshc battery ac lp parport sg snd_hda_intel snd_hda_codec snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device pcspkr snd_pcm_oss snd_mixer_oss snd_pcm snd_timer serio_raw button floppy snd i2c_i801 i2c_core soundcore snd_page_alloc shpchp ide_cd_mod cdrom microcode ehci_hcd ohci_hcd uhci_hcd [ 92.634294] CPU 0 [ 92.634294] Pid: 4469, comm: sendmail Not tainted 3.6.0-rc1 #3 [ 92.634294] RIP: 0010:[<0000000000000000>] [< (null)>] (null) [ 92.634294] RSP: 0018:ffff880245fc7cb0 EFLAGS: 00010282 [ 92.634294] RAX: ffffffffa01985f0 RBX: ffff88024827ad00 RCX: 0000000000000000 [ 92.634294] RDX: 0000000000000218 RSI: ffff880254735380 RDI: ffff88024827ad00 [ 92.634294] RBP: ffff880245fc7cc8 R08: 0000000000000001 R09: 0000000000000000 [ 92.634294] R10: 0000000000000000 R11: ffff880245fc7bf8 R12: ffff880254735380 [ 92.634294] R13: ffff880254735380 R14: 0000000000000000 R15: 7fffffffffff0218 [ 92.634294] FS: 00007f4516ccd6f0(0000) GS:ffff880256600000(0000) knlGS:0000000000000000 [ 92.634294] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [ 92.634294] CR2: 0000000000000000 CR3: 0000000245ed1000 CR4: 00000000000007f0 [ 92.634294] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 92.634294] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [ 92.634294] Process sendmail (pid: 4469, threadinfo ffff880245fc6000, task ffff880254b8cac0) [ 92.634294] Stack: [ 92.634294] ffffffff813837a7 ffff88024827ad00 ffff880254b6b0e8 ffff880245fc7d68 [ 92.634294] ffffffff81385083 00000000001d2680 ffff8802547353a8 ffff880245fc7d18 [ 92.634294] ffffffff8105903a ffff88024827ad60 0000000000000002 00000000000000ff [ 92.634294] Call Trace: [ 92.634294] [<ffffffff813837a7>] ? tcp_finish_connect+0x2c/0xfa [ 92.634294] [<ffffffff81385083>] tcp_rcv_state_process+0x2b6/0x9c6 [ 92.634294] [<ffffffff8105903a>] ? sched_clock_cpu+0xc3/0xd1 [ 92.634294] [<ffffffff81059073>] ? local_clock+0x2b/0x3c [ 92.634294] [<ffffffff8138caf3>] tcp_v4_do_rcv+0x63a/0x670 [ 92.634294] [<ffffffff8133278e>] release_sock+0x128/0x1bd [ 92.634294] [<ffffffff8139f060>] __inet_stream_connect+0x1b1/0x352 [ 92.634294] [<ffffffff813325f5>] ? lock_sock_nested+0x74/0x7f [ 92.634294] [<ffffffff8104b333>] ? wake_up_bit+0x25/0x25 [ 92.634294] [<ffffffff813325f5>] ? lock_sock_nested+0x74/0x7f [ 92.634294] [<ffffffff8139f223>] ? inet_stream_connect+0x22/0x4b [ 92.634294] [<ffffffff8139f234>] inet_stream_connect+0x33/0x4b [ 92.634294] [<ffffffff8132e8cf>] sys_connect+0x78/0x9e [ 92.634294] [<ffffffff813fd407>] ? sysret_check+0x1b/0x56 [ 92.634294] [<ffffffff81088503>] ? __audit_syscall_entry+0x195/0x1c8 [ 92.634294] [<ffffffff811cc26e>] ? trace_hardirqs_on_thunk+0x3a/0x3f [ 92.634294] [<ffffffff813fd3e2>] system_call_fastpath+0x16/0x1b [ 92.634294] Code: Bad RIP value. [ 92.634294] RIP [< (null)>] (null) [ 92.634294] RSP <ffff880245fc7cb0> [ 92.634294] CR2: 0000000000000000 [ 92.648982] ---[ end trace 24e2bed94314c8d9 ]--- [ 92.649146] Kernel panic - not syncing: Fatal exception in interrupt Fix this using inet_sk_rx_dst_set(), and export this function in case IPv6 is modular. Reported-by: Andrew Morton <[email protected]> Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>

Doing port specific cleanup in the .port_remove hook is a lot simpler and safer than doing it in the USB driver .release or .disconnect methods. The removal of the port from the usb-serial bus will happen before the USB driver cleanup, so we must be careful about accessing port specific driver data from any USB driver functions. This problem surfaced after the commit 0998d06 device-core: Ensure drvdata = NULL when no driver is bound which turned the previous unsafe access into a reliable NULL pointer dereference. Fixes the following Oops: [ 243.148471] BUG: unable to handle kernel NULL pointer dereference at (null) [ 243.148508] IP: [<ffffffffa0468527>] stop_read_write_urbs+0x37/0x80 [usb_wwan] [ 243.148556] PGD 79d60067 PUD 79d61067 PMD 0 [ 243.148590] Oops: 0000 [#1] SMP [ 243.148617] Modules linked in: sr_mod cdrom qmi_wwan usbnet option cdc_wdm usb_wwan usbserial usb_storage uas fuse af_packet ip6table_filter ip6_tables iptable_filter ip_tables x_tables tun edd cpufreq_conservative cpufreq_userspace cpufreq_powersave snd_pcm_oss snd_mixer_oss acpi_cpufreq snd_seq mperf snd_seq_device coretemp arc4 sg hp_wmi sparse_keymap uvcvideo videobuf2_core videodev videobuf2_vmalloc videobuf2_memops rtl8192ce rtl8192c_common rtlwifi joydev pcspkr microcode mac80211 i2c_i801 lpc_ich r8169 snd_hda_codec_idt cfg80211 snd_hda_intel snd_hda_codec rfkill snd_hwdep snd_pcm wmi snd_timer ac snd soundcore snd_page_alloc battery uhci_hcd i915 drm_kms_helper drm i2c_algo_bit ehci_hcd thermal usbcore video usb_common button processor thermal_sys [ 243.149007] CPU 1 [ 243.149027] Pid: 135, comm: khubd Not tainted 3.5.0-rc7-next-20120720-1-vanilla #1 Hewlett-Packard HP Mini 110-3700 /1584 [ 243.149072] RIP: 0010:[<ffffffffa0468527>] [<ffffffffa0468527>] stop_read_write_urbs+0x37/0x80 [usb_wwan] [ 243.149118] RSP: 0018:ffff880037e75b30 EFLAGS: 00010286 [ 243.149133] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff88005912aa28 [ 243.149150] RDX: ffff88005e95f028 RSI: 0000000000000000 RDI: ffff88005f7c1a10 [ 243.149166] RBP: ffff880037e75b60 R08: 0000000000000000 R09: ffffffff812cea90 [ 243.149182] R10: 0000000000000000 R11: 0000000000000001 R12: ffff88006539b440 [ 243.149198] R13: ffff88006539b440 R14: 0000000000000000 R15: 0000000000000000 [ 243.149216] FS: 0000000000000000(0000) GS:ffff88007ee80000(0000) knlGS:0000000000000000 [ 243.149233] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [ 243.149248] CR2: 0000000000000000 CR3: 0000000079fe0000 CR4: 00000000000007e0 [ 243.149264] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 243.149280] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [ 243.149298] Process khubd (pid: 135, threadinfo ffff880037e74000, task ffff880037d40600) [ 243.149313] Stack: [ 243.149323] ffff880037e75b40 ffff88006539b440 ffff8800799bc830 ffff88005f7c1800 [ 243.149348] 0000000000000001 ffff88006539b448 ffff880037e75b70 ffffffffa04685e9 [ 243.149371] ffff880037e75bc0 ffffffffa0473765 ffff880037354988 ffff88007b594800 [ 243.149395] Call Trace: [ 243.149419] [<ffffffffa04685e9>] usb_wwan_disconnect+0x9/0x10 [usb_wwan] [ 243.149447] [<ffffffffa0473765>] usb_serial_disconnect+0xd5/0x120 [usbserial] [ 243.149511] [<ffffffffa0046b48>] usb_unbind_interface+0x58/0x1a0 [usbcore] [ 243.149545] [<ffffffff8139ebd7>] __device_release_driver+0x77/0xe0 [ 243.149567] [<ffffffff8139ec67>] device_release_driver+0x27/0x40 [ 243.149587] [<ffffffff8139e5cf>] bus_remove_device+0xdf/0x150 [ 243.149608] [<ffffffff8139bc78>] device_del+0x118/0x1a0 [ 243.149661] [<ffffffffa0044590>] usb_disable_device+0xb0/0x280 [usbcore] [ 243.149718] [<ffffffffa003c6fd>] usb_disconnect+0x9d/0x140 [usbcore] [ 243.149770] [<ffffffffa003da7d>] hub_port_connect_change+0xad/0x8a0 [usbcore] [ 243.149825] [<ffffffffa0043bf5>] ? usb_control_msg+0xe5/0x110 [usbcore] [ 243.149878] [<ffffffffa003e6e3>] hub_events+0x473/0x760 [usbcore] [ 243.149931] [<ffffffffa003ea05>] hub_thread+0x35/0x1d0 [usbcore] [ 243.149955] [<ffffffff81061960>] ? add_wait_queue+0x60/0x60 [ 243.150004] [<ffffffffa003e9d0>] ? hub_events+0x760/0x760 [usbcore] [ 243.150026] [<ffffffff8106133e>] kthread+0x8e/0xa0 [ 243.150047] [<ffffffff8157ec04>] kernel_thread_helper+0x4/0x10 [ 243.150068] [<ffffffff810612b0>] ? flush_kthread_work+0x120/0x120 [ 243.150088] [<ffffffff8157ec00>] ? gs_change+0xb/0xb [ 243.150101] Code: fd 41 54 53 48 83 ec 08 80 7f 1a 00 74 57 49 89 fc 31 db 90 49 8b 7c 24 20 45 31 f6 48 81 c7 10 02 00 00 e8 bc 64 f3 e0 49 89 c7 <4b> 8b 3c 37 49 83 c6 08 e8 4c a5 bd ff 49 83 fe 20 75 ed 45 30 [ 243.150257] RIP [<ffffffffa0468527>] stop_read_write_urbs+0x37/0x80 [usb_wwan] [ 243.150282] RSP <ffff880037e75b30> [ 243.150294] CR2: 0000000000000000 [ 243.177170] ---[ end trace fba433d9015ffb8c ]--- Reported-by: Dan Carpenter <[email protected]> Reported-by: Thomas Schäfer <[email protected]> Suggested-by: Alan Stern <[email protected]> Signed-off-by: Bjørn Mork <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>

On architectures where cputime_t is 64 bit type, is possible to trigger divide by zero on do_div(temp, (__force u32) total) line, if total is a non zero number but has lower 32 bit's zeroed. Removing casting is not a good solution since some do_div() implementations do cast to u32 internally. This problem can be triggered in practice on very long lived processes: PID: 2331 TASK: ffff880472814b00 CPU: 2 COMMAND: "oraagent.bin" #0 [ffff880472a51b70] machine_kexec at ffffffff8103214b #1 [ffff880472a51bd0] crash_kexec at ffffffff810b91c2 #2 [ffff880472a51ca0] oops_end at ffffffff814f0b00 #3 [ffff880472a51cd0] die at ffffffff8100f26b #4 [ffff880472a51d00] do_trap at ffffffff814f03f4 #5 [ffff880472a51d60] do_divide_error at ffffffff8100cfff torvalds#6 [ffff880472a51e00] divide_error at ffffffff8100be7b [exception RIP: thread_group_times+0x56] RIP: ffffffff81056a16 RSP: ffff880472a51eb8 RFLAGS: 00010046 RAX: bc3572c9fe12d194 RBX: ffff880874150800 RCX: 0000000110266fad RDX: 0000000000000000 RSI: ffff880472a51eb8 RDI: 001038ae7d9633dc RBP: ffff880472a51ef8 R8: 00000000b10a3a64 R9: ffff880874150800 R10: 00007fcba27ab680 R11: 0000000000000202 R12: ffff880472a51f08 R13: ffff880472a51f10 R14: 0000000000000000 R15: 0000000000000007 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 torvalds#7 [ffff880472a51f00] do_sys_times at ffffffff8108845d torvalds#8 [ffff880472a51f40] sys_times at ffffffff81088524 torvalds#9 [ffff880472a51f80] system_call_fastpath at ffffffff8100b0f2 RIP: 0000003808caac3a RSP: 00007fcba27ab6d8 RFLAGS: 00000202 RAX: 0000000000000064 RBX: ffffffff8100b0f2 RCX: 0000000000000000 RDX: 00007fcba27ab6e0 RSI: 000000000076d58e RDI: 00007fcba27ab6e0 RBP: 00007fcba27ab700 R8: 0000000000000020 R9: 000000000000091b R10: 00007fcba27ab680 R11: 0000000000000202 R12: 00007fff9ca41940 R13: 0000000000000000 R14: 00007fcba27ac9c0 R15: 00007fff9ca41940 ORIG_RAX: 0000000000000064 CS: 0033 SS: 002b Cc: [email protected] Signed-off-by: Stanislaw Gruszka <[email protected]> Signed-off-by: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Thomas Gleixner <[email protected]>

…lock Cong Wang reports that lockdep detected suspicious RCU usage while enabling IPV6 forwarding: [ 1123.310275] =============================== [ 1123.442202] [ INFO: suspicious RCU usage. ] [ 1123.558207] 3.6.0-rc1+ torvalds#109 Not tainted [ 1123.665204] ------------------------------- [ 1123.768254] include/linux/rcupdate.h:430 Illegal context switch in RCU read-side critical section! [ 1123.992320] [ 1123.992320] other info that might help us debug this: [ 1123.992320] [ 1124.307382] [ 1124.307382] rcu_scheduler_active = 1, debug_locks = 0 [ 1124.522220] 2 locks held by sysctl/5710: [ 1124.648364] #0: (rtnl_mutex){+.+.+.}, at: [<ffffffff81768498>] rtnl_trylock+0x15/0x17 [ 1124.882211] #1: (rcu_read_lock){.+.+.+}, at: [<ffffffff81871df8>] rcu_lock_acquire+0x0/0x29 [ 1125.085209] [ 1125.085209] stack backtrace: [ 1125.332213] Pid: 5710, comm: sysctl Not tainted 3.6.0-rc1+ torvalds#109 [ 1125.441291] Call Trace: [ 1125.545281] [<ffffffff8109d915>] lockdep_rcu_suspicious+0x109/0x112 [ 1125.667212] [<ffffffff8107c240>] rcu_preempt_sleep_check+0x45/0x47 [ 1125.781838] [<ffffffff8107c260>] __might_sleep+0x1e/0x19b [...] [ 1127.445223] [<ffffffff81757ac5>] call_netdevice_notifiers+0x4a/0x4f [...] [ 1127.772188] [<ffffffff8175e125>] dev_disable_lro+0x32/0x6b [ 1127.885174] [<ffffffff81872d26>] dev_forward_change+0x30/0xcb [ 1128.013214] [<ffffffff818738c4>] addrconf_forward_change+0x85/0xc5 [...] addrconf_forward_change() uses RCU iteration over the netdev list, which is unnecessary since it already holds the RTNL lock. We also cannot reasonably require netdevice notifier functions not to sleep. Reported-by: Cong Wang <[email protected]> Signed-off-by: Ben Hutchings <[email protected]> Signed-off-by: David S. Miller <[email protected]>

Commit 6f458df (tcp: improve latencies of timer triggered events) added bug leading to following trace : [ 2866.131281] IPv4: Attempt to release TCP socket in state 1 ffff880019ec0000 [ 2866.131726] [ 2866.132188] ========================= [ 2866.132281] [ BUG: held lock freed! ] [ 2866.132281] 3.6.0-rc1+ torvalds#622 Not tainted [ 2866.132281] ------------------------- [ 2866.132281] kworker/0:1/652 is freeing memory ffff880019ec0000-ffff880019ec0a1f, with a lock still held there! [ 2866.132281] (sk_lock-AF_INET-RPC){+.+...}, at: [<ffffffff81903619>] tcp_sendmsg+0x29/0xcc6 [ 2866.132281] 4 locks held by kworker/0:1/652: [ 2866.132281] #0: (rpciod){.+.+.+}, at: [<ffffffff81083567>] process_one_work+0x1de/0x47f [ 2866.132281] #1: ((&task->u.tk_work)){+.+.+.}, at: [<ffffffff81083567>] process_one_work+0x1de/0x47f [ 2866.132281] #2: (sk_lock-AF_INET-RPC){+.+...}, at: [<ffffffff81903619>] tcp_sendmsg+0x29/0xcc6 [ 2866.132281] #3: (&icsk->icsk_retransmit_timer){+.-...}, at: [<ffffffff81078017>] run_timer_softirq+0x1ad/0x35f [ 2866.132281] [ 2866.132281] stack backtrace: [ 2866.132281] Pid: 652, comm: kworker/0:1 Not tainted 3.6.0-rc1+ torvalds#622 [ 2866.132281] Call Trace: [ 2866.132281] <IRQ> [<ffffffff810bc527>] debug_check_no_locks_freed+0x112/0x159 [ 2866.132281] [<ffffffff818a0839>] ? __sk_free+0xfd/0x114 [ 2866.132281] [<ffffffff811549fa>] kmem_cache_free+0x6b/0x13a [ 2866.132281] [<ffffffff818a0839>] __sk_free+0xfd/0x114 [ 2866.132281] [<ffffffff818a08c0>] sk_free+0x1c/0x1e [ 2866.132281] [<ffffffff81911e1c>] tcp_write_timer+0x51/0x56 [ 2866.132281] [<ffffffff81078082>] run_timer_softirq+0x218/0x35f [ 2866.132281] [<ffffffff81078017>] ? run_timer_softirq+0x1ad/0x35f [ 2866.132281] [<ffffffff810f5831>] ? rb_commit+0x58/0x85 [ 2866.132281] [<ffffffff81911dcb>] ? tcp_write_timer_handler+0x148/0x148 [ 2866.132281] [<ffffffff81070bd6>] __do_softirq+0xcb/0x1f9 [ 2866.132281] [<ffffffff81a0a00c>] ? _raw_spin_unlock+0x29/0x2e [ 2866.132281] [<ffffffff81a1227c>] call_softirq+0x1c/0x30 [ 2866.132281] [<ffffffff81039f38>] do_softirq+0x4a/0xa6 [ 2866.132281] [<ffffffff81070f2b>] irq_exit+0x51/0xad [ 2866.132281] [<ffffffff81a129cd>] do_IRQ+0x9d/0xb4 [ 2866.132281] [<ffffffff81a0a3ef>] common_interrupt+0x6f/0x6f [ 2866.132281] <EOI> [<ffffffff8109d006>] ? sched_clock_cpu+0x58/0xd1 [ 2866.132281] [<ffffffff81a0a172>] ? _raw_spin_unlock_irqrestore+0x4c/0x56 [ 2866.132281] [<ffffffff81078692>] mod_timer+0x178/0x1a9 [ 2866.132281] [<ffffffff818a00aa>] sk_reset_timer+0x19/0x26 [ 2866.132281] [<ffffffff8190b2cc>] tcp_rearm_rto+0x99/0xa4 [ 2866.132281] [<ffffffff8190dfba>] tcp_event_new_data_sent+0x6e/0x70 [ 2866.132281] [<ffffffff8190f7ea>] tcp_write_xmit+0x7de/0x8e4 [ 2866.132281] [<ffffffff818a565d>] ? __alloc_skb+0xa0/0x1a1 [ 2866.132281] [<ffffffff8190f952>] __tcp_push_pending_frames+0x2e/0x8a [ 2866.132281] [<ffffffff81904122>] tcp_sendmsg+0xb32/0xcc6 [ 2866.132281] [<ffffffff819229c2>] inet_sendmsg+0xaa/0xd5 [ 2866.132281] [<ffffffff81922918>] ? inet_autobind+0x5f/0x5f [ 2866.132281] [<ffffffff810ee7f1>] ? trace_clock_local+0x9/0xb [ 2866.132281] [<ffffffff8189adab>] sock_sendmsg+0xa3/0xc4 [ 2866.132281] [<ffffffff810f5de6>] ? rb_reserve_next_event+0x26f/0x2d5 [ 2866.132281] [<ffffffff8103e6a9>] ? native_sched_clock+0x29/0x6f [ 2866.132281] [<ffffffff8103e6f8>] ? sched_clock+0x9/0xd [ 2866.132281] [<ffffffff810ee7f1>] ? trace_clock_local+0x9/0xb [ 2866.132281] [<ffffffff8189ae03>] kernel_sendmsg+0x37/0x43 [ 2866.132281] [<ffffffff8199ce49>] xs_send_kvec+0x77/0x80 [ 2866.132281] [<ffffffff8199cec1>] xs_sendpages+0x6f/0x1a0 [ 2866.132281] [<ffffffff8107826d>] ? try_to_del_timer_sync+0x55/0x61 [ 2866.132281] [<ffffffff8199d0d2>] xs_tcp_send_request+0x55/0xf1 [ 2866.132281] [<ffffffff8199bb90>] xprt_transmit+0x89/0x1db [ 2866.132281] [<ffffffff81999bcd>] ? call_connect+0x3c/0x3c [ 2866.132281] [<ffffffff81999d92>] call_transmit+0x1c5/0x20e [ 2866.132281] [<ffffffff819a0d55>] __rpc_execute+0x6f/0x225 [ 2866.132281] [<ffffffff81999bcd>] ? call_connect+0x3c/0x3c [ 2866.132281] [<ffffffff819a0f33>] rpc_async_schedule+0x28/0x34 [ 2866.132281] [<ffffffff810835d6>] process_one_work+0x24d/0x47f [ 2866.132281] [<ffffffff81083567>] ? process_one_work+0x1de/0x47f [ 2866.132281] [<ffffffff819a0f0b>] ? __rpc_execute+0x225/0x225 [ 2866.132281] [<ffffffff81083a6d>] worker_thread+0x236/0x317 [ 2866.132281] [<ffffffff81083837>] ? process_scheduled_works+0x2f/0x2f [ 2866.132281] [<ffffffff8108b7b8>] kthread+0x9a/0xa2 [ 2866.132281] [<ffffffff81a12184>] kernel_thread_helper+0x4/0x10 [ 2866.132281] [<ffffffff81a0a4b0>] ? retint_restore_args+0x13/0x13 [ 2866.132281] [<ffffffff8108b71e>] ? __init_kthread_worker+0x5a/0x5a [ 2866.132281] [<ffffffff81a12180>] ? gs_change+0x13/0x13 [ 2866.308506] IPv4: Attempt to release TCP socket in state 1 ffff880019ec0000 [ 2866.309689] ============================================================================= [ 2866.310254] BUG TCP (Not tainted): Object already free [ 2866.310254] ----------------------------------------------------------------------------- [ 2866.310254] The bug comes from the fact that timer set in sk_reset_timer() can run before we actually do the sock_hold(). socket refcount reaches zero and we free the socket too soon. timer handler is not allowed to reduce socket refcnt if socket is owned by the user, or we need to change sk_reset_timer() implementation. We should take a reference on the socket in case TCP_DELACK_TIMER_DEFERRED or TCP_DELACK_TIMER_DEFERRED bit are set in tsq_flags Also fix a typo in tcp_delack_timer(), where TCP_WRITE_TIMER_DEFERRED was used instead of TCP_DELACK_TIMER_DEFERRED. For consistency, use same socket refcount change for TCP_MTU_REDUCED_DEFERRED, even if not fired from a timer. Reported-by: Fengguang Wu <[email protected]> Tested-by: Fengguang Wu <[email protected]> Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>

Each page mapped in a process's address space must be correctly accounted for in _mapcount. Normally the rules for this are straightforward but hugetlbfs page table sharing is different. The page table pages at the PMD level are reference counted while the mapcount remains the same. If this accounting is wrong, it causes bugs like this one reported by Larry Woodman: kernel BUG at mm/filemap.c:135! invalid opcode: 0000 [#1] SMP CPU 22 Modules linked in: bridge stp llc sunrpc binfmt_misc dcdbas microcode pcspkr acpi_pad acpi] Pid: 18001, comm: mpitest Tainted: G W 3.3.0+ #4 Dell Inc. PowerEdge R620/07NDJ2 RIP: 0010:[<ffffffff8112cfed>] [<ffffffff8112cfed>] __delete_from_page_cache+0x15d/0x170 Process mpitest (pid: 18001, threadinfo ffff880428972000, task ffff880428b5cc20) Call Trace: delete_from_page_cache+0x40/0x80 truncate_hugepages+0x115/0x1f0 hugetlbfs_evict_inode+0x18/0x30 evict+0x9f/0x1b0 iput_final+0xe3/0x1e0 iput+0x3e/0x50 d_kill+0xf8/0x110 dput+0xe2/0x1b0 __fput+0x162/0x240 During fork(), copy_hugetlb_page_range() detects if huge_pte_alloc() shared page tables with the check dst_pte == src_pte. The logic is if the PMD page is the same, they must be shared. This assumes that the sharing is between the parent and child. However, if the sharing is with a different process entirely then this check fails as in this diagram: parent | ------------>pmd src_pte----------> data page ^ other--------->pmd--------------------| ^ child-----------| dst_pte For this situation to occur, it must be possible for Parent and Other to have faulted and failed to share page tables with each other. This is possible due to the following style of race. PROC A PROC B copy_hugetlb_page_range copy_hugetlb_page_range src_pte == huge_pte_offset src_pte == huge_pte_offset !src_pte so no sharing !src_pte so no sharing (time passes) hugetlb_fault hugetlb_fault huge_pte_alloc huge_pte_alloc huge_pmd_share huge_pmd_share LOCK(i_mmap_mutex) find nothing, no sharing UNLOCK(i_mmap_mutex) LOCK(i_mmap_mutex) find nothing, no sharing UNLOCK(i_mmap_mutex) pmd_alloc pmd_alloc LOCK(instantiation_mutex) fault UNLOCK(instantiation_mutex) LOCK(instantiation_mutex) fault UNLOCK(instantiation_mutex) These two processes are not poing to the same data page but are not sharing page tables because the opportunity was missed. When either process later forks, the src_pte == dst pte is potentially insufficient. As the check falls through, the wrong PTE information is copied in (harmless but wrong) and the mapcount is bumped for a page mapped by a shared page table leading to the BUG_ON. This patch addresses the issue by moving pmd_alloc into huge_pmd_share which guarantees that the shared pud is populated in the same critical section as pmd. This also means that huge_pte_offset test in huge_pmd_share is serialized correctly now which in turn means that the success of the sharing will be higher as the racing tasks see the pud and pmd populated together. Race identified and changelog written mostly by Mel Gorman. {[email protected]: attempt to make the huge_pmd_share() comment comprehensible, clean up coding style] Reported-by: Larry Woodman <[email protected]> Tested-by: Larry Woodman <[email protected]> Reviewed-by: Mel Gorman <[email protected]> Signed-off-by: Michal Hocko <[email protected]> Reviewed-by: Rik van Riel <[email protected]> Cc: David Gibson <[email protected]> Cc: Ken Chen <[email protected]> Cc: Cong Wang <[email protected]> Cc: Hillf Danton <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>

…tation We're hitting bug while trying to reinsert an already existing expectation: kernel BUG at kernel/timer.c:895! invalid opcode: 0000 [#1] SMP [...] Call Trace: <IRQ> [<ffffffffa0069563>] nf_ct_expect_related_report+0x4a0/0x57a [nf_conntrack] [<ffffffff812d423a>] ? in4_pton+0x72/0x131 [<ffffffffa00ca69e>] ip_nat_sdp_media+0xeb/0x185 [nf_nat_sip] [<ffffffffa00b5b9b>] set_expected_rtp_rtcp+0x32d/0x39b [nf_conntrack_sip] [<ffffffffa00b5f15>] process_sdp+0x30c/0x3ec [nf_conntrack_sip] [<ffffffff8103f1eb>] ? irq_exit+0x9a/0x9c [<ffffffffa00ca738>] ? ip_nat_sdp_media+0x185/0x185 [nf_nat_sip] We have to remove the RTP expectation if the RTCP expectation hits EBUSY since we keep trying with other ports until we succeed. Reported-by: Rafal Fitt <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>

Following lockdep splat was reported by Pavel Roskin : [ 1570.586223] =============================== [ 1570.586225] [ INFO: suspicious RCU usage. ] [ 1570.586228] 3.6.0-rc3-wl-main torvalds#98 Not tainted [ 1570.586229] ------------------------------- [ 1570.586231] /home/proski/src/linux/net/ipv4/route.c:645 suspicious rcu_dereference_check() usage! [ 1570.586233] [ 1570.586233] other info that might help us debug this: [ 1570.586233] [ 1570.586236] [ 1570.586236] rcu_scheduler_active = 1, debug_locks = 0 [ 1570.586238] 2 locks held by Chrome_IOThread/4467: [ 1570.586240] #0: (slock-AF_INET){+.-...}, at: [<ffffffff814f2c0c>] release_sock+0x2c/0xa0 [ 1570.586253] #1: (fnhe_lock){+.-...}, at: [<ffffffff815302fc>] update_or_create_fnhe+0x2c/0x270 [ 1570.586260] [ 1570.586260] stack backtrace: [ 1570.586263] Pid: 4467, comm: Chrome_IOThread Not tainted 3.6.0-rc3-wl-main torvalds#98 [ 1570.586265] Call Trace: [ 1570.586271] [<ffffffff810976ed>] lockdep_rcu_suspicious+0xfd/0x130 [ 1570.586275] [<ffffffff8153042c>] update_or_create_fnhe+0x15c/0x270 [ 1570.586278] [<ffffffff815305b3>] __ip_rt_update_pmtu+0x73/0xb0 [ 1570.586282] [<ffffffff81530619>] ip_rt_update_pmtu+0x29/0x90 [ 1570.586285] [<ffffffff815411dc>] inet_csk_update_pmtu+0x2c/0x80 [ 1570.586290] [<ffffffff81558d1e>] tcp_v4_mtu_reduced+0x2e/0xc0 [ 1570.586293] [<ffffffff81553bc4>] tcp_release_cb+0xa4/0xb0 [ 1570.586296] [<ffffffff814f2c35>] release_sock+0x55/0xa0 [ 1570.586300] [<ffffffff815442ef>] tcp_sendmsg+0x4af/0xf50 [ 1570.586305] [<ffffffff8156fc60>] inet_sendmsg+0x120/0x230 [ 1570.586308] [<ffffffff8156fb40>] ? inet_sk_rebuild_header+0x40/0x40 [ 1570.586312] [<ffffffff814f4bdd>] ? sock_update_classid+0xbd/0x3b0 [ 1570.586315] [<ffffffff814f4c50>] ? sock_update_classid+0x130/0x3b0 [ 1570.586320] [<ffffffff814ec435>] do_sock_write+0xc5/0xe0 [ 1570.586323] [<ffffffff814ec4a3>] sock_aio_write+0x53/0x80 [ 1570.586328] [<ffffffff8114bc83>] do_sync_write+0xa3/0xe0 [ 1570.586332] [<ffffffff8114c5a5>] vfs_write+0x165/0x180 [ 1570.586335] [<ffffffff8114c805>] sys_write+0x45/0x90 [ 1570.586340] [<ffffffff815d2722>] system_call_fastpath+0x16/0x1b Signed-off-by: Eric Dumazet <[email protected]> Reported-by: Pavel Roskin <[email protected]> Signed-off-by: David S. Miller <[email protected]>

After commit 26b8852 ("mmc: omap_hsmmc: remove private DMA API implementation"), the Nokia N800 here stopped booting: [ 2.086181] Waiting for root device /dev/mmcblk0p1... [ 2.324066] Unhandled fault: imprecise external abort (0x406) at 0x00000000 [ 2.331451] Internal error: : 406 [#1] ARM [ 2.335784] Modules linked in: [ 2.339050] CPU: 0 Not tainted (3.6.0-rc3 torvalds#60) [ 2.344146] PC is at default_idle+0x28/0x30 [ 2.348602] LR is at trace_hardirqs_on_caller+0x15c/0x1b0 ... This turned out to be due to memory corruption caused by long-broken PIO code in drivers/mmc/host/omap.c. (Previously, this driver had been using DMA; but the above commit caused the MMC driver to fall back to PIO mode with an unmodified Kconfig.) The PIO code, added with the rest of the driver in commit 730c9b7 ("[MMC] Add OMAP MMC host driver"), confused bytes with 16-bit words. This bug caused memory located after the PIO transfer buffer to be corrupted with transfers larger than 32 bytes. The driver also did not increment the buffer pointer after the transfer occurred. This bug resulted in data corruption during any transfer larger than 64 bytes. Signed-off-by: Paul Walmsley <[email protected]> Reviewed-by: Felipe Balbi <[email protected]> Tested-by: Tony Lindgren <[email protected]> Signed-off-by: Chris Ball <[email protected]>

[ 679.807480] BUG: unable to handle kernel NULL pointer dereference at 00000074 [ 679.814457] IP: [<de93b5bf>] picolcd_led_set_brightness+0x1f/0xb0 [hid_picolcd] [ 679.814457] *pde = 00000000 [ 679.814457] Oops: 0000 [#1] [ 679.814457] Modules linked in: hid_picolcd fb_sys_fops sysimgblt sysfillrect syscopyarea drm_kms_helper nfs lockd nfs_acl sunrpc [last unloaded: hid_picolcd] [ 679.814457] [ 679.814457] Pid: 272, comm: khubd Not tainted 3.5.0-jupiter-00006-g463a4c0 #1 NVIDIA Corporation. nFORCE-MCP/MS-6373 [ 679.814457] EIP: 0060:[<de93b5bf>] EFLAGS: 00010246 CPU: 0 [ 679.814457] EIP is at picolcd_led_set_brightness+0x1f/0xb0 [hid_picolcd] [ 679.814457] EAX: 00000000 EBX: d9f0c4e0 ECX: 00000000 EDX: 00000000 [ 679.814457] ESI: 00000000 EDI: dd6b79c0 EBP: dd4f7d90 ESP: dd4f7d80 [ 679.814457] DS: 007b ES: 007b FS: 0000 GS: 00e0 SS: 0068 [ 679.814457] CR0: 8005003b CR2: 00000074 CR3: 1d74e000 CR4: 000007d0 [ 679.814457] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000 [ 679.814457] DR6: ffff0ff0 DR7: 00000400 [ 679.814457] Process khubd (pid: 272, ti=dd4f6000 task=dd442470 task.ti=dd4f6000) [ 679.814457] Stack: [ 679.814457] 1d6c3300 d9f0c4e0 d9f0c4e0 dd6b79c0 dd4f7da0 c132912a 00000000 d9f0c4e0 [ 679.814457] dd4f7dac c132935d 00000000 dd4f7dc0 de93b847 dd6b79c0 00000282 c700ecc8 [ 679.814457] dd4f7ddc de93924f 00000004 c700ecc8 c700e060 c700ecbc c15ee300 dd4f7dec [ 679.814457] Call Trace: [ 679.814457] [<c132912a>] led_brightness_set+0x2a/0x30 [ 679.814457] [<c132935d>] led_classdev_unregister+0xd/0x50 [ 679.814457] [<de93b847>] picolcd_exit_leds+0x27/0x40 [hid_picolcd] [ 679.814457] [<de93924f>] picolcd_remove+0xbf/0x110 [hid_picolcd] [ 679.814457] [<c132c5dd>] hid_device_remove+0x3d/0x80 [ 679.814457] [<c1294126>] __device_release_driver+0x56/0xa0 [ 679.814457] [<c1294190>] device_release_driver+0x20/0x30 [ 679.814457] [<c1293bbf>] bus_remove_device+0x9f/0xc0 [ 679.814457] [<c1291a1d>] device_del+0xdd/0x150 [ 679.814457] [<c132c205>] hid_destroy_device+0x25/0x60 [ 679.814457] [<c13368cb>] usbhid_disconnect+0x1b/0x40 [ 679.814457] [<c12f4976>] usb_unbind_interface+0x46/0x170 [ 679.814457] [<c1294126>] __device_release_driver+0x56/0xa0 [ 679.814457] [<c1294190>] device_release_driver+0x20/0x30 [ 679.814457] [<c1293bbf>] bus_remove_device+0x9f/0xc0 [ 679.814457] [<c1291a1d>] device_del+0xdd/0x150 [ 679.814457] [<c12f2975>] usb_disable_device+0x85/0x1a0 [ 679.814457] [<c1053146>] ? __cond_resched+0x16/0x30 [ 679.814457] [<c12ebdb0>] usb_disconnect+0x80/0xf0 [ 679.814457] [<c12ed61f>] hub_thread+0x3df/0x1030 [ 679.814457] [<c10484a0>] ? wake_up_bit+0x30/0x30 [ 679.814457] [<c12ed240>] ? usb_remote_wakeup+0x40/0x40 [ 679.814457] [<c1047f94>] kthread+0x74/0x80 [ 679.814457] [<c1047f20>] ? flush_kthread_worker+0x90/0x90 [ 679.814457] [<c140e33e>] kernel_thread_helper+0x6/0xd [ 679.814457] Code: e0 25 00 e0 ff ff ff 48 14 eb 99 90 55 89 e5 83 ec 10 89 5d f4 89 75 f8 89 c3 89 7d fc 8b 40 1c 89 d6 8b 00 e8 13 89 95 e2 31 c9 <39> 5c 88 74 74 13 41 83 f9 08 75 f4 8b 5d f4 8b 75 f8 8b 7d fc [ 679.814457] EIP: [<de93b5bf>] picolcd_led_set_brightness+0x1f/0xb0 [hid_picolcd] SS:ESP 0068:dd4f7d80 [ 679.814457] CR2: 0000000000000074 [ 680.116438] ---[ end trace 6f0d9d63bff280ff ]--- Signed-off-by: Bruno Prémont <[email protected]> Signed-off-by: Jiri Kosina <[email protected]>

Murali Nalajala reports a regression that ioremapping address zero results in an oops dump: Unable to handle kernel paging request at virtual address fa200000 pgd = d4f80000 [fa200000] *pgd=00000000 Internal error: Oops: 5 [#1] PREEMPT SMP ARM Modules linked in: CPU: 0 Tainted: G W (3.4.0-g3b5f728-00009-g638207a torvalds#13) PC is at msm_pm_config_rst_vector_before_pc+0x8/0x30 LR is at msm_pm_boot_config_before_pc+0x18/0x20 pc : [<c0078f84>] lr : [<c007903c>] psr: a0000093 sp : c0837ef0 ip : cfe00000 fp : 0000000d r10: da7efc17 r9 : 225c4278 r8 : 00000006 r7 : 0003c000 r6 : c085c824 r5 : 00000001 r4 : fa101000 r3 : fa200000 r2 : c095080c r1 : 002250fc r0 : 00000000 Flags: NzCv IRQs off FIQs on Mode SVC_32 ISA ARM Segment kernel Control: 10c5387d Table: 25180059 DAC: 00000015 [<c0078f84>] (msm_pm_config_rst_vector_before_pc+0x8/0x30) from [<c007903c>] (msm_pm_boot_config_before_pc+0x18/0x20) [<c007903c>] (msm_pm_boot_config_before_pc+0x18/0x20) from [<c007a55c>] (msm_pm_power_collapse+0x410/0xb04) [<c007a55c>] (msm_pm_power_collapse+0x410/0xb04) from [<c007b17c>] (arch_idle+0x294/0x3e0) [<c007b17c>] (arch_idle+0x294/0x3e0) from [<c000eed8>] (default_idle+0x18/0x2c) [<c000eed8>] (default_idle+0x18/0x2c) from [<c000f254>] (cpu_idle+0x90/0xe4) [<c000f254>] (cpu_idle+0x90/0xe4) from [<c057231c>] (rest_init+0x88/0xa0) [<c057231c>] (rest_init+0x88/0xa0) from [<c07ff890>] (start_kernel+0x3a8/0x40c) Code: c0704256 e12fff1e e59f2020 e5923000 (e5930000) This is caused by the 'reserved' entries which we insert (see 19b52ab - ARM: 7438/1: fill possible PMD empty section gaps) which get matched for physical address zero. Resolve this by marking these reserved entries with a different flag. Cc: <[email protected]> Tested-by: Murali Nalajala <[email protected]> Acked-by: Nicolas Pitre <[email protected]> Signed-off-by: Russell King <[email protected]>

Fixes following lockdep splat : [ 1614.734896] ============================================= [ 1614.734898] [ INFO: possible recursive locking detected ] [ 1614.734901] 3.6.0-rc3+ torvalds#782 Not tainted [ 1614.734903] --------------------------------------------- [ 1614.734905] swapper/11/0 is trying to acquire lock: [ 1614.734907] (slock-AF_INET){+.-...}, at: [<ffffffffa0209d72>] l2tp_xmit_skb+0x172/0xa50 [l2tp_core] [ 1614.734920] [ 1614.734920] but task is already holding lock: [ 1614.734922] (slock-AF_INET){+.-...}, at: [<ffffffff815fce23>] tcp_v4_err+0x163/0x6b0 [ 1614.734932] [ 1614.734932] other info that might help us debug this: [ 1614.734935] Possible unsafe locking scenario: [ 1614.734935] [ 1614.734937] CPU0 [ 1614.734938] ---- [ 1614.734940] lock(slock-AF_INET); [ 1614.734943] lock(slock-AF_INET); [ 1614.734946] [ 1614.734946] *** DEADLOCK *** [ 1614.734946] [ 1614.734949] May be due to missing lock nesting notation [ 1614.734949] [ 1614.734952] 7 locks held by swapper/11/0: [ 1614.734954] #0: (rcu_read_lock){.+.+..}, at: [<ffffffff81592801>] __netif_receive_skb+0x251/0xd00 [ 1614.734964] #1: (rcu_read_lock){.+.+..}, at: [<ffffffff815d319c>] ip_local_deliver_finish+0x4c/0x4e0 [ 1614.734972] #2: (rcu_read_lock){.+.+..}, at: [<ffffffff8160d116>] icmp_socket_deliver+0x46/0x230 [ 1614.734982] #3: (slock-AF_INET){+.-...}, at: [<ffffffff815fce23>] tcp_v4_err+0x163/0x6b0 [ 1614.734989] #4: (rcu_read_lock){.+.+..}, at: [<ffffffff815da240>] ip_queue_xmit+0x0/0x680 [ 1614.734997] #5: (rcu_read_lock_bh){.+....}, at: [<ffffffff815d9925>] ip_finish_output+0x135/0x890 [ 1614.735004] torvalds#6: (rcu_read_lock_bh){.+....}, at: [<ffffffff81595680>] dev_queue_xmit+0x0/0xe00 [ 1614.735012] [ 1614.735012] stack backtrace: [ 1614.735016] Pid: 0, comm: swapper/11 Not tainted 3.6.0-rc3+ torvalds#782 [ 1614.735018] Call Trace: [ 1614.735020] <IRQ> [<ffffffff810a50ac>] __lock_acquire+0x144c/0x1b10 [ 1614.735033] [<ffffffff810a334b>] ? check_usage+0x9b/0x4d0 [ 1614.735037] [<ffffffff810a6762>] ? mark_held_locks+0x82/0x130 [ 1614.735042] [<ffffffff810a5df0>] lock_acquire+0x90/0x200 [ 1614.735047] [<ffffffffa0209d72>] ? l2tp_xmit_skb+0x172/0xa50 [l2tp_core] [ 1614.735051] [<ffffffff810a69ad>] ? trace_hardirqs_on+0xd/0x10 [ 1614.735060] [<ffffffff81749b31>] _raw_spin_lock+0x41/0x50 [ 1614.735065] [<ffffffffa0209d72>] ? l2tp_xmit_skb+0x172/0xa50 [l2tp_core] [ 1614.735069] [<ffffffffa0209d72>] l2tp_xmit_skb+0x172/0xa50 [l2tp_core] [ 1614.735075] [<ffffffffa014f7f2>] l2tp_eth_dev_xmit+0x32/0x60 [l2tp_eth] [ 1614.735079] [<ffffffff81595112>] dev_hard_start_xmit+0x502/0xa70 [ 1614.735083] [<ffffffff81594c6e>] ? dev_hard_start_xmit+0x5e/0xa70 [ 1614.735087] [<ffffffff815957c1>] ? dev_queue_xmit+0x141/0xe00 [ 1614.735093] [<ffffffff815b622e>] sch_direct_xmit+0xfe/0x290 [ 1614.735098] [<ffffffff81595865>] dev_queue_xmit+0x1e5/0xe00 [ 1614.735102] [<ffffffff81595680>] ? dev_hard_start_xmit+0xa70/0xa70 [ 1614.735106] [<ffffffff815b4daa>] ? eth_header+0x3a/0xf0 [ 1614.735111] [<ffffffff8161d33e>] ? fib_get_table+0x2e/0x280 [ 1614.735117] [<ffffffff8160a7e2>] arp_xmit+0x22/0x60 [ 1614.735121] [<ffffffff8160a863>] arp_send+0x43/0x50 [ 1614.735125] [<ffffffff8160b82f>] arp_solicit+0x18f/0x450 [ 1614.735132] [<ffffffff8159d9da>] neigh_probe+0x4a/0x70 [ 1614.735137] [<ffffffff815a191a>] __neigh_event_send+0xea/0x300 [ 1614.735141] [<ffffffff815a1c93>] neigh_resolve_output+0x163/0x260 [ 1614.735146] [<ffffffff815d9cf5>] ip_finish_output+0x505/0x890 [ 1614.735150] [<ffffffff815d9925>] ? ip_finish_output+0x135/0x890 [ 1614.735154] [<ffffffff815dae79>] ip_output+0x59/0xf0 [ 1614.735158] [<ffffffff815da1cd>] ip_local_out+0x2d/0xa0 [ 1614.735162] [<ffffffff815da403>] ip_queue_xmit+0x1c3/0x680 [ 1614.735165] [<ffffffff815da240>] ? ip_local_out+0xa0/0xa0 [ 1614.735172] [<ffffffff815f4402>] tcp_transmit_skb+0x402/0xa60 [ 1614.735177] [<ffffffff815f5a11>] tcp_retransmit_skb+0x1a1/0x620 [ 1614.735181] [<ffffffff815f7e93>] tcp_retransmit_timer+0x393/0x960 [ 1614.735185] [<ffffffff815fce23>] ? tcp_v4_err+0x163/0x6b0 [ 1614.735189] [<ffffffff815fd317>] tcp_v4_err+0x657/0x6b0 [ 1614.735194] [<ffffffff8160d116>] ? icmp_socket_deliver+0x46/0x230 [ 1614.735199] [<ffffffff8160d19e>] icmp_socket_deliver+0xce/0x230 [ 1614.735203] [<ffffffff8160d116>] ? icmp_socket_deliver+0x46/0x230 [ 1614.735208] [<ffffffff8160d464>] icmp_unreach+0xe4/0x2c0 [ 1614.735213] [<ffffffff8160e520>] icmp_rcv+0x350/0x4a0 [ 1614.735217] [<ffffffff815d3285>] ip_local_deliver_finish+0x135/0x4e0 [ 1614.735221] [<ffffffff815d319c>] ? ip_local_deliver_finish+0x4c/0x4e0 [ 1614.735225] [<ffffffff815d3ffa>] ip_local_deliver+0x4a/0x90 [ 1614.735229] [<ffffffff815d37b7>] ip_rcv_finish+0x187/0x730 [ 1614.735233] [<ffffffff815d425d>] ip_rcv+0x21d/0x300 [ 1614.735237] [<ffffffff81592a1b>] __netif_receive_skb+0x46b/0xd00 [ 1614.735241] [<ffffffff81592801>] ? __netif_receive_skb+0x251/0xd00 [ 1614.735245] [<ffffffff81593368>] process_backlog+0xb8/0x180 [ 1614.735249] [<ffffffff81593cf9>] net_rx_action+0x159/0x330 [ 1614.735257] [<ffffffff810491f0>] __do_softirq+0xd0/0x3e0 [ 1614.735264] [<ffffffff8109ed24>] ? tick_program_event+0x24/0x30 [ 1614.735270] [<ffffffff8175419c>] call_softirq+0x1c/0x30 [ 1614.735278] [<ffffffff8100425d>] do_softirq+0x8d/0xc0 [ 1614.735282] [<ffffffff8104983e>] irq_exit+0xae/0xe0 [ 1614.735287] [<ffffffff8175494e>] smp_apic_timer_interrupt+0x6e/0x99 [ 1614.735291] [<ffffffff81753a1c>] apic_timer_interrupt+0x6c/0x80 [ 1614.735293] <EOI> [<ffffffff810a14ad>] ? trace_hardirqs_off+0xd/0x10 [ 1614.735306] [<ffffffff81336f85>] ? intel_idle+0xf5/0x150 [ 1614.735310] [<ffffffff81336f7e>] ? intel_idle+0xee/0x150 [ 1614.735317] [<ffffffff814e6ea9>] cpuidle_enter+0x19/0x20 [ 1614.735321] [<ffffffff814e7538>] cpuidle_idle_call+0xa8/0x630 [ 1614.735327] [<ffffffff8100c1ba>] cpu_idle+0x8a/0xe0 [ 1614.735333] [<ffffffff8173762e>] start_secondary+0x220/0x222 Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>

In znet_probe(), strncmp() may access beyond 0x100000 and trigger the below oops in kvm. Fix it by limiting the loop under 0x100000-8. I suspect the limit could be further decreased to 0x100000-sizeof(struct netidblk), however no datasheet at hand.. [ 3.744312] BUG: unable to handle kernel paging request at 80100000 [ 3.746145] IP: [<8119d12a>] strncmp+0xc/0x20 [ 3.747446] *pde = 01d10067 *pte = 00100160 [ 3.747493] Oops: 0000 [#1] DEBUG_PAGEALLOC [ 3.747493] Pid: 1, comm: swapper Not tainted 3.6.0-rc1-00018-g57bfc0a torvalds#73 Bochs Bochs [ 3.747493] EIP: 0060:[<8119d12a>] EFLAGS: 00010206 CPU: 0 [ 3.747493] EIP is at strncmp+0xc/0x20 [ 3.747493] EAX: 800fff4e EBX: 00000006 ECX: 00000006 EDX: 814d2bb9 [ 3.747493] ESI: 80100000 EDI: 814d2bba EBP: 8e03dfa0 ESP: 8e03df98 [ 3.747493] DS: 007b ES: 007b FS: 0000 GS: 00e0 SS: 0068 [ 3.747493] CR0: 8005003b CR2: 80100000 CR3: 016f7000 CR4: 00000690 [ 3.747493] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000 [ 3.747493] DR6: ffff0ff0 DR7: 00000400 [ 3.747493] Process swapper (pid: 1, ti=8e03c000 task=8e040000 task.ti=8e03c000) [ 3.747493] Stack: [ 3.747493] 800fffff 00000000 8e03dfb4 816a1376 00000006 816a134a 00000000 8e03dfd0 [ 3.747493] 816819b5 816ed1c0 8e03dfe4 00000006 00000123 816ed604 8e03dfe4 81681b29 [ 3.747493] 00000000 81681a5b 00000000 00000000 8134e542 00000000 00000000 00000000 [ 3.747493] Call Trace: [ 3.747493] [<816a1376>] znet_probe+0x2c/0x26b [ 3.747493] [<816a134a>] ? dnet_driver_init+0xf/0xf [ 3.747493] [<816819b5>] do_one_initcall+0x6a/0x110 [ 3.747493] [<81681b29>] kernel_init+0xce/0x14b Signed-off-by: Fengguang Wu <[email protected]> Signed-off-by: David S. Miller <[email protected]>

48d480b [[MIPS] Malta: Fix off by one bug in interrupt handler.] did not take in account that irq_ffs() will also return 0 if for some reason the set of pending interrupts happens to be empty. This is trivial to trigger with a RM5261 CPU module running a 64-bit kernel and results in something like the following: CPU 0 Unable to handle kernel paging request at virtual address 0000000000000000, epc == ffffffff801772d0, ra == ffffffff8017ad24 Oops[#1]: Cpu 0 $ 0 : 0000000000000000 ffffffff9000a4e0 ffffffff9000a4e0 ffffffff9000a4e0 $ 4 : ffffffff80592be0 0000000000000000 00000000000000d6 ffffffff80322ed0 $ 8 : ffffffff805fe538 0000000000000000 ffffffff9000a4e0 ffffffff80590000 $12 : 00000000000000d6 0000000000000000 ffffffff80600000 ffffffff805fe538 $16 : 0000000000000000 0000000000000010 ffffffff80592be0 0000000000000010 $20 : 0000000000000000 0000000000500001 0000000000000000 ffffffff8051e078 $24 : 0000000000000028 ffffffff803226e8 $28 : 9800000003828000 980000000382b900 ffffffff8051e060 ffffffff8017ad24 Hi : 0000000000000000 Lo : 0000006388974000 epc : ffffffff801772d0 handle_irq_event_percpu+0x70/0x2f0 Not tainted ra : ffffffff8017ad24 handle_percpu_irq+0x54/0x88 Status: 9000a4e2 KX SX UX KERNEL EXL Cause : 00808008 BadVA : 0000000000000000 PrId : 000028a0 (Nevada) Modules linked in: Process init (pid: 1, threadinfo=9800000003828000, task=9800000003827968, tls=0000000077087490) Stack : ffffffff80592be0 ffffffff8058d248 0000000000000040 0000000000000000 ffffffff80613340 0000000000500001 ffffffff805a0000 0000000000000882 9800000003b89000 ffffffff8017ad24 00000000000000d5 0000000000000010 ffffffff9000a4e1 ffffffff801769f4 ffffffff9000a4e0 ffffffff801037f8 0000000000000000 ffffffff80101c44 0000000000000000 ffffffff9000a4e0 0000000000000000 9000000018000000 90000000180003f9 0000000000000001 0000000000000000 00000000000000ff 0000000000000018 0000000000000001 0000000000000001 00000000003fffff 0000000000000020 ffffffff802cf7ac ffffffff80208918 000000007fdadf08 ffffffff80612d88 ffffffff9000a4e1 0000000000000040 0000000000000000 ffffffff80613340 0000000000500001 ... Call Trace: [<ffffffff801772d0>] handle_irq_event_percpu+0x70/0x2f0 [<ffffffff8017ad24>] handle_percpu_irq+0x54/0x88 [<ffffffff801769f4>] generic_handle_irq+0x44/0x60 [<ffffffff801037f8>] do_IRQ+0x48/0x70 [<ffffffff80101c44>] ret_from_irq+0x0/0x4 [<ffffffff80326170>] serial8250_startup+0x310/0x870 [<ffffffff8032175c>] uart_startup.part.7+0x9c/0x330 [<ffffffff80321b4c>] uart_open+0x15c/0x1b0 [<ffffffff80302034>] tty_open+0x1fc/0x720 [<ffffffff801bffac>] chrdev_open+0x7c/0x180 [<ffffffff801b9ab8>] do_dentry_open.isra.14+0x288/0x390 [<ffffffff801bac5c>] nameidata_to_filp+0x5c/0xc0 [<ffffffff801ca700>] do_last.isra.33+0x330/0x8f0 [<ffffffff801caf3c>] path_openat+0xbc/0x440 [<ffffffff801cb3c8>] do_filp_open+0x38/0xa8 [<ffffffff801bade4>] do_sys_open+0x124/0x218 [<ffffffff80110538>] handle_sys+0x118/0x13c Code: 02d5a825 12800012 02a0b02d <de820000> de850008 0040f809 0220202d 0040a82d 40026000 ---[ end trace 5d8e7b9a86badd2d ]--- Kernel panic - not syncing: Fatal exception in interrupt Signed-off-by: Ralf Baechle <[email protected]>

busy_worker_rebind_fn() didn't clear WORKER_REBIND if rebinding failed (CPU is down again). This used to be okay because the flag wasn't used for anything else. However, after 25511a4 "workqueue: reimplement CPU online rebinding to handle idle workers", WORKER_REBIND is also used to command idle workers to rebind. If not cleared, the worker may confuse the next CPU_UP cycle by having REBIND spuriously set or oops / get stuck by prematurely calling idle_worker_rebind(). WARNING: at /work/os/wq/kernel/workqueue.c:1323 worker_thread+0x4cd/0x5 00() Hardware name: Bochs Modules linked in: test_wq(O-) Pid: 33, comm: kworker/1:1 Tainted: G O 3.6.0-rc1-work+ #3 Call Trace: [<ffffffff8109039f>] warn_slowpath_common+0x7f/0xc0 [<ffffffff810903fa>] warn_slowpath_null+0x1a/0x20 [<ffffffff810b3f1d>] worker_thread+0x4cd/0x500 [<ffffffff810bc16e>] kthread+0xbe/0xd0 [<ffffffff81bd2664>] kernel_thread_helper+0x4/0x10 ---[ end trace e977cf20f4661968 ]--- BUG: unable to handle kernel NULL pointer dereference at (null) IP: [<ffffffff810b3db0>] worker_thread+0x360/0x500 PGD 0 Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC Modules linked in: test_wq(O-) CPU 0 Pid: 33, comm: kworker/1:1 Tainted: G W O 3.6.0-rc1-work+ #3 Bochs Bochs RIP: 0010:[<ffffffff810b3db0>] [<ffffffff810b3db0>] worker_thread+0x360/0x500 RSP: 0018:ffff88001e1c9de0 EFLAGS: 00010086 RAX: 0000000000000000 RBX: ffff88001e633e00 RCX: 0000000000004140 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000009 RBP: ffff88001e1c9ea0 R08: 0000000000000000 R09: 0000000000000001 R10: 0000000000000002 R11: 0000000000000000 R12: ffff88001fc8d580 R13: ffff88001fc8d590 R14: ffff88001e633e20 R15: ffff88001e1c6900 FS: 0000000000000000(0000) GS:ffff88001fc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 0000000000000000 CR3: 00000000130e8000 CR4: 00000000000006f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process kworker/1:1 (pid: 33, threadinfo ffff88001e1c8000, task ffff88001e1c6900) Stack: ffff880000000000 ffff88001e1c9e40 0000000000000001 ffff88001e1c8010 ffff88001e519c78 ffff88001e1c9e58 ffff88001e1c6900 ffff88001e1c6900 ffff88001e1c6900 ffff88001e1c6900 ffff88001fc8d340 ffff88001fc8d340 Call Trace: [<ffffffff810bc16e>] kthread+0xbe/0xd0 [<ffffffff81bd2664>] kernel_thread_helper+0x4/0x10 Code: b1 00 f6 43 48 02 0f 85 91 01 00 00 48 8b 43 38 48 89 df 48 8b 00 48 89 45 90 e8 ac f0 ff ff 3c 01 0f 85 60 01 00 00 48 8b 53 50 <8b> 02 83 e8 01 85 c0 89 02 0f 84 3b 01 00 00 48 8b 43 38 48 8b RIP [<ffffffff810b3db0>] worker_thread+0x360/0x500 RSP <ffff88001e1c9de0> CR2: 0000000000000000 There was no reason to keep WORKER_REBIND on failure in the first place - WORKER_UNBOUND is guaranteed to be set in such cases preventing incorrectly activating concurrency management. Always clear WORKER_REBIND. tj: Updated comment and description. Signed-off-by: Lai Jiangshan <[email protected]> Signed-off-by: Tejun Heo <[email protected]>

Cancel work of the xfs_sync_worker before teardown of the log in xfs_unmountfs. This prevents occasional crashes on unmount like so: PID: 21602 TASK: ee9df060 CPU: 0 COMMAND: "kworker/0:3" #0 [c5377d28] crash_kexec at c0292c94 #1 [c5377d80] oops_end at c07090c2 #2 [c5377d98] no_context at c06f614e #3 [c5377dbc] __bad_area_nosemaphore at c06f6281 #4 [c5377df4] bad_area_nosemaphore at c06f629b #5 [c5377e00] do_page_fault at c070b0cb torvalds#6 [c5377e7c] error_code (via page_fault) at c070892c EAX: f300c6a8 EBX: f300c6a8 ECX: 000000c0 EDX: 000000c0 EBP: c5377ed0 DS: 007b ESI: 00000000 ES: 007b EDI: 00000001 GS: ffffad20 CS: 0060 EIP: c0481ad0 ERR: ffffffff EFLAGS: 00010246 torvalds#7 [c5377eb0] atomic64_read_cx8 at c0481ad0 torvalds#8 [c5377ebc] xlog_assign_tail_lsn_locked at f7cc7c6e [xfs] torvalds#9 [c5377ed4] xfs_trans_ail_delete_bulk at f7ccd520 [xfs] torvalds#10 [c5377f0c] xfs_buf_iodone at f7ccb602 [xfs] torvalds#11 [c5377f24] xfs_buf_do_callbacks at f7cca524 [xfs] torvalds#12 [c5377f30] xfs_buf_iodone_callbacks at f7cca5da [xfs] torvalds#13 [c5377f4c] xfs_buf_iodone_work at f7c718d0 [xfs] torvalds#14 [c5377f58] process_one_work at c024ee4c torvalds#15 [c5377f98] worker_thread at c024f43d torvalds#16 [c5377fb] kthread at c025326b torvalds#17 [c5377fe8] kernel_thread_helper at c070e834 PID: 26653 TASK: e79143b0 CPU: 3 COMMAND: "umount" #0 [cde0fda0] __schedule at c0706595 #1 [cde0fe28] schedule at c0706b89 #2 [cde0fe30] schedule_timeout at c0705600 #3 [cde0fe94] __down_common at c0706098 #4 [cde0fec8] __down at c0706122 #5 [cde0fed0] down at c025936f torvalds#6 [cde0fee0] xfs_buf_lock at f7c7131d [xfs] torvalds#7 [cde0ff00] xfs_freesb at f7cc2236 [xfs] torvalds#8 [cde0ff10] xfs_fs_put_super at f7c80f21 [xfs] torvalds#9 [cde0ff1c] generic_shutdown_super at c0333d7a torvalds#10 [cde0ff38] kill_block_super at c0333e0f torvalds#11 [cde0ff48] deactivate_locked_super at c0334218 torvalds#12 [cde0ff58] deactivate_super at c033495d torvalds#13 [cde0ff68] mntput_no_expire at c034bc13 torvalds#14 [cde0ff7c] sys_umount at c034cc69 torvalds#15 [cde0ffa0] sys_oldumount at c034ccd4 torvalds#16 [cde0ffb0] system_call at c0707e66 commit 11159a0 added this to xfs_log_unmount and needs to be cleaned up at a later date. Signed-off-by: Ben Myers <[email protected]> Reviewed-by: Dave Chinner <[email protected]> Reviewed-by: Mark Tinguely <[email protected]>

When call_crda() is called we kick off a witch hunt search for the same regulatory domain on our internal regulatory database and that work gets kicked off on a workqueue, this is done while the cfg80211_mutex is held. If that workqueue kicks off it will first lock reg_regdb_search_mutex and later cfg80211_mutex but to ensure two CPUs will not contend against cfg80211_mutex the right thing to do is to have the reg_regdb_search() wait until the cfg80211_mutex is let go. The lockdep report is pasted below. cfg80211: Calling CRDA to update world regulatory domain ====================================================== [ INFO: possible circular locking dependency detected ] 3.3.8 #3 Tainted: G O ------------------------------------------------------- kworker/0:1/235 is trying to acquire lock: (cfg80211_mutex){+.+...}, at: [<816468a4>] set_regdom+0x78c/0x808 [cfg80211] but task is already holding lock: (reg_regdb_search_mutex){+.+...}, at: [<81646828>] set_regdom+0x710/0x808 [cfg80211] which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #2 (reg_regdb_search_mutex){+.+...}: [<800a8384>] lock_acquire+0x60/0x88 [<802950a8>] mutex_lock_nested+0x54/0x31c [<81645778>] is_world_regdom+0x9f8/0xc74 [cfg80211] -> #1 (reg_mutex#2){+.+...}: [<800a8384>] lock_acquire+0x60/0x88 [<802950a8>] mutex_lock_nested+0x54/0x31c [<8164539c>] is_world_regdom+0x61c/0xc74 [cfg80211] -> #0 (cfg80211_mutex){+.+...}: [<800a77b8>] __lock_acquire+0x10d4/0x17bc [<800a8384>] lock_acquire+0x60/0x88 [<802950a8>] mutex_lock_nested+0x54/0x31c [<816468a4>] set_regdom+0x78c/0x808 [cfg80211] other info that might help us debug this: Chain exists of: cfg80211_mutex --> reg_mutex#2 --> reg_regdb_search_mutex Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(reg_regdb_search_mutex); lock(reg_mutex#2); lock(reg_regdb_search_mutex); lock(cfg80211_mutex); *** DEADLOCK *** 3 locks held by kworker/0:1/235: #0: (events){.+.+..}, at: [<80089a00>] process_one_work+0x230/0x460 #1: (reg_regdb_work){+.+...}, at: [<80089a00>] process_one_work+0x230/0x460 #2: (reg_regdb_search_mutex){+.+...}, at: [<81646828>] set_regdom+0x710/0x808 [cfg80211] stack backtrace: Call Trace: [<80290fd4>] dump_stack+0x8/0x34 [<80291bc4>] print_circular_bug+0x2ac/0x2d8 [<800a77b8>] __lock_acquire+0x10d4/0x17bc [<800a8384>] lock_acquire+0x60/0x88 [<802950a8>] mutex_lock_nested+0x54/0x31c [<816468a4>] set_regdom+0x78c/0x808 [cfg80211] Reported-by: Felix Fietkau <[email protected]> Tested-by: Felix Fietkau <[email protected]> Cc: [email protected] Signed-off-by: Luis R. Rodriguez <[email protected]> Reviewed-by: Johannes Berg <[email protected]> Signed-off-by: John W. Linville <[email protected]>

This patch fixes an oops which occurs when unloading the driver, while the network interface is still up. The problem is that first the io mapping is teared own, then the CAN device is unregistered, resulting in accessing the hardware's iomem: [ 172.744232] Unable to handle kernel paging request at virtual address c88b0040 [ 172.752441] pgd = c7be4000 [ 172.755645] [c88b0040] *pgd=87821811, *pte=00000000, *ppte=00000000 [ 172.762207] Internal error: Oops: 807 [#1] PREEMPT ARM [ 172.767517] Modules linked in: ti_hecc(-) can_dev [ 172.772430] CPU: 0 Not tainted (3.5.0alpha-00037-g3554cc0 torvalds#126) [ 172.778961] PC is at ti_hecc_close+0xb0/0x100 [ti_hecc] [ 172.784423] LR is at __dev_close_many+0x90/0xc0 [ 172.789123] pc : [<bf00c768>] lr : [<c033be58>] psr: 60000013 [ 172.789123] sp : c5c1de68 ip : 00040081 fp : 00000000 [ 172.801025] r10: 00000001 r9 : c5c1c000 r8 : 00100100 [ 172.806457] r7 : c5d0a48c r6 : c5d0a400 r5 : 00000000 r4 : c5d0a000 [ 172.813232] r3 : c88b0000 r2 : 00000001 r1 : c5d0a000 r0 : c5d0a000 [ 172.820037] Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment user [ 172.827423] Control: 10c5387d Table: 87be4019 DAC: 00000015 [ 172.833404] Process rmmod (pid: 600, stack limit = 0xc5c1c2f0) [ 172.839447] Stack: (0xc5c1de68 to 0xc5c1e000) [ 172.843994] de60: bf00c6b8 c5c1dec8 c5d0a000 c5d0a000 00200200 c033be58 [ 172.852478] de80: c5c1de44 c5c1dec8 c5c1dec8 c033bf2c c5c1de90 c5c1de90 c5d0a084 c5c1de44 [ 172.860992] dea0: c5c1dec8 c033c098 c061d3dc c5d0a000 00000000 c05edf28 c05edb34 c000d724 [ 172.869476] dec0: 00000000 c033c2f8 c5d0a084 c5d0a084 00000000 c033c370 00000000 c5d0a000 [ 172.877990] dee0: c05edb00 c033c3b8 c5d0a000 bf00d3ac c05edb00 bf00d7c8 bf00d7c8 c02842dc [ 172.886474] df00: c02842c8 c0282f90 c5c1c000 c05edb00 bf00d7c8 c0283668 bf00d7c8 00000000 [ 172.894989] df20: c0611f98 befe2f80 c000d724 c0282d10 bf00d804 00000000 00000013 c0068a8c [ 172.903472] df40: c5c538e8 685f6974 00636365 c61571a8 c5cb9980 c61571a8 c6158a20 c00c9bc4 [ 172.911987] df60: 00000000 00000000 c5cb9980 00000000 c5cb9980 00000000 c7823680 00000006 [ 172.920471] df80: bf00d804 00000880 c5c1df8c 00000000 000d4267 befe2f80 00000001 b6d90068 [ 172.928985] dfa0: 00000081 c000d5a0 befe2f80 00000001 befe2f80 00000880 b6d90008 00000008 [ 172.937469] dfc0: befe2f80 00000001 b6d90068 00000081 00000001 00000000 befe2eac 00000000 [ 172.945983] dfe0: 00000000 befe2b18 00023ba4 b6e6addc 60000010 befe2f80 a8e00190 86d2d344 [ 172.954498] [<bf00c768>] (ti_hecc_close+0xb0/0x100 [ti_hecc]) from [<c033be58>] (__dev__registered_many+0xc0/0x2a0) [ 172.984161] [<c033c098>] (rollback_registered_many+0xc0/0x2a0) from [<c033c2f8>] (rollback_registered+0x20/0x30) [ 172.994750] [<c033c2f8>] (rollback_registered+0x20/0x30) from [<c033c370>] (unregister_netdevice_queue+0x68/0x98) [ 173.005401] [<c033c370>] (unregister_netdevice_queue+0x68/0x98) from [<c033c3b8>] (unregister_netdev+0x18/0x20) [ 173.015899] [<c033c3b8>] (unregister_netdev+0x18/0x20) from [<bf00d3ac>] (ti_hecc_remove+0x60/0x80 [ti_hecc]) [ 173.026245] [<bf00d3ac>] (ti_hecc_remove+0x60/0x80 [ti_hecc]) from [<c02842dc>] (platform_drv_remove+0x14/0x18) [ 173.036712] [<c02842dc>] (platform_drv_remove+0x14/0x18) from [<c0282f90>] (__device_release_driver+0x7c/0xbc) Cc: stable <[email protected]> Cc: Anant Gole <[email protected]> Tested-by: Jan Luebbe <[email protected]> Signed-off-by: Marc Kleine-Budde <[email protected]>

The hypervisor is in charge of allocating the proper "NUMA" memory and dealing with the CPU scheduler to keep them bound to the proper NUMA node. The PV guests (and PVHVM) have no inkling of where they run and do not need to know that right now. In the future we will need to inject NUMA configuration data (if a guest spans two or more NUMA nodes) so that the kernel can make the right choices. But those patches are not yet present. In the meantime, disable the NUMA capability in the PV guest, which also fixes a bootup issue. Andre says: "we see Dom0 crashes due to the kernel detecting the NUMA topology not by ACPI, but directly from the northbridge (CONFIG_AMD_NUMA). This will detect the actual NUMA config of the physical machine, but will crash about the mismatch with Dom0's virtual memory. Variation of the theme: Dom0 sees what it's not supposed to see. This happens with the said config option enabled and on a machine where this scanning is still enabled (K8 and Fam10h, not Bulldozer class) We have this dump then: NUMA: Warning: node ids are out of bound, from=-1 to=-1 distance=10 Scanning NUMA topology in Northbridge 24 Number of physical nodes 4 Node 0 MemBase 0000000000000000 Limit 0000000040000000 Node 1 MemBase 0000000040000000 Limit 0000000138000000 Node 2 MemBase 0000000138000000 Limit 00000001f8000000 Node 3 MemBase 00000001f8000000 Limit 0000000238000000 Initmem setup node 0 0000000000000000-0000000040000000 NODE_DATA [000000003ffd9000 - 000000003fffffff] Initmem setup node 1 0000000040000000-0000000138000000 NODE_DATA [0000000137fd9000 - 0000000137ffffff] Initmem setup node 2 0000000138000000-00000001f8000000 NODE_DATA [00000001f095e000 - 00000001f0984fff] Initmem setup node 3 00000001f8000000-0000000238000000 Cannot find 159744 bytes in node 3 BUG: unable to handle kernel NULL pointer dereference at (null) IP: [<ffffffff81d220e6>] __alloc_bootmem_node+0x43/0x96 Pid: 0, comm: swapper Not tainted 3.3.6 #1 AMD Dinar/Dinar RIP: e030:[<ffffffff81d220e6>] [<ffffffff81d220e6>] __alloc_bootmem_node+0x43/0x96 .. snip.. [<ffffffff81d23024>] sparse_early_usemaps_alloc_node+0x64/0x178 [<ffffffff81d23348>] sparse_init+0xe4/0x25a [<ffffffff81d16840>] paging_init+0x13/0x22 [<ffffffff81d07fbb>] setup_arch+0x9c6/0xa9b [<ffffffff81683954>] ? printk+0x3c/0x3e [<ffffffff81d01a38>] start_kernel+0xe5/0x468 [<ffffffff81d012cf>] x86_64_start_reservations+0xba/0xc1 [<ffffffff81007153>] ? xen_setup_runstate_info+0x2c/0x36 [<ffffffff81d050ee>] xen_start_kernel+0x565/0x56c " so we just disable NUMA scanning by setting numa_off=1. CC: [email protected] Reported-and-Tested-by: Andre Przywara <[email protected]> Acked-by: Andre Przywara <[email protected]> Signed-off-by: Konrad Rzeszutek Wilk <[email protected]>

Fixes the following NULL pointer dereference: [ 7.740000] ohci_hcd: USB 1.1 'Open' Host Controller (OHCI) Driver [ 7.810000] Unable to handle kernel NULL pointer dereference at virtual address 00000028 [ 7.810000] pgd = c3a38000 [ 7.810000] [00000028] *pgd=23a8c831, *pte=00000000, *ppte=00000000 [ 7.810000] Internal error: Oops: 17 [#1] PREEMPT ARM [ 7.810000] Modules linked in: ohci_hcd(+) regmap_i2c snd_pcm usbcore snd_page_alloc at91_cf snd_timer pcmcia_rsrc snd soundcore gpio_keys regmap_spi pcmcia_core usb_common nls_base [ 7.810000] CPU: 0 Not tainted (3.6.0-rc6-mpa+ torvalds#264) [ 7.810000] PC is at __gpio_to_irq+0x18/0x40 [ 7.810000] LR is at ohci_hcd_at91_overcurrent_irq+0x24/0xb4 [ohci_hcd] [ 7.810000] pc : [<c01392d4>] lr : [<bf08f694>] psr: 40000093 [ 7.810000] sp : c3a11c40 ip : c3a11c50 fp : c3a11c4c [ 7.810000] r10: 00000000 r9 : c02dcd6e r8 : fefff400 [ 7.810000] r7 : 00000000 r6 : c02cc928 r5 : 00000030 r4 : c02dd168 [ 7.810000] r3 : c02e7350 r2 : ffffffea r1 : c02cc928 r0 : 00000000 [ 7.810000] Flags: nZcv IRQs off FIQs on Mode SVC_32 ISA ARM Segment user [ 7.810000] Control: c000717f Table: 23a38000 DAC: 00000015 [ 7.810000] Process modprobe (pid: 285, stack limit = 0xc3a10270) [ 7.810000] Stack: (0xc3a11c40 to 0xc3a12000) [ 7.810000] 1c40: c3a11c6c c3a11c50 bf08f694 c01392cc c3a11c84 c2c38b00 c3806900 00000030 [ 7.810000] 1c60: c3a11ca4 c3a11c70 c0051264 bf08f680 c3a11cac c3a11c80 c003e764 c3806900 [ 7.810000] 1c80: c2c38b00 c02cb05c c02cb000 fefff400 c3806930 c3a11cf4 c3a11cbc c3a11ca8 [ 7.810000] 1ca0: c005142c c005123c c3806900 c3805a00 c3a11cd4 c3a11cc0 c0053f24 c00513e4 [ 7.810000] 1cc0: c3a11cf4 00000030 c3a11cec c3a11cd8 c005120c c0053e88 00000000 00000000 [ 7.810000] 1ce0: c3a11d1c c3a11cf0 c00124d0 c00511e0 01400000 00000001 00000012 00000000 [ 7.810000] 1d00: ffffffff c3a11d94 00000030 00000000 c3a11d34 c3a11d20 c005120c c0012438 [ 7.810000] 1d20: c001dac4 00000012 c3a11d4c c3a11d38 c0009b08 c00511e0 c00523fc 60000013 [ 7.810000] 1d40: c3a11d5c c3a11d50 c0008510 c0009ab4 c3a11ddc c3a11d60 c0008eb4 c00084f [ 7.810000] 1d60: 00000000 00000030 00000000 00000080 60000013 bf08f670 c3806900 c2c38b00 [ 7.810000] 1d80: 00000030 c3806930 00000000 c3a11ddc c3a11d88 c3a11da8 c0054190 c00523fc [ 7.810000] 1da0: 60000013 ffffffff c3a11dec c3a11db8 00000000 c2c38b00 bf08f670 c3806900 [ 7.810000] 1dc0: 00000000 00000080 c02cc928 00000030 c3a11e0c c3a11de0 c0052764 c00520d8 [ 7.810000] 1de0: c3a11dfc 00000000 00000000 00000002 bf090f61 00000004 c02cc930 c02cc928 [ 7.810000] 1e00: c3a11e4c c3a11e10 bf090978 c005269c bf090f61 c02cc928 bf093000 c02dd170 [ 7.810000] 1e20: c3a11e3c c02cc930 c02cc930 bf0911d0 bf0911d0 bf093000 c3a10000 00000000 [ 7.810000] 1e40: c3a11e5c c3a11e50 c0155b7c bf090808 c3a11e7c c3a11e60 c0154690 c0155b6c [ 7.810000] 1e60: c02cc930 c02cc964 bf0911d0 c3a11ea0 c3a11e9c c3a11e80 c015484c c01545e8 [ 7.810000] 1e80: 00000000 00000000 c01547e4 bf0911d0 c3a11ec4 c3a11ea0 c0152e58 c01547f4 [ 7.810000] 1ea0: c381b88c c384ab10 c2c10540 bf0911d0 00000000 c02d7518 c3a11ed4 c3a11ec8 [ 7.810000] 1ec0: c01544c0 c0152e0c c3a11efc c3a11ed8 c01536cc c01544b0 bf091075 c3a11ee8 [ 7.810000] 1ee0: bf049af0 bf09120c bf0911d0 00000000 c3a11f1c c3a11f00 c0154e9c c0153628 [ 7.810000] 1f00: bf049af0 bf09120c 000ae190 00000000 c3a11f2c c3a11f20 c0155f58 c0154e04 [ 7.810000] 1f20: c3a11f44 c3a11f30 bf093054 c0155f1c 00000000 00006a4f c3a11f7c c3a11f48 [ 7.810000] 1f40: c0008638 bf093010 bf09120c 000ae190 00000000 c00093c4 00006a4f bf09120c [ 7.810000] 1f60: 000ae190 00000000 c00093c4 00000000 c3a11fa4 c3a11f80 c004fdc4 c000859c [ 7.810000] 1f80: c3a11fa4 000ae190 00006a4f 00016eb8 000ad018 00000080 00000000 c3a11fa8 [ 7.810000] 1fa0: c0009260 c004fd58 00006a4f 00016eb8 000ae190 00006a4f 000ae100 00000000 [ 7.810000] 1fc0: 00006a4f 00016eb8 000ad018 00000080 000adba0 000ad208 00000000 000ad3d [ 7.810000] 1fe0: beaf7ae8 beaf7ad8 000172b8 b6e4e940 20000010 000ae190 00000000 00000000 [ 7.810000] Backtrace: [ 7.810000] [<c01392bc>] (__gpio_to_irq+0x0/0x40) from [<bf08f694>] (ohci_hcd_at91_overcurrent_irq+0x24/0xb4 [ohci_hcd]) [ 7.810000] [<bf08f670>] (ohci_hcd_at91_overcurrent_irq+0x0/0xb4 [ohci_hcd]) from [<c0051264>] (handle_irq_event_percpu+0x38/0x1a8) [ 7.810000] r6:00000030 r5:c3806900 r4:c2c38b00 [ 7.810000] [<c005122c>] (handle_irq_event_percpu+0x0/0x1a8) from [<c005142c>] (handle_irq_event+0x58/0x7c) [ 7.810000] [<c00513d4>] (handle_irq_event+0x0/0x7c) from [<c0053f24>] (handle_simple_irq+0xac/0xd8) [ 7.810000] r5:c3805a00 r4:c3806900 [ 7.810000] [<c0053e78>] (handle_simple_irq+0x0/0xd8) from [<c005120c>] (generic_handle_irq+0x3c/0x48) [ 7.810000] r4:00000030 [ 7.810000] [<c00511d0>] (generic_handle_irq+0x0/0x48) from [<c00124d0>] (gpio_irq_handler+0xa8/0xfc) [ 7.810000] r4:00000000 [ 7.810000] [<c0012428>] (gpio_irq_handler+0x0/0xfc) from [<c005120c>] (generic_handle_irq+0x3c/0x48) [ 7.810000] [<c00511d0>] (generic_handle_irq+0x0/0x48) from [<c0009b08>] (handle_IRQ+0x64/0x88) [ 7.810000] r4:00000012 [ 7.810000] [<c0009aa4>] (handle_IRQ+0x0/0x88) from [<c0008510>] (at91_aic_handle_irq+0x30/0x38) [ 7.810000] r5:60000013 r4:c00523fc [ 7.810000] [<c00084e0>] (at91_aic_handle_irq+0x0/0x38) from [<c0008eb4>] (__irq_svc+0x34/0x60) [ 7.810000] Exception stack(0xc3a11d60 to 0xc3a11da8) [ 7.810000] 1d60: 00000000 00000030 00000000 00000080 60000013 bf08f670 c3806900 c2c38b00 [ 7.810000] 1d80: 00000030 c3806930 00000000 c3a11ddc c3a11d88 c3a11da8 c0054190 c00523fc [ 7.810000] 1da0: 60000013 ffffffff [ 7.810000] [<c00520c8>] (__setup_irq+0x0/0x458) from [<c0052764>] (request_threaded_irq+0xd8/0x134) [ 7.810000] [<c005268c>] (request_threaded_irq+0x0/0x134) from [<bf090978>] (ohci_hcd_at91_drv_probe+0x180/0x41c [ohci_hcd]) [ 7.810000] [<bf0907f8>] (ohci_hcd_at91_drv_probe+0x0/0x41c [ohci_hcd]) from [<c0155b7c>] (platform_drv_probe+0x20/0x24) [ 7.810000] [<c0155b5c>] (platform_drv_probe+0x0/0x24) from [<c0154690>] (driver_probe_device+0xb8/0x20c) [ 7.810000] [<c01545d8>] (driver_probe_device+0x0/0x20c) from [<c015484c>] (__driver_attach+0x68/0x88) [ 7.810000] r7:c3a11ea0 r6:bf0911d0 r5:c02cc964 r4:c02cc930 [ 7.810000] [<c01547e4>] (__driver_attach+0x0/0x88) from [<c0152e58>] (bus_for_each_dev+0x5c/0x9c) [ 7.810000] r6:bf0911d0 r5:c01547e4 r4:00000000 [ 7.810000] [<c0152dfc>] (bus_for_each_dev+0x0/0x9c) from [<c01544c0>] (driver_attach+0x20/0x28) [ 7.810000] r7:c02d7518 r6:00000000 r5:bf0911d0 r4:c2c10540 [ 7.810000] [<c01544a0>] (driver_attach+0x0/0x28) from [<c01536cc>] (bus_add_driver+0xb4/0x22c) [ 7.810000] [<c0153618>] (bus_add_driver+0x0/0x22c) from [<c0154e9c>] (driver_register+0xa8/0x144) [ 7.810000] r7:00000000 r6:bf0911d0 r5:bf09120c r4:bf049af0 [ 7.810000] [<c0154df4>] (driver_register+0x0/0x144) from [<c0155f58>] (platform_driver_register+0x4c/0x60) [ 7.810000] r7:00000000 r6:000ae190 r5:bf09120c r4:bf049af0 [ 7.810000] [<c0155f0c>] (platform_driver_register+0x0/0x60) from [<bf093054>] (ohci_hcd_mod_init+0x54/0x8c [ohci_hcd]) [ 7.810000] [<bf093000>] (ohci_hcd_mod_init+0x0/0x8c [ohci_hcd]) from [<c0008638>] (do_one_initcall+0xac/0x174) [ 7.810000] r4:00006a4f [ 7.810000] [<c000858c>] (do_one_initcall+0x0/0x174) from [<c004fdc4>] (sys_init_module+0x7c/0x1a0) [ 7.810000] [<c004fd48>] (sys_init_module+0x0/0x1a0) from [<c0009260>] (ret_fast_syscall+0x0/0x2c) [ 7.810000] r7:00000080 r6:000ad018 r5:00016eb8 r4:00006a4f [ 7.810000] Code: e24cb004 e59f3028 e1a02000 e7930180 (e5903028) [ 7.810000] ---[ end trace 85aa37ed128143b5 ]--- [ 7.810000] Kernel panic - not syncing: Fatal exception in interrupt Commit 6fffb77 (USB: ohci-at91: fix PIO handling in relation with number of ports) started setting unused pins to EINVAL. But this exposed a bug in the ohci_hcd_at91_overcurrent_irq function where the gpio was used without being checked to see if it is valid. This patches fixed the issue by adding the gpio valid check. Signed-off-by: Joachim Eastwood <[email protected]> Cc: stable <[email protected]> # [3.4+] whereever 6fffb77 went Signed-off-by: Greg Kroah-Hartman <[email protected]>

In the device close path, 'qlcnic_fw_destroy_ctx' and 'qlcnic_poll_rsp' call msleep. But 'qlcnic_fw_destroy_ctx' and 'qlcnic_poll_rsp' are called with 'adapter->tx_clean_lock' spin lock held resulting in scheduling while atomic bug causing the following trace. I observed that the commit 012dc19 from John Fastabend addresses a similar issue in ixgbevf driver. Adopting the same approach used in the commit, this patch uses mdelay to address the issue. [79884.999115] BUG: scheduling while atomic: ip/30846/0x00000002 [79885.005562] INFO: lockdep is turned off. [79885.009958] Modules linked in: qlcnic fuse nf_conntrack_netbios_ns nf_conntrack_broadcast ipt_MASQUERADE bnep bluetooth rfkill ip6table_mangle ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6_tables iptable_nat nf_nat iptable_mangle ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack iptable_filter ip_tables dcdbas coretemp kvm_intel kvm iTCO_wdt ixgbe iTCO_vendor_support crc32c_intel ghash_clmulni_intel nfsd microcode sb_edac pcspkr edac_core dca bnx2x shpchp auth_rpcgss nfs_acl lpc_ich mfd_core mdio lockd libcrc32c wmi acpi_pad acpi_power_meter sunrpc uinput sd_mod sr_mod cdrom crc_t10dif ahci libahci libata megaraid_sas usb_storage dm_mirror dm_region_hash dm_log dm_mod [last unloaded: qlcnic] [79885.083608] Pid: 30846, comm: ip Tainted: G W O 3.6.0-rc7+ #1 [79885.090805] Call Trace: [79885.093569] [<ffffffff816764d8>] __schedule_bug+0x68/0x76 [79885.099699] [<ffffffff8168358e>] __schedule+0x99e/0xa00 [79885.105634] [<ffffffff81683929>] schedule+0x29/0x70 [79885.111186] [<ffffffff81680def>] schedule_timeout+0x16f/0x350 [79885.117724] [<ffffffff811afb7a>] ? init_object+0x4a/0x90 [79885.123770] [<ffffffff8107c190>] ? __internal_add_timer+0x140/0x140 [79885.130873] [<ffffffff81680fee>] schedule_timeout_uninterruptible+0x1e/0x20 [79885.138773] [<ffffffff8107e830>] msleep+0x20/0x30 [79885.144159] [<ffffffffa04c7fbf>] qlcnic_issue_cmd+0xef/0x290 [qlcnic] [79885.151478] [<ffffffffa04c8265>] qlcnic_fw_cmd_destroy_rx_ctx+0x55/0x90 [qlcnic] [79885.159868] [<ffffffffa04c92fd>] qlcnic_fw_destroy_ctx+0x2d/0xa0 [qlcnic] [79885.167576] [<ffffffffa04bf2ed>] __qlcnic_down+0x11d/0x180 [qlcnic] [79885.174708] [<ffffffffa04bf6f8>] qlcnic_close+0x18/0x20 [qlcnic] [79885.181547] [<ffffffff8153b4c5>] __dev_close_many+0x95/0xe0 [79885.187899] [<ffffffff8153b548>] __dev_close+0x38/0x50 [79885.193761] [<ffffffff81545101>] __dev_change_flags+0xa1/0x180 [79885.200419] [<ffffffff81545298>] dev_change_flags+0x28/0x70 [79885.206779] [<ffffffff815531b8>] do_setlink+0x378/0xa00 [79885.212731] [<ffffffff81354fe1>] ? nla_parse+0x31/0xe0 [79885.218612] [<ffffffff815558ee>] rtnl_newlink+0x37e/0x560 [79885.224768] [<ffffffff812cfa19>] ? selinux_capable+0x39/0x50 [79885.231217] [<ffffffff812cbf98>] ? security_capable+0x18/0x20 [79885.237765] [<ffffffff81555114>] rtnetlink_rcv_msg+0x114/0x2f0 [79885.244412] [<ffffffff81551f87>] ? rtnl_lock+0x17/0x20 [79885.250280] [<ffffffff81551f87>] ? rtnl_lock+0x17/0x20 [79885.256148] [<ffffffff81555000>] ? __rtnl_unlock+0x20/0x20 [79885.262413] [<ffffffff81570fc1>] netlink_rcv_skb+0xa1/0xb0 [79885.268661] [<ffffffff81551fb5>] rtnetlink_rcv+0x25/0x40 [79885.274727] [<ffffffff815708bd>] netlink_unicast+0x19d/0x220 [79885.281146] [<ffffffff81570c45>] netlink_sendmsg+0x305/0x3f0 [79885.287595] [<ffffffff8152b188>] ? sock_update_classid+0x148/0x2e0 [79885.294650] [<ffffffff81525c2c>] sock_sendmsg+0xbc/0xf0 [79885.300600] [<ffffffff8152600c>] __sys_sendmsg+0x3ac/0x3c0 [79885.306853] [<ffffffff8109be23>] ? up_read+0x23/0x40 [79885.312510] [<ffffffff816896cc>] ? do_page_fault+0x2bc/0x570 [79885.318968] [<ffffffff81191854>] ? sys_brk+0x44/0x150 [79885.324715] [<ffffffff811c458c>] ? fget_light+0x24c/0x520 [79885.330875] [<ffffffff815286f9>] sys_sendmsg+0x49/0x90 [79885.336707] [<ffffffff8168e429>] system_call_fastpath+0x16/0x1b Signed-off-by: Narendra K <[email protected]> Signed-off-by: David S. Miller <[email protected]>

When the tty_printk driver fails to create a node in sysfs, the system crashes. It is because the driver registers a tty driver and frees it without deregistering it first. The fix is easy: add a call to tty_unregister_driver to the fail path. This is very unlikely to happen in usual environment => no need for stable. The crash occurs at some place where we iterate over tty drivers first. It may look like this: BUG: unable to handle kernel paging request at ffffffffffffff84 IP: [<ffffffff81278d56>] tty_open+0xd6/0x650 PGD 1a0d067 PUD 1a0e067 PMD 0 Oops: 0000 [#1] PREEMPT SMP Modules linked in: CPU 0 Pid: 1183, comm: boot.localnet Tainted: G W 3.5.0-rc7-next-20120716+ torvalds#369 Bochs Bochs RIP: 0010:[<ffffffff81278d56>] [<ffffffff81278d56>] tty_open+0xd6/0x650 RSP: 0018:ffff8800162b3b98 EFLAGS: 00010207 RAX: 0000000000000000 RBX: ffff880016ba6200 RCX: 0000000000002208 RDX: 0000000000000000 RSI: 00000000000000d0 RDI: ffffffff81a35080 RBP: ffff8800162b3c08 R08: ffffffff81276f42 R09: 0000000000400040 R10: ffff8800161dc005 R11: ffff8800188ee048 R12: 0000000000000000 R13: ffffffffffffff58 R14: 0000000000400040 R15: 0000000000008000 FS: 00007f3684abd700(0000) GS:ffff880018e00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffffffffffff84 CR3: 000000001503e000 CR4: 00000000000006f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process boot.localnet (pid: 1183, threadinfo ffff8800162b2000, task ffff8800188c5880) Stack: ffff8800162b3c08 ffffffff81363d63 ffffffff81a62940 ffff8800189b4e88 ffff8800188c5880 ffffffff81123180 0000000000000000 ffffffff18b20600 0000000000000000 ffff8800189b4e88 ffff880016ba6200 ffff880018b20600 Call Trace: [<ffffffff81363d63>] ? kobj_lookup+0x103/0x160 [<ffffffff81123180>] ? mount_fs+0x110/0x110 [<ffffffff81123a9c>] chrdev_open+0x9c/0x1a0 [<ffffffff81123a00>] ? cdev_put+0x30/0x30 [<ffffffff8111de76>] do_dentry_open.isra.19+0x1e6/0x270 [<ffffffff8111df65>] finish_open+0x65/0xa0 [<ffffffff8112dc9e>] do_last.isra.52+0x26e/0xd80 [<ffffffff8112b163>] ? inode_permission+0x13/0x50 [<ffffffff8112b203>] ? link_path_walk+0x63/0x940 [<ffffffff8112e85b>] path_openat+0xab/0x3d0 [<ffffffff8112ef5d>] do_filp_open+0x3d/0xa0 [<ffffffff8113ba72>] ? alloc_fd+0xd2/0x120 [<ffffffff8111eee3>] do_sys_open+0xf3/0x1d0 [<ffffffff8111efdc>] sys_open+0x1c/0x20 [<ffffffff815b5fe2>] system_call_fastpath+0x16/0x1b Signed-off-by: Jiri Slaby <[email protected]> Cc: Samo Pogacnik <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>

Change since v1: * Fixed inuse counters access spotted by Eric In patch eea68e2 (packet: Report socket mclist info via diag module) I've introduced a "scheduling in atomic" problem in packet diag module -- the socket list is traversed under rcu_read_lock() while performed under it sk mclist access requires rtnl lock (i.e. -- mutex) to be taken. [152363.820563] BUG: scheduling while atomic: crtools/12517/0x10000002 [152363.820573] 4 locks held by crtools/12517: [152363.820581] #0: (sock_diag_mutex){+.+.+.}, at: [<ffffffff81a2dcb5>] sock_diag_rcv+0x1f/0x3e [152363.820613] #1: (sock_diag_table_mutex){+.+.+.}, at: [<ffffffff81a2de70>] sock_diag_rcv_msg+0xdb/0x11a [152363.820644] #2: (nlk->cb_mutex){+.+.+.}, at: [<ffffffff81a67d01>] netlink_dump+0x23/0x1ab [152363.820693] #3: (rcu_read_lock){.+.+..}, at: [<ffffffff81b6a049>] packet_diag_dump+0x0/0x1af Similar thing was then re-introduced by further packet diag patches (fanount mutex and pgvec mutex for rings) :( Apart from being terribly sorry for the above, I propose to change the packet sk list protection from spinlock to mutex. This lock currently protects two modifications: * sklist * prot inuse counters The sklist modifications can be just reprotected with mutex since they already occur in a sleeping context. The inuse counters modifications are trickier -- the __this_cpu_-s are used inside, thus requiring the caller to handle the potential issues with contexts himself. Since packet sockets' counters are modified in two places only (packet_create and packet_release) we only need to protect the context from being preempted. BH disabling is not required in this case. Signed-off-by: Pavel Emelyanov <[email protected]> Acked-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>

The lockdep hunter mentions a non consistent usage of spin_lock and spin_lock_irqsafe in the composite_disconnect and usb_function_activate function: [ 15.700897] ================================= [ 15.705255] [ INFO: inconsistent lock state ] [ 15.709617] 3.5.0-rc5+ torvalds#413 Not tainted [ 15.713453] --------------------------------- [ 15.717812] inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage. [ 15.723822] uvc-gadget/116 [HC1[1]:SC0[0]:HE0:SE1] takes: [ 15.729222] (&(&cdev->lock)->rlock){?.+...}, at: [<7f0049e8>] composite_disconnect+0x2c/0x74 [g_webcam] [ 15.738797] {HARDIRQ-ON-W} state was registered at: [ 15.743677] [<8006de3c>] mark_lock+0x148/0x688 [ 15.748325] [<8006ecb0>] __lock_acquire+0x934/0x1b74 [ 15.753481] [<8007047c>] lock_acquire+0x98/0x138 [ 15.758288] [<804c776c>] _raw_spin_lock+0x4c/0x84 [ 15.763188] [<7f006ae4>] usb_function_activate+0x28/0x94 [g_webcam] [ 15.769652] [<7f00820c>] usb_ep_autoconfig_reset+0x78/0x98 [g_webcam] [ 15.776287] [<7f0082a4>] uvc_v4l2_open+0x78/0x94 [g_webcam] [ 15.782054] [<80366a38>] v4l2_open+0x104/0x130 [ 15.786697] [<800efd30>] chrdev_open+0xa0/0x170 [ 15.791423] [<800e9718>] do_dentry_open.isra.13+0x1e8/0x264 [ 15.797186] [<800ea5d4>] nameidata_to_filp+0x58/0x94 [ 15.802340] [<800fa29c>] do_last.isra.31+0x2a0/0x808 [ 15.807497] [<800faa40>] path_openat+0xc8/0x3e8 [ 15.812216] [<800fae90>] do_filp_open+0x3c/0x90 [ 15.816936] [<800ea6fc>] do_sys_open+0xec/0x184 [ 15.821655] [<800ea7c4>] sys_open+0x30/0x34 [ 15.826027] [<8000e5c0>] ret_fast_syscall+0x0/0x48 [ 15.831015] irq event stamp: 6048 [ 15.834330] hardirqs last enabled at (6047): [<804c81b8>] _raw_spin_unlock_irqrestore+0x40/0x54 [ 15.843132] hardirqs last disabled at (6048): [<8000e174>] __irq_svc+0x34/0x60 [ 15.850370] softirqs last enabled at (5940): [<80028380>] __do_softirq+0x188/0x270 [ 15.858043] softirqs last disabled at (5935): [<80028944>] irq_exit+0xa0/0xa8 [ 15.865195] [ 15.865195] other info that might help us debug this: [ 15.871724] Possible unsafe locking scenario: [ 15.871724] [ 15.877645] CPU0 [ 15.880091] ---- [ 15.882537] lock(&(&cdev->lock)->rlock); [ 15.886659] <Interrupt> [ 15.889278] lock(&(&cdev->lock)->rlock); [ 15.893573] [ 15.893573] *** DEADLOCK *** [ 15.893573] [ 15.899496] no locks held by uvc-gadget/116. [ 15.903765] [ 15.903765] stack backtrace: [ 15.908125] Backtrace: [ 15.910604] [<80012038>] (dump_backtrace+0x0/0x114) from [<804bf8a4>] (dump_stack+0x20/0x24) [ 15.919043] r6:dfb8e6f0 r5:dfb8e400 r4:809717ec r3:60000193 [ 15.924766] [<804bf884>] (dump_stack+0x0/0x24) from [<804c0c0c>] (print_usage_bug+0x258/0x2c0) [ 15.933388] [<804c09b4>] (print_usage_bug+0x0/0x2c0) from [<8006e240>] (mark_lock+0x54c/0x688) [ 15.942006] [<8006dcf4>] (mark_lock+0x0/0x688) from [<8006edb8>] (__lock_acquire+0xa3c/0x1b74) [ 15.950625] [<8006e37c>] (__lock_acquire+0x0/0x1b74) from [<8007047c>] (lock_acquire+0x98/0x138) [ 15.959418] [<800703e4>] (lock_acquire+0x0/0x138) from [<804c78fc>] (_raw_spin_lock_irqsave+0x58/0x94) [ 15.968736] [<804c78a4>] (_raw_spin_lock_irqsave+0x0/0x94) from [<7f0049e8>] (composite_disconnect+0x2c/0x74 [g_webcam]) [ 15.979605] r7:00000012 r6:df82b0c4 r5:ded755bc r4:ded75580 [ 15.985331] [<7f0049bc>] (composite_disconnect+0x0/0x74 [g_webcam]) from [<8033c170>] (_gadget_stop_activity+0xc4/0x120) [ 15.996200] r6:df82b0c4 r5:df82b0c8 r4:df82b0d0 r3:7f0049bc [ 16.001919] [<8033c0ac>] (_gadget_stop_activity+0x0/0x120) from [<8033e390>] (udc_irq+0x724/0xcb8) [ 16.010877] r6:df82b010 r5:00000000 r4:df82b010 r3:00000000 [ 16.016595] [<8033dc6c>] (udc_irq+0x0/0xcb8) from [<8033baec>] (ci_irq+0x64/0xdc) [ 16.024086] [<8033ba88>] (ci_irq+0x0/0xdc) from [<80086538>] (handle_irq_event_percpu+0x74/0x298) [ 16.032958] r5:807fd414 r4:df38fdc0 [ 16.036566] [<800864c4>] (handle_irq_event_percpu+0x0/0x298) from [<800867a8>] (handle_irq_event+0x4c/0x6c) [ 16.046315] [<8008675c>] (handle_irq_event+0x0/0x6c) from [<80089318>] (handle_level_irq+0xbc/0x11c) [ 16.055447] r6:def04000 r5:807fd414 r4:807fd3c0 r3:00020000 [ 16.061166] [<8008925c>] (handle_level_irq+0x0/0x11c) from [<80085cc8>] (generic_handle_irq+0x38/0x4c) [ 16.070472] r5:807f7f64 r4:8081e9f8 [ 16.074082] [<80085c90>] (generic_handle_irq+0x0/0x4c) from [<8000ef98>] (handle_IRQ+0x5c/0xbc) [ 16.082788] [<8000ef3c>] (handle_IRQ+0x0/0xbc) from [<800085cc>] (tzic_handle_irq+0x6c/0x9c) [ 16.091225] r8:00000000 r7:def059b0 r6:00000001 r5:00000000 r4:00000000 r3:00000012 [ 16.099141] [<80008560>] (tzic_handle_irq+0x0/0x9c) from [<8000e184>] (__irq_svc+0x44/0x60) [ 16.107494] Exception stack(0xdef059b0 to 0xdef059f8) [ 16.112550] 59a0: 00000001 00000001 00000000 dfb8e400 [ 16.120732] 59c0: 40000013 81a2e500 00000000 81a2e500 00000000 00000000 80862418 def05a0c [ 16.128912] 59e0: def059c8 def059f8 80070e24 804c81bc 20000013 ffffffff [ 16.135542] [<804c8178>] (_raw_spin_unlock_irqrestore+0x0/0x54) from [<8003d0ec>] (__queue_work+0x108/0x470) [ 16.145369] r5:dfb1a30c r4:81b93c00 [ 16.148978] [<8003cfe4>] (__queue_work+0x0/0x470) from [<8003d4e0>] (queue_work_on+0x4c/0x54) [ 16.157511] [<8003d494>] (queue_work_on+0x0/0x54) from [<8003d544>] (queue_work+0x30/0x34) [ 16.165774] r6:df2e6900 r5:80e0c2f8 r4:dfb1a2c8 r3:def04000 [ 16.171495] [<8003d514>] (queue_work+0x0/0x34) from [<80493284>] (rpc_make_runnable+0x9c/0xac) [ 16.180113] [<804931e8>] (rpc_make_runnable+0x0/0xac) from [<80493c88>] (rpc_execute+0x40/0xa8) [ 16.188811] r5:def05ad4 r4:dfb1a2c8 [ 16.192426] [<80493c48>] (rpc_execute+0x0/0xa8) from [<8048c734>] (rpc_run_task+0xa8/0xb4) [ 16.200690] r8:00000001 r7:df74f520 r6:ded75700 r5:def05ad4 r4:dfb1a2c8 r3:00000002 [ 16.208618] [<8048c68c>] (rpc_run_task+0x0/0xb4) from [<801f1608>] (nfs_initiate_read+0xb4/0xd4) [ 16.217403] r5:df3e86c0 r4:00000000 [ 16.221015] [<801f1554>] (nfs_initiate_read+0x0/0xd4) from [<801f1c64>] (nfs_generic_pg_readpages+0x9c/0x114) [ 16.230937] [<801f1bc8>] (nfs_generic_pg_readpages+0x0/0x114) from [<801f0744>] (__nfs_pageio_add_request+0xe8/0x214) [ 16.241545] r8:000bf000 r7:00000000 r6:00000000 r5:deef4640 r4:def05c1c r3:801f1bc8 [ 16.249463] [<801f065c>] (__nfs_pageio_add_request+0x0/0x214) from [<801f0e3c>] (nfs_pageio_add_request+0x28/0x54) [ 16.259818] [<801f0e14>] (nfs_pageio_add_request+0x0/0x54) from [<801f1394>] (readpage_async_filler+0x114/0x170) [ 16.269992] r5:def05c58 r4:80fd7300 [ 16.273607] [<801f1280>] (readpage_async_filler+0x0/0x170) from [<800bb418>] (read_cache_pages+0xa0/0x108) [ 16.283259] r8:00200200 r7:00100100 r6:df74f654 r5:def05cd0 r4:80fd7300 [ 16.290034] [<800bb378>] (read_cache_pages+0x0/0x108) from [<801f218c>] (nfs_readpages+0xc4/0x168) [ 16.298999] [<801f20c8>] (nfs_readpages+0x0/0x168) from [<800bb1d0>] (__do_page_cache_readahead+0x254/0x354) [ 16.308833] [<800baf7c>] (__do_page_cache_readahead+0x0/0x354) from [<800bb5d0>] (ra_submit+0x38/0x40) [ 16.318145] [<800bb598>] (ra_submit+0x0/0x40) from [<800bb6b0>] (ondemand_readahead+0xd8/0x3b0) [ 16.326851] [<800bb5d8>] (ondemand_readahead+0x0/0x3b0) from [<800bba20>] (page_cache_async_readahead+0x98/0xa8) [ 16.337043] [<800bb988>] (page_cache_async_readahead+0x0/0xa8) from [<800b2118>] (generic_file_aio_read+0x5b4/0x7c4) [ 16.347565] r6:00000000 r5:df74f654 r4:80fd70a0 [ 16.352231] [<800b1b64>] (generic_file_aio_read+0x0/0x7c4) from [<801e82c0>] (nfs_file_read+0x7c/0xcc) [ 16.361544] [<801e8244>] (nfs_file_read+0x0/0xcc) from [<800eab80>] (do_sync_read+0xb4/0xf4) [ 16.369981] r9:00000000 r8:def05f70 r7:00000000 r6:00000000 r5:dec34900 r4:fffffdee [ 16.377896] [<800eaacc>] (do_sync_read+0x0/0xf4) from [<800eb548>] (vfs_read+0xb4/0x144) [ 16.385987] r8:00000000 r7:def05f70 r6:76a95008 r5:003e3dd6 r4:dec34900 [ 16.392761] [<800eb494>] (vfs_read+0x0/0x144) from [<800eb624>] (sys_read+0x4c/0x78) [ 16.400504] r8:00000000 r7:00000003 r6:003e3dd6 r5:76a95008 r4:dec34900 [ 16.407279] [<800eb5d8>] (sys_read+0x0/0x78) from [<8000e5c0>] (ret_fast_syscall+0x0/0x48) [ 16.415543] r9:def04000 r8:8000e864 r6:000086b4 r5:00000000 r4:00000000 [ 20.872729] gadget: high-speed config #1: Video [ 20.877368] gadget: uvc_function_set_alt(0, 0) [ 20.881908] gadget: uvc_function_set_alt(1, 0) [ 20.891464] gadget: uvc_function_set_alt(1, 0) Signed-off-by: Michael Grzeschik <[email protected]> Signed-off-by: Felipe Balbi <[email protected]>

This moves ARM over to the asm-generic/unaligned.h header. This has the benefit of better code generated especially for ARMv7 on gcc 4.7+ compilers. As Arnd Bergmann, points out: The asm-generic version uses the "struct" version for native-endian unaligned access and the "byteshift" version for the opposite endianess. The current ARM version however uses the "byteshift" implementation for both. Thanks to Nicolas Pitre for the excellent analysis: Test case: int foo (int *x) { return get_unaligned(x); } long long bar (long long *x) { return get_unaligned(x); } With the current ARM version: foo: ldrb r3, [r0, #2] @ zero_extendqisi2 @ MEM[(const u8 *)x_1(D) + 2B], MEM[(const u8 *)x_1(D) + 2B] ldrb r1, [r0, #1] @ zero_extendqisi2 @ MEM[(const u8 *)x_1(D) + 1B], MEM[(const u8 *)x_1(D) + 1B] ldrb r2, [r0, #0] @ zero_extendqisi2 @ MEM[(const u8 *)x_1(D)], MEM[(const u8 *)x_1(D)] mov r3, r3, asl torvalds#16 @ tmp154, MEM[(const u8 *)x_1(D) + 2B], ldrb r0, [r0, #3] @ zero_extendqisi2 @ MEM[(const u8 *)x_1(D) + 3B], MEM[(const u8 *)x_1(D) + 3B] orr r3, r3, r1, asl torvalds#8 @, tmp155, tmp154, MEM[(const u8 *)x_1(D) + 1B], orr r3, r3, r2 @ tmp157, tmp155, MEM[(const u8 *)x_1(D)] orr r0, r3, r0, asl torvalds#24 @,, tmp157, MEM[(const u8 *)x_1(D) + 3B], bx lr @ bar: stmfd sp!, {r4, r5, r6, r7} @, mov r2, #0 @ tmp184, ldrb r5, [r0, torvalds#6] @ zero_extendqisi2 @ MEM[(const u8 *)x_1(D) + 6B], MEM[(const u8 *)x_1(D) + 6B] ldrb r4, [r0, #5] @ zero_extendqisi2 @ MEM[(const u8 *)x_1(D) + 5B], MEM[(const u8 *)x_1(D) + 5B] ldrb ip, [r0, #2] @ zero_extendqisi2 @ MEM[(const u8 *)x_1(D) + 2B], MEM[(const u8 *)x_1(D) + 2B] ldrb r1, [r0, #4] @ zero_extendqisi2 @ MEM[(const u8 *)x_1(D) + 4B], MEM[(const u8 *)x_1(D) + 4B] mov r5, r5, asl torvalds#16 @ tmp175, MEM[(const u8 *)x_1(D) + 6B], ldrb r7, [r0, #1] @ zero_extendqisi2 @ MEM[(const u8 *)x_1(D) + 1B], MEM[(const u8 *)x_1(D) + 1B] orr r5, r5, r4, asl torvalds#8 @, tmp176, tmp175, MEM[(const u8 *)x_1(D) + 5B], ldrb r6, [r0, torvalds#7] @ zero_extendqisi2 @ MEM[(const u8 *)x_1(D) + 7B], MEM[(const u8 *)x_1(D) + 7B] orr r5, r5, r1 @ tmp178, tmp176, MEM[(const u8 *)x_1(D) + 4B] ldrb r4, [r0, #0] @ zero_extendqisi2 @ MEM[(const u8 *)x_1(D)], MEM[(const u8 *)x_1(D)] mov ip, ip, asl torvalds#16 @ tmp188, MEM[(const u8 *)x_1(D) + 2B], ldrb r1, [r0, #3] @ zero_extendqisi2 @ MEM[(const u8 *)x_1(D) + 3B], MEM[(const u8 *)x_1(D) + 3B] orr ip, ip, r7, asl torvalds#8 @, tmp189, tmp188, MEM[(const u8 *)x_1(D) + 1B], orr r3, r5, r6, asl torvalds#24 @,, tmp178, MEM[(const u8 *)x_1(D) + 7B], orr ip, ip, r4 @ tmp191, tmp189, MEM[(const u8 *)x_1(D)] orr ip, ip, r1, asl torvalds#24 @, tmp194, tmp191, MEM[(const u8 *)x_1(D) + 3B], mov r1, r3 @, orr r0, r2, ip @ tmp171, tmp184, tmp194 ldmfd sp!, {r4, r5, r6, r7} bx lr In both cases the code is slightly suboptimal. One may wonder why wasting r2 with the constant 0 in the second case for example. And all the mov's could be folded in subsequent orr's, etc. Now with the asm-generic version: foo: ldr r0, [r0, #0] @ unaligned @,* x bx lr @ bar: mov r3, r0 @ x, x ldr r0, [r0, #0] @ unaligned @,* x ldr r1, [r3, #4] @ unaligned @, bx lr @ This is way better of course, but only because this was compiled for ARMv7. In this case the compiler knows that the hardware can do unaligned word access. This isn't that obvious for foo(), but if we remove the get_unaligned() from bar as follows: long long bar (long long *x) {return *x; } then the resulting code is: bar: ldmia r0, {r0, r1} @ x,, bx lr @ So this proves that the presumed aligned vs unaligned cases does have influence on the instructions the compiler may use and that the above unaligned code results are not just an accident. Still... this isn't fully conclusive without at least looking at the resulting assembly fron a pre ARMv6 compilation. Let's see with an ARMv5 target: foo: ldrb r3, [r0, #0] @ zero_extendqisi2 @ tmp139,* x ldrb r1, [r0, #1] @ zero_extendqisi2 @ tmp140, ldrb r2, [r0, #2] @ zero_extendqisi2 @ tmp143, ldrb r0, [r0, #3] @ zero_extendqisi2 @ tmp146, orr r3, r3, r1, asl torvalds#8 @, tmp142, tmp139, tmp140, orr r3, r3, r2, asl torvalds#16 @, tmp145, tmp142, tmp143, orr r0, r3, r0, asl torvalds#24 @,, tmp145, tmp146, bx lr @ bar: stmfd sp!, {r4, r5, r6, r7} @, ldrb r2, [r0, #0] @ zero_extendqisi2 @ tmp139,* x ldrb r7, [r0, #1] @ zero_extendqisi2 @ tmp140, ldrb r3, [r0, #4] @ zero_extendqisi2 @ tmp149, ldrb r6, [r0, #5] @ zero_extendqisi2 @ tmp150, ldrb r5, [r0, #2] @ zero_extendqisi2 @ tmp143, ldrb r4, [r0, torvalds#6] @ zero_extendqisi2 @ tmp153, ldrb r1, [r0, torvalds#7] @ zero_extendqisi2 @ tmp156, ldrb ip, [r0, #3] @ zero_extendqisi2 @ tmp146, orr r2, r2, r7, asl torvalds#8 @, tmp142, tmp139, tmp140, orr r3, r3, r6, asl torvalds#8 @, tmp152, tmp149, tmp150, orr r2, r2, r5, asl torvalds#16 @, tmp145, tmp142, tmp143, orr r3, r3, r4, asl torvalds#16 @, tmp155, tmp152, tmp153, orr r0, r2, ip, asl torvalds#24 @,, tmp145, tmp146, orr r1, r3, r1, asl torvalds#24 @,, tmp155, tmp156, ldmfd sp!, {r4, r5, r6, r7} bx lr Compared to the initial results, this is really nicely optimized and I couldn't do much better if I were to hand code it myself. Signed-off-by: Rob Herring <[email protected]> Reviewed-by: Nicolas Pitre <[email protected]> Tested-by: Thomas Petazzoni <[email protected]> Reviewed-by: Arnd Bergmann <[email protected]> Signed-off-by: Russell King <[email protected]>

This patch fixes stmmac_pltfr_remove function, which is broken because, it is accessing plat variable via freed memory priv pointer which gets freed by free_netdev called from stmmac_dvr_remove. In short this patch caches the plat pointer in local variable before calling stmmac_dvr_remove to prevent code accessing freed memory. Without this patch any attempt to remove the stmmac device will fail as below: Unregistering eth 0 ... Unable to handle kernel paging request at virtual address 6b6b6bab pgd = de5dc000 [6b6b6bab] *pgd=00000000 Internal error: Oops: 5 [#1] PREEMPT SMP Modules linked in: cdev(O+) CPU: 0 Tainted: G O (3.3.1_stm24_0210-b2000+ torvalds#25) PC is at stmmac_pltfr_remove+0x2c/0xa0 LR is at stmmac_pltfr_remove+0x28/0xa0 pc : [<c01b8908>] lr : [<c01b8904>] psr: 60000013 sp : def6be78 ip : de6c5a00 fp : 00000000 r10: 00000028 r9 : c082d81d r8 : 00000001 r7 : de65a600 r6 : df81b240 r5 : c0413fd8 r4 : 00000000 r3 : 6b6b6b6b r2 : def6be6c r1 : c0355e2b r0 : 00000020 Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment user Control: 10c53c7d Table: 5e5dc04a DAC: 00000015 Process insmod (pid: 738, stack limit = 0xdef6a2f0) Stack: (0xdef6be78 to 0xdef6c000) be60: c0413fe0 c0403658 be80: c0400bb0 c019270c c01926f8 c0191478 00000000 c0414014 c0413fe0 c01914d8 bea0: 00000000 c0413fe0 df8045d0 c019109c c0413fe0 c0400bf0 c0413fd8 c018f04c bec0: 00000000 bf000000 c0413fd8 c01929a0 c0413fd8 bf000000 00000000 c0192bfc bee0: bf00009c bf000014 def6a000 c000859c 00000000 00000001 bf00009c bf00009c bf00: 00000001 bf00009c 00000001 bf0000e4 de65a600 00000001 c082d81d c0058cd0 bf20: bf0000a8 c004fbd8 c0056414 c082d815 c02aea20 bf0001f0 00b0b008 e0846208 bf40: c03ec8a0 e0846000 0000db0d e0850604 e08504de e0853a24 00000204 000002d4 bf60: 00000000 00000000 0000001c 0000001d 00000009 00000000 00000006 00000000 bf80: 00000003 f63d4e2e 0000db0d bef02ed8 00000080 c000d2e8 def6a000 00000000 bfa0: 00000000 c000d140 f63d4e2e 0000db0d 00b0b018 0000db0d 00b0b008 b6f4f298 bfc0: f63d4e2e 0000db0d bef02ed8 00000080 00000003 00000000 00010000 00000000 bfe0: 00b0b008 bef02c64 00008d20 b6ef3784 60000010 00b0b018 5a5a5a5a 5a5a5a5a [<c01b8908>] (stmmac_pltfr_remove+0x2c/0xa0) from [<c019270c>] (platform_drv_remove+0x14/0x18) [<c019270c>] (platform_drv_remove+0x14/0x18) from [<c0191478>] (__device_release_driver+0x64/0xa4) [<c0191478>] (__device_release_driver+0x64/0xa4) from [<c01914d8>] (device_release_driver+0x20/0x2c) [<c01914d8>] (device_release_driver+0x20/0x2c) from [<c019109c>] (bus_remove_device+0xcc/0xdc) [<c019109c>] (bus_remove_device+0xcc/0xdc) from [<c018f04c>] (device_del+0x104/0x160) [<c018f04c>] (device_del+0x104/0x160) from [<c01929a0>] (platform_device_del+0x18/0x58) [<c01929a0>] (platform_device_del+0x18/0x58) from [<c0192bfc>] (platform_device_unregister+0xc/0x18) [<c0192bfc>] (platform_device_unregister+0xc/0x18) from [<bf000014>] (r_init+0x14/0x2c [cdev]) [<bf000014>] (r_init+0x14/0x2c [cdev]) from [<c000859c>] (do_one_initcall+0x90/0x160) [<c000859c>] (do_one_initcall+0x90/0x160) from [<c0058cd0>] (sys_init_module+0x15c4/0x1794) [<c0058cd0>] (sys_init_module+0x15c4/0x1794) from [<c000d140>] (ret_fast_syscall+0x0/0x30) Code: e1a04000 e59f0070 eb039b65 e59636e4 (e5933040) Signed-off-by: Srinivas Kandagatla <[email protected]> Signed-off-by: David S. Miller <[email protected]>

It seems we need to provide ability for stacked devices to use specific lock_class_key for sch->busylock We could instead default l2tpeth tx_queue_len to 0 (no qdisc), but a user might use a qdisc anyway. (So same fixes are probably needed on non LLTX stacked drivers) Noticed while stressing L2TPV3 setup : ====================================================== [ INFO: possible circular locking dependency detected ] 3.6.0-rc3+ #788 Not tainted ------------------------------------------------------- netperf/4660 is trying to acquire lock: (l2tpsock){+.-...}, at: [<ffffffffa0208db2>] l2tp_xmit_skb+0x172/0xa50 [l2tp_core] but task is already holding lock: (&(&sch->busylock)->rlock){+.-...}, at: [<ffffffff81596595>] dev_queue_xmit+0xd75/0xe00 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (&(&sch->busylock)->rlock){+.-...}: [<ffffffff810a5df0>] lock_acquire+0x90/0x200 [<ffffffff817499fc>] _raw_spin_lock_irqsave+0x4c/0x60 [<ffffffff81074872>] __wake_up+0x32/0x70 [<ffffffff8136d39e>] tty_wakeup+0x3e/0x80 [<ffffffff81378fb3>] pty_write+0x73/0x80 [<ffffffff8136cb4c>] tty_put_char+0x3c/0x40 [<ffffffff813722b2>] process_echoes+0x142/0x330 [<ffffffff813742ab>] n_tty_receive_buf+0x8fb/0x1230 [<ffffffff813777b2>] flush_to_ldisc+0x142/0x1c0 [<ffffffff81062818>] process_one_work+0x198/0x760 [<ffffffff81063236>] worker_thread+0x186/0x4b0 [<ffffffff810694d3>] kthread+0x93/0xa0 [<ffffffff81753e24>] kernel_thread_helper+0x4/0x10 -> #0 (l2tpsock){+.-...}: [<ffffffff810a5288>] __lock_acquire+0x1628/0x1b10 [<ffffffff810a5df0>] lock_acquire+0x90/0x200 [<ffffffff817498c1>] _raw_spin_lock+0x41/0x50 [<ffffffffa0208db2>] l2tp_xmit_skb+0x172/0xa50 [l2tp_core] [<ffffffffa021a802>] l2tp_eth_dev_xmit+0x32/0x60 [l2tp_eth] [<ffffffff815952b2>] dev_hard_start_xmit+0x502/0xa70 [<ffffffff815b63ce>] sch_direct_xmit+0xfe/0x290 [<ffffffff81595a05>] dev_queue_xmit+0x1e5/0xe00 [<ffffffff815d9d60>] ip_finish_output+0x3d0/0x890 [<ffffffff815db019>] ip_output+0x59/0xf0 [<ffffffff815da36d>] ip_local_out+0x2d/0xa0 [<ffffffff815da5a3>] ip_queue_xmit+0x1c3/0x680 [<ffffffff815f4192>] tcp_transmit_skb+0x402/0xa60 [<ffffffff815f4a94>] tcp_write_xmit+0x1f4/0xa30 [<ffffffff815f5300>] tcp_push_one+0x30/0x40 [<ffffffff815e6672>] tcp_sendmsg+0xe82/0x1040 [<ffffffff81614495>] inet_sendmsg+0x125/0x230 [<ffffffff81576cdc>] sock_sendmsg+0xdc/0xf0 [<ffffffff81579ece>] sys_sendto+0xfe/0x130 [<ffffffff81752c92>] system_call_fastpath+0x16/0x1b Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&(&sch->busylock)->rlock); lock(l2tpsock); lock(&(&sch->busylock)->rlock); lock(l2tpsock); *** DEADLOCK *** 5 locks held by netperf/4660: #0: (sk_lock-AF_INET){+.+.+.}, at: [<ffffffff815e581c>] tcp_sendmsg+0x2c/0x1040 #1: (rcu_read_lock){.+.+..}, at: [<ffffffff815da3e0>] ip_queue_xmit+0x0/0x680 #2: (rcu_read_lock_bh){.+....}, at: [<ffffffff815d9ac5>] ip_finish_output+0x135/0x890 #3: (rcu_read_lock_bh){.+....}, at: [<ffffffff81595820>] dev_queue_xmit+0x0/0xe00 #4: (&(&sch->busylock)->rlock){+.-...}, at: [<ffffffff81596595>] dev_queue_xmit+0xd75/0xe00 stack backtrace: Pid: 4660, comm: netperf Not tainted 3.6.0-rc3+ #788 Call Trace: [<ffffffff8173dbf8>] print_circular_bug+0x1fb/0x20c [<ffffffff810a5288>] __lock_acquire+0x1628/0x1b10 [<ffffffff810a334b>] ? check_usage+0x9b/0x4d0 [<ffffffff810a3f44>] ? __lock_acquire+0x2e4/0x1b10 [<ffffffff810a5df0>] lock_acquire+0x90/0x200 [<ffffffffa0208db2>] ? l2tp_xmit_skb+0x172/0xa50 [l2tp_core] [<ffffffff817498c1>] _raw_spin_lock+0x41/0x50 [<ffffffffa0208db2>] ? l2tp_xmit_skb+0x172/0xa50 [l2tp_core] [<ffffffffa0208db2>] l2tp_xmit_skb+0x172/0xa50 [l2tp_core] [<ffffffffa021a802>] l2tp_eth_dev_xmit+0x32/0x60 [l2tp_eth] [<ffffffff815952b2>] dev_hard_start_xmit+0x502/0xa70 [<ffffffff81594e0e>] ? dev_hard_start_xmit+0x5e/0xa70 [<ffffffff81595961>] ? dev_queue_xmit+0x141/0xe00 [<ffffffff815b63ce>] sch_direct_xmit+0xfe/0x290 [<ffffffff81595a05>] dev_queue_xmit+0x1e5/0xe00 [<ffffffff81595820>] ? dev_hard_start_xmit+0xa70/0xa70 [<ffffffff815d9d60>] ip_finish_output+0x3d0/0x890 [<ffffffff815d9ac5>] ? ip_finish_output+0x135/0x890 [<ffffffff815db019>] ip_output+0x59/0xf0 [<ffffffff815da36d>] ip_local_out+0x2d/0xa0 [<ffffffff815da5a3>] ip_queue_xmit+0x1c3/0x680 [<ffffffff815da3e0>] ? ip_local_out+0xa0/0xa0 [<ffffffff815f4192>] tcp_transmit_skb+0x402/0xa60 [<ffffffff815fa25e>] ? tcp_md5_do_lookup+0x18e/0x1a0 [<ffffffff815f4a94>] tcp_write_xmit+0x1f4/0xa30 [<ffffffff815f5300>] tcp_push_one+0x30/0x40 [<ffffffff815e6672>] tcp_sendmsg+0xe82/0x1040 [<ffffffff81614495>] inet_sendmsg+0x125/0x230 [<ffffffff81614370>] ? inet_create+0x6b0/0x6b0 [<ffffffff8157e6e2>] ? sock_update_classid+0xc2/0x3b0 [<ffffffff8157e750>] ? sock_update_classid+0x130/0x3b0 [<ffffffff81576cdc>] sock_sendmsg+0xdc/0xf0 [<ffffffff81162579>] ? fget_light+0x3f9/0x4f0 [<ffffffff81579ece>] sys_sendto+0xfe/0x130 [<ffffffff810a69ad>] ? trace_hardirqs_on+0xd/0x10 [<ffffffff8174a0b0>] ? _raw_spin_unlock_irq+0x30/0x50 [<ffffffff810757e3>] ? finish_task_switch+0x83/0xf0 [<ffffffff810757a6>] ? finish_task_switch+0x46/0xf0 [<ffffffff81752cb7>] ? sysret_check+0x1b/0x56 [<ffffffff81752c92>] system_call_fastpath+0x16/0x1b Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>

Fengguang Wu reported various panics and bisected to commit 8336886 (tcp: TCP Fast Open Server - support TFO listeners) Fix this by making sure socket is a TCP socket before accessing TFO data structures. [ 233.046014] kfree_debugcheck: out of range ptr ea6000000bb8h. [ 233.047399] ------------[ cut here ]------------ [ 233.048393] kernel BUG at /c/kernel-tests/src/stable/mm/slab.c:3074! [ 233.048393] invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC [ 233.048393] Modules linked in: [ 233.048393] CPU 0 [ 233.048393] Pid: 3929, comm: trinity-watchdo Not tainted 3.6.0-rc3+ #4192 Bochs Bochs [ 233.048393] RIP: 0010:[<ffffffff81169653>] [<ffffffff81169653>] kfree_debugcheck+0x27/0x2d [ 233.048393] RSP: 0018:ffff88000facbca8 EFLAGS: 00010092 [ 233.048393] RAX: 0000000000000031 RBX: 0000ea6000000bb8 RCX: 00000000a189a188 [ 233.048393] RDX: 000000000000a189 RSI: ffffffff8108ad32 RDI: ffffffff810d30f9 [ 233.048393] RBP: ffff88000facbcb8 R08: 0000000000000002 R09: ffffffff843846f0 [ 233.048393] R10: ffffffff810ae37c R11: 0000000000000908 R12: 0000000000000202 [ 233.048393] R13: ffffffff823dbd5a R14: ffff88000ec5bea8 R15: ffffffff8363c780 [ 233.048393] FS: 00007faa6899c700(0000) GS:ffff88001f200000(0000) knlGS:0000000000000000 [ 233.048393] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [ 233.048393] CR2: 00007faa6841019c CR3: 0000000012c82000 CR4: 00000000000006f0 [ 233.048393] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 233.048393] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [ 233.048393] Process trinity-watchdo (pid: 3929, threadinfo ffff88000faca000, task ffff88000faec600) [ 233.048393] Stack: [ 233.048393] 0000000000000000 0000ea6000000bb8 ffff88000facbce8 ffffffff8116ad81 [ 233.048393] ffff88000ff588a0 ffff88000ff58850 ffff88000ff588a0 0000000000000000 [ 233.048393] ffff88000facbd08 ffffffff823dbd5a ffffffff823dbcb0 ffff88000ff58850 [ 233.048393] Call Trace: [ 233.048393] [<ffffffff8116ad81>] kfree+0x5f/0xca [ 233.048393] [<ffffffff823dbd5a>] inet_sock_destruct+0xaa/0x13c [ 233.048393] [<ffffffff823dbcb0>] ? inet_sk_rebuild_header +0x319/0x319 [ 233.048393] [<ffffffff8231c307>] __sk_free+0x21/0x14b [ 233.048393] [<ffffffff8231c4bd>] sk_free+0x26/0x2a [ 233.048393] [<ffffffff825372db>] sctp_close+0x215/0x224 [ 233.048393] [<ffffffff810d6835>] ? lock_release+0x16f/0x1b9 [ 233.048393] [<ffffffff823daf12>] inet_release+0x7e/0x85 [ 233.048393] [<ffffffff82317d15>] sock_release+0x1f/0x77 [ 233.048393] [<ffffffff82317d94>] sock_close+0x27/0x2b [ 233.048393] [<ffffffff81173bbe>] __fput+0x101/0x20a [ 233.048393] [<ffffffff81173cd5>] ____fput+0xe/0x10 [ 233.048393] [<ffffffff810a3794>] task_work_run+0x5d/0x75 [ 233.048393] [<ffffffff8108da70>] do_exit+0x290/0x7f5 [ 233.048393] [<ffffffff82707415>] ? retint_swapgs+0x13/0x1b [ 233.048393] [<ffffffff8108e23f>] do_group_exit+0x7b/0xba [ 233.048393] [<ffffffff8108e295>] sys_exit_group+0x17/0x17 [ 233.048393] [<ffffffff8270de10>] tracesys+0xdd/0xe2 [ 233.048393] Code: 59 01 5d c3 55 48 89 e5 53 41 50 0f 1f 44 00 00 48 89 fb e8 d4 b0 f0 ff 84 c0 75 11 48 89 de 48 c7 c7 fc fa f7 82 e8 0d 0f 57 01 <0f> 0b 5f 5b 5d c3 55 48 89 e5 0f 1f 44 00 00 48 63 87 d8 00 00 [ 233.048393] RIP [<ffffffff81169653>] kfree_debugcheck+0x27/0x2d [ 233.048393] RSP <ffff88000facbca8> Reported-by: Fengguang Wu <[email protected]> Tested-by: Fengguang Wu <[email protected]> Signed-off-by: Eric Dumazet <[email protected]> Cc: "H.K. Jerry Chu" <[email protected]> Acked-by: Neal Cardwell <[email protected]> Acked-by: H.K. Jerry Chu <[email protected]> Signed-off-by: David S. Miller <[email protected]>

When we kill the radio with the RF kill button we could access the HW after having stopped the APM which would result in the warning below. The flow goes like this: * RF kill iwlwifi notifies the stack which stops the driver fw sends CARD_STATE_NOTIFICATION * iwl_trans_pcie_stop_device stops the APM * the tasklet runs and calls to iwl_rx_handle * iwl_rx_handle calls iwl_rx_queue_restock * iwl_rx_queue_restock tries to access the HW... [255908.543823] ------------[ cut here ]------------ [255908.543843] WARNING: at drivers/net/wireless/iwlwifi/iwl-io.c:150 iwl_grab_nic_access+0x79/0xb0 [iwlwifi]() [255908.543849] Hardware name: Latitude E6410 [255908.543852] Timeout waiting for hardware access (CSR_GP_CNTRL 0x000003d8) [255908.543856] Modules linked in: iwlmvm iwlwifi mac80211 [...] [255908.543935] Pid: 0, comm: swapper Tainted: G W 3.1.0 #1 [255908.543939] Call Trace: [255908.543950] [<c1046e42>] warn_slowpath_common+0x72/0xa0 [255908.543980] [<c1046f13>] warn_slowpath_fmt+0x33/0x40 [255908.543992] [<fa4bb3b9>] iwl_grab_nic_access+0x79/0xb0 [iwlwifi] [255908.544004] [<fa4bb9eb>] iwl_write_direct32+0x2b/0xa0 [iwlwifi] [255908.544018] [<fa4c0ff9>] iwl_rx_queue_update_write_ptr+0x89/0x1d0 [iwlwifi] [255908.544054] [<fa4c1250>] iwlagn_rx_queue_restock+0x110/0x140 [iwlwifi] [255908.544067] [<fa4c234d>] iwl_irq_tasklet+0x82d/0xf40 [iwlwifi] [255908.544096] [<c104e11e>] tasklet_action+0xbe/0x100 [255908.544102] [<c104d91e>] __do_softirq+0xae/0x1f0 [255908.544227] ---[ end trace d150f49345d85009 ]--- Prevent this. Signed-off-by: Emmanuel Grumbach <[email protected]> Reviewed-by: Johannes Berg <[email protected]> Signed-off-by: Johannes Berg <[email protected]>

This patch fixes a regression which was introduced in: "mac80211: move TX station pointer and restructure TX" IP: p54_tx_80211+0x21/0x513 [p54common] Oops: 0000 [#1] SMP Modules linked in: p54usb p54common [...] Pid: 13394, comm: hostapd 3.6.0-rc4-wl+ RIP: 0010:p54_tx_80211+0x21/0x513 RSP: 0018:... EFLAGS: 00010292 [...] Process hostapd Stack: [...] Call Trace: p54_bss_info_changed+0x204/0x21e [p54common] ieee80211_del_station+0x16/0x32 [mac80211] ieee80211_start_ap+0x10f/0x157 [mac80211] nl80211_start_ap+0x315/0x361 [cfg80211] p54_tx_80211 function is called as part of the beacon update. The caller p54_bss_info_changed has to supply a valid tx control struct, or the control->sta will lead to a null pointer dereference. Signed-off-by: Christian Lamparter <[email protected]> Signed-off-by: John W. Linville <[email protected]>

There are cases when cryptlen can be zero in crypto_ccm_auth(): -encryptiom: input scatterlist length is zero (no plaintext) -decryption: input scatterlist contains only the mac plus the condition of having different source and destination buffers (or else scatterlist length = max(plaintext_len, ciphertext_len)). These are not handled correctly, leading to crashes like: root@p4080ds:~/crypto# insmod tcrypt.ko mode=45 ------------[ cut here ]------------ kernel BUG at crypto/scatterwalk.c:37! Oops: Exception in kernel mode, sig: 5 [#1] SMP NR_CPUS=8 P4080 DS Modules linked in: tcrypt(+) crc32c xts xcbc vmac pcbc ecb gcm ghash_generic gf128mul ccm ctr seqiv CPU: 3 PID: 1082 Comm: cryptomgr_test Not tainted 3.11.0 torvalds#14 task: ee12c5b0 ti: eecd0000 task.ti: eecd0000 NIP: c0204d98 LR: f9225848 CTR: c0204d80 REGS: eecd1b70 TRAP: 0700 Not tainted (3.11.0) MSR: 00029002 <CE,EE,ME> CR: 22044022 XER: 20000000 GPR00: f9225c94 eecd1c20 ee12c5b0 eecd1c28 ee879400 ee879400 00000000 ee607464 GPR08: 00000001 00000001 00000000 006b0000 c0204d80 00000000 00000002 c0698e20 GPR16: ee987000 ee895000 fffffff4 ee879500 00000100 eecd1d58 00000001 00000000 GPR24: ee879400 00000020 00000000 00000000 ee5b2800 ee607430 00000004 ee607460 NIP [c0204d98] scatterwalk_start+0x18/0x30 LR [f9225848] get_data_to_compute+0x28/0x2f0 [ccm] Call Trace: [eecd1c20] [f9225974] get_data_to_compute+0x154/0x2f0 [ccm] (unreliable) [eecd1c70] [f9225c94] crypto_ccm_auth+0x184/0x1d0 [ccm] [eecd1cb0] [f9225d40] crypto_ccm_encrypt+0x60/0x2d0 [ccm] [eecd1cf0] [c020d77c] __test_aead+0x3ec/0xe20 [eecd1e20] [c020f35c] test_aead+0x6c/0xe0 [eecd1e40] [c020f420] alg_test_aead+0x50/0xd0 [eecd1e60] [c020e5e4] alg_test+0x114/0x2e0 [eecd1ee0] [c020bd1c] cryptomgr_test+0x4c/0x60 [eecd1ef0] [c0047058] kthread+0xa8/0xb0 [eecd1f40] [c000eb0c] ret_from_kernel_thread+0x5c/0x64 Instruction dump: 0f080000 81290024 552807fe 0f080000 5529003a 4bffffb4 90830000 39400000 39000001 8124000c 2f890000 7d28579e <0f090000> 81240008 91230004 4e800020 ---[ end trace 6d652dfcd1be37bd ]--- Cc: <[email protected]> Cc: Jussi Kivilinna <[email protected]> Signed-off-by: Horia Geanta <[email protected]> Signed-off-by: Herbert Xu <[email protected]>

For aead case when source and destination buffers are different, there is an incorrect assumption that the source length includes the ICV length. Fix this, since it leads to an oops when using sg_count() to find the number of nents in the scatterlist: Unable to handle kernel paging request for data at address 0x00000004 Faulting instruction address: 0xf91f7634 Oops: Kernel access of bad area, sig: 11 [#1] SMP NR_CPUS=8 P4080 DS Modules linked in: caamalg(+) caam_jr caam CPU: 1 PID: 1053 Comm: cryptomgr_test Not tainted 3.11.0 torvalds#16 task: eeb24ab0 ti: eeafa000 task.ti: eeafa000 NIP: f91f7634 LR: f91f7f24 CTR: f91f7ef0 REGS: eeafbbc0 TRAP: 0300 Not tainted (3.11.0) MSR: 00029002 <CE,EE,ME> CR: 44044044 XER: 00000000 DEAR: 00000004, ESR: 00000000 GPR00: f91f7f24 eeafbc70 eeb24ab0 00000002 ee8e0900 ee8e0800 00000024 c45c4462 GPR08: 00000010 00000000 00000014 0c0e4000 24044044 00000000 00000000 c0691590 GPR16: eeab0000 eeb23000 00000000 00000000 00000000 00000001 00000001 eeafbcc8 GPR24: 000000d1 00000010 ee2d5000 ee49ea1 ee49ea1 ee46f640 ee46f640 c0691590 NIP [f91f7634] aead_edesc_alloc.constprop.14+0x144/0x780 [caamalg] LR [f91f7f24] aead_encrypt+0x34/0x288 [caamalg] Call Trace: [eeafbc70] [a1004000] 0xa1004000 (unreliable) [eeafbcc0] [f91f7f24] aead_encrypt+0x34/0x288 [caamalg] [eeafbcf0] [c020d77c] __test_aead+0x3ec/0xe20 [eeafbe20] [c020f35c] test_aead+0x6c/0xe0 [eeafbe40] [c020f420] alg_test_aead+0x50/0xd0 [eeafbe60] [c020e5e4] alg_test+0x114/0x2e0 [eeafbee0] [c020bd1c] cryptomgr_test+0x4c/0x60 [eeafbef0] [c0047058] kthread+0xa8/0xb0 [eeafbf40] [c000eb0c] ret_from_kernel_thread+0x5c/0x64 Instruction dump: 69084321 7d080034 5508d97e 69080001 0f080000 81290024 552807fe 0f080000 3a600001 5529003a 2f8a0000 40dd0028 <80e90004> 3ab50001 8109000c 70e30002 ---[ end trace b3c3e23925c7484e ]--- While here, add a tcrypt mode for making it easy to test authenc (needed for triggering case above). Signed-off-by: Horia Geanta <[email protected]> Signed-off-by: Herbert Xu <[email protected]>

For aead case when source and destination buffers are different, there is an incorrect assumption that the source length includes the ICV length. Fix this, since it leads to an oops when using sg_count() to find the number of nents in the scatterlist: Unable to handle kernel paging request for data at address 0x00000004 Faulting instruction address: 0xf2265a28 Oops: Kernel access of bad area, sig: 11 [#1] SMP NR_CPUS=8 P2020 RDB Modules linked in: talitos(+) CPU: 1 PID: 2187 Comm: cryptomgr_test Not tainted 3.11.0 torvalds#12 task: c4e72e20 ti: ef634000 task.ti: ef634000 NIP: f2265a28 LR: f2266ad8 CTR: c000c900 REGS: ef635bb0 TRAP: 0300 Not tainted (3.11.0) MSR: 00029000 <CE,EE,ME> CR: 42042084 XER: 00000000 DEAR: 00000004, ESR: 00000000 GPR00: f2266e10 ef635c60 c4e72e20 00000001 00000014 ef635c69 00000001 c11f308 GPR08: 00000010 00000000 00000002 2f635d58 22044084 00000000 00000000 c0755c80 GPR16: c4bf1000 ef784000 00000000 00000000 00000020 00000014 00000010 ef2f6100 GPR24: ef2f6200 00000024 ef143210 ef2f6000 00000000 ef635d58 00000000 2f635d58 NIP [f2265a28] sg_count+0x1c/0xb4 [talitos] LR [f2266ad8] talitos_edesc_alloc+0x12c/0x410 [talitos] Call Trace: [ef635c60] [c0552068] schedule_timeout+0x148/0x1ac (unreliable) [ef635cc0] [f2266e10] aead_edesc_alloc+0x54/0x64 [talitos] [ef635ce0] [f22680f0] aead_encrypt+0x24/0x70 [talitos] [ef635cf0] [c024b948] __test_aead+0x494/0xf68 [ef635e20] [c024d54c] test_aead+0x64/0xcc [ef635e40] [c024d604] alg_test_aead+0x50/0xc4 [ef635e60] [c024c838] alg_test+0x10c/0x2e4 [ef635ee0] [c0249d1c] cryptomgr_test+0x4c/0x54 [ef635ef0] [c005d598] kthread+0xa8/0xac [ef635f40] [c000e3bc] ret_from_kernel_thread+0x5c/0x64 Instruction dump: 81230024 552807fe 0f080000 5523003a 4bffff24 39000000 2c040000 99050000 408100a0 7c691b78 38c00001 38600000 <80e90004> 38630001 8109000c 70ea0002 ---[ end trace 4498123cd8478591 ]--- Signed-off-by: Horia Geanta <[email protected]> Signed-off-by: Herbert Xu <[email protected]>

Using iperf to send packets(GSO mode is on), a bug is triggered: [ 212.672781] kernel BUG at lib/dynamic_queue_limits.c:26! [ 212.673396] invalid opcode: 0000 [#1] SMP [ 212.673882] Modules linked in: 8139cp(O) nls_utf8 edd fuse loop dm_mod ipv6 i2c_piix4 8139too i2c_core intel_agp joydev pcspkr hid_generic intel_gtt floppy sr_mod mii button sg cdrom ext3 jbd mbcache usbhid hid uhci_hcd ehci_hcd usbcore sd_mod usb_common crc_t10dif crct10dif_common processor thermal_sys hwmon scsi_dh_emc scsi_dh_rdac scsi_dh_hp_sw scsi_dh ata_generic ata_piix libata scsi_mod [last unloaded: 8139cp] [ 212.676084] CPU: 0 PID: 4124 Comm: iperf Tainted: G O 3.12.0-0.7-default+ torvalds#16 [ 212.676084] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007 [ 212.676084] task: ffff8800d83966c0 ti: ffff8800db4c8000 task.ti: ffff8800db4c8000 [ 212.676084] RIP: 0010:[<ffffffff8122e23f>] [<ffffffff8122e23f>] dql_completed+0x17f/0x190 [ 212.676084] RSP: 0018:ffff880116e03e30 EFLAGS: 00010083 [ 212.676084] RAX: 00000000000005ea RBX: 0000000000000f7c RCX: 0000000000000002 [ 212.676084] RDX: ffff880111dd0dc0 RSI: 0000000000000bd4 RDI: ffff8800db6ffcc0 [ 212.676084] RBP: ffff880116e03e48 R08: 0000000000000992 R09: 0000000000000000 [ 212.676084] R10: ffffffff8181e400 R11: 0000000000000004 R12: 000000000000000f [ 212.676084] R13: ffff8800d94ec840 R14: ffff8800db440c80 R15: 000000000000000e [ 212.676084] FS: 00007f6685a3c700(0000) GS:ffff880116e00000(0000) knlGS:0000000000000000 [ 212.676084] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 212.676084] CR2: 00007f6685ad6460 CR3: 00000000db714000 CR4: 00000000000006f0 [ 212.676084] Stack: [ 212.676084] ffff8800db6ffc00 000000000000000f ffff8800d94ec840 ffff880116e03eb8 [ 212.676084] ffffffffa041509f ffff880116e03e88 0000000f16e03e88 ffff8800d94ec000 [ 212.676084] 00000bd400059858 000000050000000f ffffffff81094c36 ffff880116e03eb8 [ 212.676084] Call Trace: [ 212.676084] <IRQ> [ 212.676084] [<ffffffffa041509f>] cp_interrupt+0x4ef/0x590 [8139cp] [ 212.676084] [<ffffffff81094c36>] ? ktime_get+0x56/0xd0 [ 212.676084] [<ffffffff8108cf73>] handle_irq_event_percpu+0x53/0x170 [ 212.676084] [<ffffffff8108d0cc>] handle_irq_event+0x3c/0x60 [ 212.676084] [<ffffffff8108fdb5>] handle_fasteoi_irq+0x55/0xf0 [ 212.676084] [<ffffffff810045df>] handle_irq+0x1f/0x30 [ 212.676084] [<ffffffff81003c8b>] do_IRQ+0x5b/0xe0 [ 212.676084] [<ffffffff8142beaa>] common_interrupt+0x6a/0x6a [ 212.676084] <EOI> [ 212.676084] [<ffffffffa0416a21>] ? cp_start_xmit+0x621/0x97c [8139cp] [ 212.676084] [<ffffffffa0416a09>] ? cp_start_xmit+0x609/0x97c [8139cp] [ 212.676084] [<ffffffff81378ed9>] dev_hard_start_xmit+0x2c9/0x550 [ 212.676084] [<ffffffff813960a9>] sch_direct_xmit+0x179/0x1d0 [ 212.676084] [<ffffffff813793f3>] dev_queue_xmit+0x293/0x440 [ 212.676084] [<ffffffff813b0e46>] ip_finish_output+0x236/0x450 [ 212.676084] [<ffffffff810e59e7>] ? __alloc_pages_nodemask+0x187/0xb10 [ 212.676084] [<ffffffff813b10e8>] ip_output+0x88/0x90 [ 212.676084] [<ffffffff813afa64>] ip_local_out+0x24/0x30 [ 212.676084] [<ffffffff813aff0d>] ip_queue_xmit+0x14d/0x3e0 [ 212.676084] [<ffffffff813c6fd1>] tcp_transmit_skb+0x501/0x840 [ 212.676084] [<ffffffff813c8323>] tcp_write_xmit+0x1e3/0xb20 [ 212.676084] [<ffffffff81363237>] ? skb_page_frag_refill+0x87/0xd0 [ 212.676084] [<ffffffff813c8c8b>] tcp_push_one+0x2b/0x40 [ 212.676084] [<ffffffff813bb7e6>] tcp_sendmsg+0x926/0xc90 [ 212.676084] [<ffffffff813e1d21>] inet_sendmsg+0x61/0xc0 [ 212.676084] [<ffffffff8135e861>] sock_aio_write+0x101/0x120 [ 212.676084] [<ffffffff81107cf1>] ? vma_adjust+0x2e1/0x5d0 [ 212.676084] [<ffffffff812163e0>] ? timerqueue_add+0x60/0xb0 [ 212.676084] [<ffffffff81130b60>] do_sync_write+0x60/0x90 [ 212.676084] [<ffffffff81130d44>] ? rw_verify_area+0x54/0xf0 [ 212.676084] [<ffffffff81130f66>] vfs_write+0x186/0x190 [ 212.676084] [<ffffffff811317fd>] SyS_write+0x5d/0xa0 [ 212.676084] [<ffffffff814321e2>] system_call_fastpath+0x16/0x1b [ 212.676084] Code: ca 41 89 dc 41 29 cc 45 31 db 29 c2 41 89 c5 89 d0 45 29 c5 f7 d0 c1 e8 1f e9 43 ff ff ff 66 0f 1f 44 00 00 31 c0 e9 7b ff ff ff <0f> 0b eb fe 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 c7 47 40 00 [ 212.676084] RIP [<ffffffff8122e23f>] dql_completed+0x17f/0x190 ------------[ cut here ]------------ When a skb has frags, bytes_compl plus skb->len nr_frags times in cp_tx(). It's not the correct value(actually, it should plus skb->len once) and it will trigger the BUG_ON(bytes_compl > num_queued - dql->num_completed). So only increase bytes_compl when finish sending all frags. pkts_compl also has a wrong value, fix it too. It's introduced by commit 871f0d4 ("8139cp: enable bql"). Suggested-by: Eric Dumazet <[email protected]> Signed-off-by: Yang Yingliang <[email protected]> Signed-off-by: David S. Miller <[email protected]>

After commit e9e4ea7 "net: smc91x: dont't use SMC_outw for fixing up halfword-aligned data" The Versatile SMSC LAN91C111 is crashing like this: ------------[ cut here ]------------ kernel BUG at /home/linus/linux/drivers/net/ethernet/smsc/smc91x.c:599! Internal error: Oops - BUG: 0 [#1] ARM Modules linked in: CPU: 0 PID: 43 Comm: udhcpc Not tainted 3.13.0-rc1+ torvalds#24 task: c6ccfaa0 ti: c6cd0000 task.ti: c6cd0000 PC is at smc_hardware_send_pkt+0x198/0x22c LR is at smc_hardware_send_pkt+0x24/0x22c pc : [<c01be324>] lr : [<c01be1b0>] psr: 20000013 sp : c6cd1d08 ip : 00000001 fp : 00000000 r10: c02adb08 r9 : 00000000 r8 : c6ced802 r7 : c786fba0 r6 : 00000146 r5 : c8800000 r4 : c78d6000 r3 : 0000000f r2 : 00000146 r1 : 00000000 r0 : 00000031 Flags: nzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment user Control: 0005317f Table: 06cf4000 DAC: 00000015 Process udhcpc (pid: 43, stack limit = 0xc6cd01c0) Stack: (0xc6cd1d08 to 0xc6cd2000) 1d00: 00000010 c8800000 c78d6000 c786fba0 c78d6000 c01be868 1d20: c01be7a4 00004000 00000000 c786fba0 c6c12b80 c0208554 000004d0 c780fc60 1d40: 00000220 c01fb734 00000000 00000000 00000000 c6c9a440 c6c12b80 c78d6000 1d60: c786fba0 c6c9a440 00000000 c021d1d8 00000000 00000000 c6c12b80 c78d6000 1d80: c786fba0 00000001 c6c9a440 c02087f8 c6c9a4a0 00080008 00000000 00000000 1da0: c78d6000 c786fba0 c78d6000 00000138 00000000 00000000 00000000 00000000 1dc0: 00000000 c027ba74 00000138 00000138 00000001 00000010 c6cedc00 00000000 1de0: 00000008 c7404400 c6cd1eec c6cd1f14 c067a73c c065c0b8 00000000 c067a740 1e00: 01ffffff 002040d0 00000000 00000000 00000000 00000000 00000000 ffffffff 1e20: 43004400 00110022 c6cdef20 c027ae8c c6ccfaa0 be82d65c 00000014 be82d3cc 1e40: 00000000 00000000 00000000 c01f2870 00000000 00000000 00000000 c6cd1e88 1e60: c6ccfaa0 00000000 00000000 00000000 00000000 00000000 00000000 00000000 1e80: 00000000 00000000 00000031 c7802310 c7802300 00000138 c7404400 c0771da0 1ea0: 00000000 c6cd1eec c7800340 00000138 be82d65c 00000014 be82d3cc c6cd1f08 1ec0: 00000014 00000000 c7404400 c7404400 00000138 c01f4628 c78d6000 00000000 1ee0: 00000000 be82d3cc 00000138 c6cd1f08 00000014 c6cd1ee4 00000001 00000000 1f00: 00000000 00000000 00080011 00000002 06000000 ffffffff 0000ffff 00000002 1f20: 06000000 ffffffff 0000ffff c00928c8 c065c52 c6cd1f58 00000003 c009299c 1f40: 00000003 c065c52 c7404400 00000000 c7404400 c01f2218 c78106b0 c7441cb0 1f60: 00000000 00000006 c06799fc 00000000 00000000 00000006 00000000 c01f3ee0 1f80: 00000000 00000000 be82d678 be82d65c 00000014 00000001 00000122 c00139c8 1fa0: c6cd0000 c0013840 be82d65c 00000014 00000006 be82d3cc 00000138 00000000 1fc0: be82d65c 00000014 00000001 00000122 00000000 00000000 00018cb1 00000000 1fe0: 00003801 be82d3a8 0003a0c7 b6e9af08 60000010 00000006 00000000 00000000 [<c01be324>] (smc_hardware_send_pkt+0x198/0x22c) from [<c01be868>] (smc_hard_start_xmit+0xc4/0x1e8) [<c01be868>] (smc_hard_start_xmit+0xc4/0x1e8) from [<c0208554>] (dev_hard_start_xmit+0x460/0x4cc) [<c0208554>] (dev_hard_start_xmit+0x460/0x4cc) from [<c021d1d8>] (sch_direct_xmit+0x94/0x18c) [<c021d1d8>] (sch_direct_xmit+0x94/0x18c) from [<c02087f8>] (dev_queue_xmit+0x238/0x42c) [<c02087f8>] (dev_queue_xmit+0x238/0x42c) from [<c027ba74>] (packet_sendmsg+0xbe8/0xd28) [<c027ba74>] (packet_sendmsg+0xbe8/0xd28) from [<c01f2870>] (sock_sendmsg+0x84/0xa8) [<c01f2870>] (sock_sendmsg+0x84/0xa8) from [<c01f4628>] (SyS_sendto+0xb8/0xdc) [<c01f4628>] (SyS_sendto+0xb8/0xdc) from [<c0013840>] (ret_fast_syscall+0x0/0x2c) Code: e3130002 1a000001 e3130001 0affffcd (e7f001f2) ---[ end trace 81104fe70e8da7fe ]--- Kernel panic - not syncing: Fatal exception in interrupt This is because the macro operations in smc91x.h defined for Versatile are missing SMC_outsw() as used in this commit. The Versatile needs and uses the same accessors as the other platforms in the first if(...) clause, just switch it to using that and we have one problem less to worry about. Checkpatch complains about spacing, but I have opted to follow the style of this .h-file. Cc: Russell King <[email protected]> Cc: Nicolas Pitre <[email protected]> Cc: Eric Miao <[email protected]> Cc: Jonathan Cameron <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Signed-off-by: Linus Walleij <[email protected]> Signed-off-by: David S. Miller <[email protected]>

After commit e9e4ea7 "net: smc91x: dont't use SMC_outw for fixing up halfword-aligned data" The Versatile SMSC LAN91C111 is crashing like this: ------------[ cut here ]------------ kernel BUG at /home/linus/linux/drivers/net/ethernet/smsc/smc91x.c:599! Internal error: Oops - BUG: 0 [#1] ARM Modules linked in: CPU: 0 PID: 43 Comm: udhcpc Not tainted 3.13.0-rc1+ torvalds#24 task: c6ccfaa0 ti: c6cd0000 task.ti: c6cd0000 PC is at smc_hardware_send_pkt+0x198/0x22c LR is at smc_hardware_send_pkt+0x24/0x22c pc : [<c01be324>] lr : [<c01be1b0>] psr: 20000013 sp : c6cd1d08 ip : 00000001 fp : 00000000 r10: c02adb08 r9 : 00000000 r8 : c6ced802 r7 : c786fba0 r6 : 00000146 r5 : c8800000 r4 : c78d6000 r3 : 0000000f r2 : 00000146 r1 : 00000000 r0 : 00000031 Flags: nzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment user Control: 0005317f Table: 06cf4000 DAC: 00000015 Process udhcpc (pid: 43, stack limit = 0xc6cd01c0) Stack: (0xc6cd1d08 to 0xc6cd2000) 1d00: 00000010 c8800000 c78d6000 c786fba0 c78d6000 c01be868 1d20: c01be7a4 00004000 00000000 c786fba0 c6c12b80 c0208554 000004d0 c780fc60 1d40: 00000220 c01fb734 00000000 00000000 00000000 c6c9a440 c6c12b80 c78d6000 1d60: c786fba0 c6c9a440 00000000 c021d1d8 00000000 00000000 c6c12b80 c78d6000 1d80: c786fba0 00000001 c6c9a440 c02087f8 c6c9a4a0 00080008 00000000 00000000 1da0: c78d6000 c786fba0 c78d6000 00000138 00000000 00000000 00000000 00000000 1dc0: 00000000 c027ba74 00000138 00000138 00000001 00000010 c6cedc00 00000000 1de0: 00000008 c7404400 c6cd1eec c6cd1f14 c067a73c c065c0b8 00000000 c067a740 1e00: 01ffffff 002040d0 00000000 00000000 00000000 00000000 00000000 ffffffff 1e20: 43004400 00110022 c6cdef20 c027ae8c c6ccfaa0 be82d65c 00000014 be82d3cc 1e40: 00000000 00000000 00000000 c01f2870 00000000 00000000 00000000 c6cd1e88 1e60: c6ccfaa0 00000000 00000000 00000000 00000000 00000000 00000000 00000000 1e80: 00000000 00000000 00000031 c7802310 c7802300 00000138 c7404400 c0771da0 1ea0: 00000000 c6cd1eec c7800340 00000138 be82d65c 00000014 be82d3cc c6cd1f08 1ec0: 00000014 00000000 c7404400 c7404400 00000138 c01f4628 c78d6000 00000000 1ee0: 00000000 be82d3cc 00000138 c6cd1f08 00000014 c6cd1ee4 00000001 00000000 1f00: 00000000 00000000 00080011 00000002 06000000 ffffffff 0000ffff 00000002 1f20: 06000000 ffffffff 0000ffff c00928c8 c065c52 c6cd1f58 00000003 c009299c 1f40: 00000003 c065c52 c7404400 00000000 c7404400 c01f2218 c78106b0 c7441cb0 1f60: 00000000 00000006 c06799fc 00000000 00000000 00000006 00000000 c01f3ee0 1f80: 00000000 00000000 be82d678 be82d65c 00000014 00000001 00000122 c00139c8 1fa0: c6cd0000 c0013840 be82d65c 00000014 00000006 be82d3cc 00000138 00000000 1fc0: be82d65c 00000014 00000001 00000122 00000000 00000000 00018cb1 00000000 1fe0: 00003801 be82d3a8 0003a0c7 b6e9af08 60000010 00000006 00000000 00000000 [<c01be324>] (smc_hardware_send_pkt+0x198/0x22c) from [<c01be868>] (smc_hard_start_xmit+0xc4/0x1e8) [<c01be868>] (smc_hard_start_xmit+0xc4/0x1e8) from [<c0208554>] (dev_hard_start_xmit+0x460/0x4cc) [<c0208554>] (dev_hard_start_xmit+0x460/0x4cc) from [<c021d1d8>] (sch_direct_xmit+0x94/0x18c) [<c021d1d8>] (sch_direct_xmit+0x94/0x18c) from [<c02087f8>] (dev_queue_xmit+0x238/0x42c) [<c02087f8>] (dev_queue_xmit+0x238/0x42c) from [<c027ba74>] (packet_sendmsg+0xbe8/0xd28) [<c027ba74>] (packet_sendmsg+0xbe8/0xd28) from [<c01f2870>] (sock_sendmsg+0x84/0xa8) [<c01f2870>] (sock_sendmsg+0x84/0xa8) from [<c01f4628>] (SyS_sendto+0xb8/0xdc) [<c01f4628>] (SyS_sendto+0xb8/0xdc) from [<c0013840>] (ret_fast_syscall+0x0/0x2c) Code: e3130002 1a000001 e3130001 0affffcd (e7f001f2) ---[ end trace 81104fe70e8da7fe ]--- Kernel panic - not syncing: Fatal exception in interrupt This is because the macro operations in smc91x.h defined for Versatile are missing SMC_outsw() as used in this commit. The Versatile needs and uses the same accessors as the other platforms in the first if(...) clause, just switch it to using that and we have one problem less to worry about. This includes a hunk of a patch from Will Deacon fixin the other 32bit platforms as well: Innokom, Ramses, PXA, PCM027. Checkpatch complains about spacing, but I have opted to follow the style of this .h-file. Cc: Russell King <[email protected]> Cc: Nicolas Pitre <[email protected]> Cc: Eric Miao <[email protected]> Cc: Jonathan Cameron <[email protected]> Cc: [email protected] Signed-off-by: Will Deacon <[email protected]> Signed-off-by: Linus Walleij <[email protected]> Signed-off-by: David S. Miller <[email protected]>

The patch fixes the following lockdep warning, which is 100% reproducible on network restart: ====================================================== [ INFO: possible circular locking dependency detected ] 3.12.0+ torvalds#47 Tainted: GF ------------------------------------------------------- kworker/1:1/27 is trying to acquire lock: ((&(&adapter->watchdog_task)->work)){+.+...}, at: [<ffffffff8108a5b0>] flush_work+0x0/0x70 but task is already holding lock: (&adapter->mutex){+.+...}, at: [<ffffffffa0177c0a>] e1000_reset_task+0x4a/0xa0 [e1000] which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (&adapter->mutex){+.+...}: [<ffffffff810bdb5d>] lock_acquire+0x9d/0x120 [<ffffffff816b8cbc>] mutex_lock_nested+0x4c/0x390 [<ffffffffa017233d>] e1000_watchdog+0x7d/0x5b0 [e1000] [<ffffffff8108b972>] process_one_work+0x1d2/0x510 [<ffffffff8108ca80>] worker_thread+0x120/0x3a0 [<ffffffff81092c1e>] kthread+0xee/0x110 [<ffffffff816c3d7c>] ret_from_fork+0x7c/0xb0 -> #0 ((&(&adapter->watchdog_task)->work)){+.+...}: [<ffffffff810bd9c0>] __lock_acquire+0x1710/0x1810 [<ffffffff810bdb5d>] lock_acquire+0x9d/0x120 [<ffffffff8108a5eb>] flush_work+0x3b/0x70 [<ffffffff8108b5d8>] __cancel_work_timer+0x98/0x140 [<ffffffff8108b693>] cancel_delayed_work_sync+0x13/0x20 [<ffffffffa0170cec>] e1000_down_and_stop+0x3c/0x60 [e1000] [<ffffffffa01775b1>] e1000_down+0x131/0x220 [e1000] [<ffffffffa0177c12>] e1000_reset_task+0x52/0xa0 [e1000] [<ffffffff8108b972>] process_one_work+0x1d2/0x510 [<ffffffff8108ca80>] worker_thread+0x120/0x3a0 [<ffffffff81092c1e>] kthread+0xee/0x110 [<ffffffff816c3d7c>] ret_from_fork+0x7c/0xb0 other info that might help us debug this: Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&adapter->mutex); lock((&(&adapter->watchdog_task)->work)); lock(&adapter->mutex); lock((&(&adapter->watchdog_task)->work)); *** DEADLOCK *** 3 locks held by kworker/1:1/27: #0: (events){.+.+.+}, at: [<ffffffff8108b906>] process_one_work+0x166/0x510 #1: ((&adapter->reset_task)){+.+...}, at: [<ffffffff8108b906>] process_one_work+0x166/0x510 #2: (&adapter->mutex){+.+...}, at: [<ffffffffa0177c0a>] e1000_reset_task+0x4a/0xa0 [e1000] stack backtrace: CPU: 1 PID: 27 Comm: kworker/1:1 Tainted: GF 3.12.0+ torvalds#47 Hardware name: System manufacturer System Product Name/P5B-VM SE, BIOS 0501 05/31/2007 Workqueue: events e1000_reset_task [e1000] ffffffff820f6000 ffff88007b9dba98 ffffffff816b54a2 0000000000000002 ffffffff820f5e50 ffff88007b9dbae8 ffffffff810ba936 ffff88007b9dbac8 ffff88007b9dbb48 ffff88007b9d8f00 ffff88007b9d8780 ffff88007b9d8f00 Call Trace: [<ffffffff816b54a2>] dump_stack+0x49/0x5f [<ffffffff810ba936>] print_circular_bug+0x216/0x310 [<ffffffff810bd9c0>] __lock_acquire+0x1710/0x1810 [<ffffffff8108a5b0>] ? __flush_work+0x250/0x250 [<ffffffff810bdb5d>] lock_acquire+0x9d/0x120 [<ffffffff8108a5b0>] ? __flush_work+0x250/0x250 [<ffffffff8108a5eb>] flush_work+0x3b/0x70 [<ffffffff8108a5b0>] ? __flush_work+0x250/0x250 [<ffffffff8108b5d8>] __cancel_work_timer+0x98/0x140 [<ffffffff8108b693>] cancel_delayed_work_sync+0x13/0x20 [<ffffffffa0170cec>] e1000_down_and_stop+0x3c/0x60 [e1000] [<ffffffffa01775b1>] e1000_down+0x131/0x220 [e1000] [<ffffffffa0177c12>] e1000_reset_task+0x52/0xa0 [e1000] [<ffffffff8108b972>] process_one_work+0x1d2/0x510 [<ffffffff8108b906>] ? process_one_work+0x166/0x510 [<ffffffff8108ca80>] worker_thread+0x120/0x3a0 [<ffffffff8108c960>] ? manage_workers+0x2c0/0x2c0 [<ffffffff81092c1e>] kthread+0xee/0x110 [<ffffffff81092b30>] ? __init_kthread_worker+0x70/0x70 [<ffffffff816c3d7c>] ret_from_fork+0x7c/0xb0 [<ffffffff81092b30>] ? __init_kthread_worker+0x70/0x70 == The issue background == The problem occurs, because e1000_down(), which is called under adapter->mutex by e1000_reset_task(), tries to synchronously cancel e1000 auxiliary works (reset_task, watchdog_task, phy_info_task, fifo_stall_task), which take adapter->mutex in their handlers. So the question is what does adapter->mutex protect there? The adapter->mutex was introduced by commit 0ef4ee ("e1000: convert to private mutex from rtnl") as a replacement for rtnl_lock() taken in the asynchronous handlers. It targeted on fixing a similar lockdep warning issued when e1000_down() was called under rtnl_lock(), and it fixed it, but unfortunately it introduced the lockdep warning described above. Anyway, that said the source of this bug is that the asynchronous works were made to take rtnl_lock() some time ago, so let's look deeper and find why it was added there. The rtnl_lock() was added to asynchronous handlers by commit 338c15 ("e1000: fix occasional panic on unload") in order to prevent asynchronous handlers from execution after the module is unloaded (e1000_down() is called) as it follows from the comment to the commit: > Net drivers in general have an issue where timers fired > by mod_timer or work threads with schedule_work are running > outside of the rtnl_lock. > > With no other lock protection these routines are vulnerable > to races with driver unload or reset paths. > > The longer term solution to this might be a redesign with > safer locks being taken in the driver to guarantee no > reentrance, but for now a safe and effective fix is > to take the rtnl_lock in these routines. I'm not sure if this locking scheme fixed the problem or just made it unlikely, although I incline to the latter. Anyway, this was long time ago when e1000 auxiliary works were implemented as timers scheduling real work handlers in their routines. The e1000_down() function only canceled the timers, but left the real handlers running if they were running, which could result in work execution after module unload. Today, the e1000 driver uses sane delayed works instead of the pair timer+work to implement its delayed asynchronous handlers, and the e1000_down() synchronously cancels all the works so that the problem that commit 338c15 tried to cope with disappeared, and we don't need any locks in the handlers any more. Moreover, any locking there can potentially result in a deadlock. So, this patch reverts commits 0ef4ee and 338c15. Fixes: 0ef4eed ("e1000: convert to private mutex from rtnl") Fixes: 338c15e ("e1000: fix occasional panic on unload") Cc: Tushar Dave <[email protected]> Cc: Patrick McHardy <[email protected]> Signed-off-by: Vladimir Davydov <[email protected]> Tested-by: Aaron Brown <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>

It's no good setting vga_base after the VGA console has been initialised, because if we do that we get this: Unable to handle kernel paging request at virtual address 000b8000 pgd = c0004000 [000b8000] *pgd=07ffc831, *pte=00000000, *ppte=00000000 0Internal error: Oops: 5017 [#1] ARM Modules linked in: CPU: 0 PID: 0 Comm: swapper Not tainted 3.12.0+ torvalds#49 task: c03e2974 ti: c03d8000 task.ti: c03d8000 PC is at vgacon_startup+0x258/0x39c LR is at request_resource+0x10/0x1c pc : [<c01725d0>] lr : [<c0022b50>] psr: 60000053 sp : c03d9f68 ip : 000b8000 fp : c03d9f8c r10: 000055aa r9 : 4401a103 r8 : ffffaa55 r7 : c03e357c r6 : c051b460 r5 : 000000ff r4 : 000c0000 r3 : 000b8000 r2 : c03e0514 r1 : 00000000 r0 : c0304971 Flags: nZCv IRQs on FIQs off Mode SVC_32 ISA ARM Segment kernel which is an access to the 0xb8000 without the PCI offset required to make it work. Fixes: cc22b4c ("ARM: set vga memory base at run-time") Signed-off-by: Russell King <[email protected]> Cc: <[email protected]>

…culation Currently mx53 (CortexA8) running at 1GHz reports: Calibrating delay loop... 663.55 BogoMIPS (lpj=3317760) Tom Evans verified that alignments of 0x0 and 0x8 run the two instructions of __loop_delay in one clock cycle (1 clock/loop), while alignments of 0x4 and 0xc take 3 clocks to run the loop twice. (1.5 clock/loop) The original object code looks like this: 00000010 <__loop_const_udelay>: 10: e3e01000 mvn r1, #0 14: e51f201c ldr r2, [pc, #-28] ; 0 <__loop_udelay-0x8> 18: e5922000 ldr r2, [r2] 1c: e0800921 add r0, r0, r1, lsr torvalds#18 20: e1a00720 lsr r0, r0, torvalds#14 24: e0822b21 add r2, r2, r1, lsr torvalds#22 28: e1a02522 lsr r2, r2, torvalds#10 2c: e0000092 mul r0, r2, r0 30: e0800d21 add r0, r0, r1, lsr torvalds#26 34: e1b00320 lsrs r0, r0, torvalds#6 38: 01a0f00e moveq pc, lr 0000003c <__loop_delay>: 3c: e2500001 subs r0, r0, #1 40: 8afffffe bhi 3c <__loop_delay> 44: e1a0f00e mov pc, lr After adding the 'align 3' directive to __loop_delay (align to 8 bytes): 00000010 <__loop_const_udelay>: 10: e3e01000 mvn r1, #0 14: e51f201c ldr r2, [pc, #-28] ; 0 <__loop_udelay-0x8> 18: e5922000 ldr r2, [r2] 1c: e0800921 add r0, r0, r1, lsr torvalds#18 20: e1a00720 lsr r0, r0, torvalds#14 24: e0822b21 add r2, r2, r1, lsr torvalds#22 28: e1a02522 lsr r2, r2, torvalds#10 2c: e0000092 mul r0, r2, r0 30: e0800d21 add r0, r0, r1, lsr torvalds#26 34: e1b00320 lsrs r0, r0, torvalds#6 38: 01a0f00e moveq pc, lr 3c: e320f000 nop {0} 00000040 <__loop_delay>: 40: e2500001 subs r0, r0, #1 44: 8afffffe bhi 40 <__loop_delay> 48: e1a0f00e mov pc, lr 4c: e320f000 nop {0} , which now reports: Calibrating delay loop... 996.14 BogoMIPS (lpj=4980736) Some more test results: On mx31 (ARM1136) running at 532 MHz, before the patch: Calibrating delay loop... 351.43 BogoMIPS (lpj=1757184) On mx31 (ARM1136) running at 532 MHz after the patch: Calibrating delay loop... 528.79 BogoMIPS (lpj=2643968) Also tested on mx6 (CortexA9) and on mx27 (ARM926), which shows the same BogoMIPS value before and after this patch. Reported-by: Tom Evans <[email protected]> Suggested-by: Tom Evans <[email protected]> Signed-off-by: Fabio Estevam <[email protected]> Signed-off-by: Russell King <[email protected]>

…ce_activate power_supply_register() calls device_init_wakeup() to register a wakeup source before initializing dev_name. As a result, device_wakeup_enable() end up registering wakeup source with a null name when wakeup_source_register() gets called with dev_name(dev) which is null at the time. When kernel is booted with wakeup_source_activate enabled, it will panic when the trace point code tries to dereference ws->name. Fixed the problem by moving up the kobject_set_name() call prior to accesses to dev_name(). Replaced kobject_set_name() with dev_set_name() which is the right interface to be called from drivers. Fixed the call to device_del() prior to device_add() in for wakeup_init_failed error handling code. Trace after the change: bash-2143 [003] d... 132.280697: wakeup_source_activate: BAT1 state=0x20001 kworker/3:2-1169 [003] d... 132.281305: wakeup_source_deactivate: BAT1 state=0x30000 Oops message: [ 819.769934] device: 'BAT1': device_add [ 819.770078] PM: Adding info for No Bus:BAT1 [ 819.770235] BUG: unable to handle kernel NULL pointer dereference at (null) [ 819.770435] IP: [<ffffffff813381c0>] skip_spaces+0x30/0x30 [ 819.770572] PGD 3efd90067 PUD 3eff61067 PMD 0 [ 819.770716] Oops: 0000 [#1] SMP [ 819.770829] Modules linked in: arc4 iwldvm mac80211 x86_pkg_temp_thermal coretemp kvm_intel joydev i915 kvm uvcvideo ghash_clmulni_intel videobuf2_vmalloc aesni_intel videobuf2_memops videobuf2_core aes_x86_64 ablk_helper cryptd videodev iwlwifi lrw rfcomm gf128mul glue_helper bnep btusb media bluetooth parport_pc hid_generic ppdev snd_hda_codec_hdmi drm_kms_helper snd_hda_codec_realtek cfg80211 drm tpm_infineon samsung_laptop snd_hda_intel usbhid snd_hda_codec hid snd_hwdep snd_pcm microcode snd_page_alloc snd_timer psmouse i2c_algo_bit lpc_ich tpm_tis video wmi mac_hid serio_raw ext2 lp parport r8169 mii [ 819.771802] CPU: 0 PID: 2167 Comm: bash Not tainted 3.12.0+ torvalds#25 [ 819.771876] Hardware name: SAMSUNG ELECTRONICS CO., LTD. 900X3C/900X3D/900X4C/900X4D/SAMSUNG_NP1234567890, BIOS P03AAC 07/12/2012 [ 819.772022] task: ffff88002e6ddcc0 ti: ffff8804015ca000 task.ti: ffff8804015ca000 [ 819.772119] RIP: 0010:[<ffffffff813381c0>] [<ffffffff813381c0>] skip_spaces+0x30/0x30 [ 819.772242] RSP: 0018:ffff8804015cbc70 EFLAGS: 00010046 [ 819.772310] RAX: 0000000000000003 RBX: ffff88040cfd6d40 RCX: 0000000000000018 [ 819.772397] RDX: 0000000000020001 RSI: 0000000000000000 RDI: 0000000000000000 [ 819.772484] RBP: ffff8804015cbcc0 R08: 0000000000000000 R09: ffff8803f0768d40 [ 819.772570] R10: ffffea001033b800 R11: 0000000000000000 R12: ffffffff81c519c0 [ 819.772656] R13: 0000000000020001 R14: 0000000000000000 R15: 0000000000020001 [ 819.772744] FS: 00007ff98309b740(0000) GS:ffff88041f200000(0000) knlGS:0000000000000000 [ 819.772845] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 819.772917] CR2: 0000000000000000 CR3: 00000003f59dc000 CR4: 00000000001407f0 [ 819.773001] Stack: [ 819.773030] ffffffff81114003 ffff8804015cbcb0 0000000000000000 0000000000000046 [ 819.773146] ffff880409757a18 ffff8803f065a160 0000000000000000 0000000000020001 [ 819.773273] 0000000000000000 0000000000000000 ffff8804015cbce8 ffffffff8143e388 [ 819.773387] Call Trace: [ 819.773434] [<ffffffff81114003>] ? ftrace_raw_event_wakeup_source+0x43/0xe0 [ 819.773520] [<ffffffff8143e388>] wakeup_source_report_event+0xb8/0xd0 [ 819.773595] [<ffffffff8143e3cd>] __pm_stay_awake+0x2d/0x50 [ 819.773724] [<ffffffff8153395c>] power_supply_changed+0x3c/0x90 [ 819.773795] [<ffffffff8153407c>] power_supply_register+0x18c/0x250 [ 819.773869] [<ffffffff813d8d18>] sysfs_add_battery+0x61/0x7b [ 819.773935] [<ffffffff813d8d69>] battery_notify+0x37/0x3f [ 819.774001] [<ffffffff816ccb7c>] notifier_call_chain+0x4c/0x70 [ 819.774071] [<ffffffff81073ded>] __blocking_notifier_call_chain+0x4d/0x70 [ 819.774149] [<ffffffff81073e26>] blocking_notifier_call_chain+0x16/0x20 [ 819.774227] [<ffffffff8109397a>] pm_notifier_call_chain+0x1a/0x40 [ 819.774316] [<ffffffff81095b66>] hibernate+0x66/0x1c0 [ 819.774407] [<ffffffff81093931>] state_store+0x71/0xa0 [ 819.774507] [<ffffffff81331d8f>] kobj_attr_store+0xf/0x20 [ 819.774613] [<ffffffff811f8618>] sysfs_write_file+0x128/0x1c0 [ 819.774735] [<ffffffff8118579d>] vfs_write+0xbd/0x1e0 [ 819.774841] [<ffffffff811861d9>] SyS_write+0x49/0xa0 [ 819.774939] [<ffffffff816d1052>] system_call_fastpath+0x16/0x1b [ 819.775055] Code: 89 f8 48 89 e5 f6 82 c0 a6 84 81 20 74 15 0f 1f 44 00 00 48 83 c0 01 0f b6 10 f6 82 c0 a6 84 81 20 75 f0 5d c3 66 0f 1f 44 00 00 <80> 3f 00 55 48 89 e5 74 15 48 89 f8 0f 1f 40 00 48 83 c0 01 80 [ 819.775760] RIP [<ffffffff813381c0>] skip_spaces+0x30/0x30 [ 819.775881] RSP <ffff8804015cbc70> [ 819.775949] CR2: 0000000000000000 [ 819.794175] ---[ end trace c4ef25127039952e ]--- Signed-off-by: Shuah Khan <[email protected]> Acked-by: Anton Vorontsov <[email protected]> Acked-by: Greg Kroah-Hartman <[email protected]> Cc: [email protected] Signed-off-by: Anton Vorontsov <[email protected]>

The second word of key->payload does not get initialised in key_alloc(), but the big_key type is relying on it having been cleared. The problem comes when big_key fails to instantiate a large key and doesn't then set the payload. The big_key_destroy() op is called from the garbage collector and this assumes that the dentry pointer stored in the second word will be NULL if instantiation did not complete. Therefore just pre-clear the entire struct key on allocation rather than trying to be clever and only initialising to 0 only those bits that aren't otherwise initialised. The lack of initialisation can lead to a bug report like the following if big_key failed to initialise its file: general protection fault: 0000 [#1] SMP Modules linked in: ... CPU: 0 PID: 51 Comm: kworker/0:1 Not tainted 3.10.0-53.el7.x86_64 #1 Hardware name: Dell Inc. PowerEdge 1955/0HC513, BIOS 1.4.4 12/09/2008 Workqueue: events key_garbage_collector task: ffff8801294f5680 ti: ffff8801296e2000 task.ti: ffff8801296e2000 RIP: 0010:[<ffffffff811b4a51>] dput+0x21/0x2d0 ... Call Trace: [<ffffffff811a7b06>] path_put+0x16/0x30 [<ffffffff81235604>] big_key_destroy+0x44/0x60 [<ffffffff8122dc4b>] key_gc_unused_keys.constprop.2+0x5b/0xe0 [<ffffffff8122df2f>] key_garbage_collector+0x1df/0x3c0 [<ffffffff8107759b>] process_one_work+0x17b/0x460 [<ffffffff8107834b>] worker_thread+0x11b/0x400 [<ffffffff81078230>] ? rescuer_thread+0x3e0/0x3e0 [<ffffffff8107eb00>] kthread+0xc0/0xd0 [<ffffffff8107ea40>] ? kthread_create_on_node+0x110/0x110 [<ffffffff815c4bec>] ret_from_fork+0x7c/0xb0 [<ffffffff8107ea40>] ? kthread_create_on_node+0x110/0x110 Reported-by: Patrik Kis <[email protected]> Signed-off-by: David Howells <[email protected]> Reviewed-by: Stephen Gallagher <[email protected]>

Since devm_card_release() expects parameter 'res' to be a pointer to struct snd_soc_card, devm_snd_soc_register_card() should really pass such a pointer rather than the one to struct device. This bug causes the kernel Oops below with imx-sgtl500 driver when we remove the module. It happens because with 'card' pointing to the wrong structure, card->num_rtd becomes 0 in function soc_remove_dai_links(). Consequently, soc_remove_link_components() and in turn soc_cleanup_codec[platform]_debugfs() will not be called on card removal. It results in that debugfs_card_root is being removed while its child entries debugfs_codec_root and debugfs_platform_root are still there, and thus the kernel Oops. Fix the bug by correcting the parameter 'res' to be the pointer to struct snd_soc_card. $ lsmod Module Size Used by snd_soc_imx_sgtl5000 3506 0 snd_soc_sgtl5000 13677 2 snd_soc_imx_audmux 5324 1 snd_soc_imx_sgtl5000 snd_soc_fsl_ssi 8139 2 imx_pcm_dma 1380 1 snd_soc_fsl_ssi $ rmmod snd_soc_imx_sgtl5000 Unable to handle kernel paging request at virtual address e594025c pgd = be134000 [e594025c] *pgd=00000000 Internal error: Oops: 5 [#1] SMP ARM Modules linked in: snd_soc_imx_sgtl5000(-) snd_soc_sgtl5000 snd_soc_imx_audmux snd_soc_fsl_ssi imx_pcm_dma CPU: 0 PID: 1793 Comm: rmmod Not tainted 3.13.0-rc1 #1570 task: bee28900 ti: bfbec000 task.ti: bfbec000 PC is at debugfs_remove_recursive+0x28/0x154 LR is at snd_soc_unregister_card+0xa0/0xcc pc : [<80252b38>] lr : [<80496ac4>] psr: a0000013 sp : bfbede00 ip : bfbede28 fp : bfbede24 r10: 803281d4 r9 : bfbec000 r8 : 803271ac r7 : bef54440 r6 : 00000004 r5 : bf9a4010 r4 : bf9a4010 r3 : e5940224 r2 : 00000000 r1 : bef54450 r0 : 803271ac Flags: NzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment user Control: 10c53c7d Table: 4e13404a DAC: 00000015 Process rmmod (pid: 1793, stack limit = 0xbfbec240) Stack: (0xbfbede00 to 0xbfbee000) de00: 00000000 bf9a4010 bf9a4010 00000004 bef54440 bec89000 bfbede44 bfbede28 de20: 80496ac4 80252b1c 804a4b60 bfbede60 bf9a4010 00000004 bfbede54 bfbede48 de40: 804a4b74 80496a30 bfbede94 bfbede58 80328728 804a4b6c bfbede94 a0000013 de60: bf1b5800 bef54440 00000002 bf9a4010 7f0169f8 bf9a4044 00000081 8000e9c4 de80: bfbec000 00000000 bfbedeac bfbede98 80328cb0 80328618 7f016000 bf9a4010 dea0: bfbedec4 bfbedeb0 8032561c 80328c84 bf9a4010 7f0169f8 bfbedee4 bfbedec8 dec0: 80325e84 803255a8 bee28900 7f0169f8 00000000 78208d30 bfbedefc bfbedee8 dee0: 80325410 80325dd4 beca8100 7f0169f8 bfbedf14 bfbedf00 803264f8 803253c8 df00: 7f01635c 7f016a3c bfbedf24 bfbedf18 80327098 803264d4 bfbedf34 bfbedf28 df20: 7f016370 80327090 bfbedfa4 bfbedf38 80085ef0 7f016368 bfbedf54 5f646e73 df40: 5f636f73 5f786d69 6c746773 30303035 00000000 78208008 bfbedf84 bfbedf68 df60: 800613b0 80061194 fffffffe 78208d00 7efc2f07 00000081 7f016a3c 00000800 df80: bfbedf84 00000000 00000000 fffffffe 78208d00 7efc2f07 00000000 bfbedfa8 dfa0: 8000e800 80085dcc fffffffe 78208d00 78208d30 00000800 a8c82400 a8c82400 dfc0: fffffffe 78208d00 7efc2f07 00000081 00000002 00000000 78208008 00000800 dfe0: 7efc2e1c 7efc2ba8 76f5ca47 76edec7c 80000010 78208d30 00000000 00000000 Backtrace: [<80252b10>] (debugfs_remove_recursive+0x0/0x154) from [<80496ac4>] (snd_soc_unregister_card+0xa0/0xcc) r8:bec89000 r7:bef54440 r6:00000004 r5:bf9a4010 r4:bf9a4010 r3:00000000 [<80496a24>] (snd_soc_unregister_card+0x0/0xcc) from [<804a4b74>] (devm_card_release+0x14/0x18) r6:00000004 r5:bf9a4010 r4:bfbede60 r3:804a4b60 [<804a4b60>] (devm_card_release+0x0/0x18) from [<80328728>] (release_nodes+0x11c/0x1dc) [<8032860c>] (release_nodes+0x0/0x1dc) from [<80328cb0>] (devres_release_all+0x38/0x54) [<80328c78>] (devres_release_all+0x0/0x54) from [<8032561c>] (__device_release_driver+0x80/0xd4) r4:bf9a4010 r3:7f016000 [<8032559c>] (__device_release_driver+0x0/0xd4) from [<80325e84>] (driver_detach+0xbc/0xc0) r5:7f0169f8 r4:bf9a4010 [<80325dc8>] (driver_detach+0x0/0xc0) from [<80325410>] (bus_remove_driver+0x54/0x98) r6:78208d30 r5:00000000 r4:7f0169f8 r3:bee28900 [<803253bc>] (bus_remove_driver+0x0/0x98) from [<803264f8>] (driver_unregister+0x30/0x50) r4:7f0169f8 r3:beca8100 [<803264c8>] (driver_unregister+0x0/0x50) from [<80327098>] (platform_driver_unregister+0x14/0x18) r4:7f016a3c r3:7f01635c [<80327084>] (platform_driver_unregister+0x0/0x18) from [<7f016370>] (imx_sgtl5000_driver_exit+0x14/0x1c [snd_soc_imx_sgtl5000]) [<7f01635c>] (imx_sgtl5000_driver_exit+0x0/0x1c [snd_soc_imx_sgtl5000]) from [<80085ef0>] (SyS_delete_module+0x130/0x18c) [<80085dc0>] (SyS_delete_module+0x0/0x18c) from [<8000e800>] (ret_fast_syscall+0x0/0x48) r6:7efc2f07 r5:78208d00 r4:fffffffe Code: 889da9f8 e5983020 e3530000 089da9f8 (e5933038) ---[ end trace 825e7e125251a225 ]--- Signed-off-by: Shawn Guo <[email protected]> Signed-off-by: Mark Brown <[email protected]>

ae7f164 ("cgroup: move cgroup->subsys[] assignment to online_css()") moved cgroup->subsys[] assignements later in cgroup_create() but didn't update error handling path accordingly leading to the following oops and leaking later css's after an online_css() failure. The oops is from cgroup destruction path being invoked on the partially constructed cgroup which is not ready to handle empty slots in cgrp->subsys[] array. BUG: unable to handle kernel NULL pointer dereference at 0000000000000008 IP: [<ffffffff810eeaa8>] cgroup_destroy_locked+0x118/0x2f0 PGD a780a067 PUD aadbe067 PMD 0 Oops: 0000 [#1] SMP Modules linked in: CPU: 6 PID: 7360 Comm: mkdir Not tainted 3.13.0-rc2+ torvalds#69 Hardware name: task: ffff8800b9dbec00 ti: ffff8800a781a000 task.ti: ffff8800a781a000 RIP: 0010:[<ffffffff810eeaa8>] [<ffffffff810eeaa8>] cgroup_destroy_locked+0x118/0x2f0 RSP: 0018:ffff8800a781bd98 EFLAGS: 00010282 RAX: ffff880586903878 RBX: ffff880586903800 RCX: ffff880586903820 RDX: ffff880586903860 RSI: ffff8800a781bdb0 RDI: ffff880586903820 RBP: ffff8800a781bde8 R08: ffff88060e0b8048 R09: ffffffff811d7bc1 R10: 000000000000008c R11: 0000000000000001 R12: ffff8800a72286c0 R13: 0000000000000000 R14: ffffffff81cf7a40 R15: 0000000000000001 FS: 00007f60ecda57a0(0000) GS:ffff8806272c0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000008 CR3: 00000000a7a03000 CR4: 00000000000007e0 Stack: ffff880586903860 ffff880586903910 ffff8800a72286c0 ffff880586903820 ffffffff81cf7a40 ffff880586903800 ffff88060e0b8018 ffffffff81cf7a40 ffff8800b9dbec00 ffff8800b9dbf098 ffff8800a781bec8 ffffffff810ef5bf Call Trace: [<ffffffff810ef5bf>] cgroup_mkdir+0x55f/0x5f0 [<ffffffff811c90ae>] vfs_mkdir+0xee/0x140 [<ffffffff811cb07e>] SyS_mkdirat+0x6e/0xf0 [<ffffffff811c6a19>] SyS_mkdir+0x19/0x20 [<ffffffff8169e569>] system_call_fastpath+0x16/0x1b This patch moves reference bumping inside online_css() loop, clears css_ar[] as css's are brought online successfully, and updates err_destroy path so that either a css is fully online and destroyed by cgroup_destroy_locked() or the error path frees it. This creates a duplicate css free logic in the error path but it will be cleaned up soon. v2: Li pointed out that cgroup_destroy_locked() would do NULL-deref if invoked with a cgroup which doesn't have all css's populated. Update cgroup_destroy_locked() so that it skips NULL css's. Signed-off-by: Tejun Heo <[email protected]> Acked-by: Li Zefan <[email protected]> Reported-by: Vladimir Davydov <[email protected]> Cc: [email protected] # v3.12+

free_unused_bufs must check vi->mergeable_rx_bufs before vi->big_packets, because we use this sequence in other places. Otherwise we allocate buffer of one type, then free it as another type. general protection fault: 0000 [#1] SMP Dumping ftrace buffer: (ftrace buffer empty) Modules linked in: ip6table_filter ip6_tables iptable_filter ip_tables pcspkr virtio_balloon virtio_net(-) i2c_pii CPU: 0 PID: 400 Comm: rmmod Not tainted 3.13.0-rc2+ torvalds#170 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 task: ffff8800b6d2a210 ti: ffff8800aed32000 task.ti: ffff8800aed32000 RIP: 0010:[<ffffffffa00345f3>] [<ffffffffa00345f3>] free_unused_bufs+0xc3/0x190 [virtio_net] RSP: 0018:ffff8800aed33dd8 EFLAGS: 00010202 RAX: ffff8800b1fe2c00 RBX: ffff8800b66a7240 RCX: 6b6b6b6b6b6b6b6b RDX: 6b6b6b6b6b6b6b6b RSI: ffff8800b8419a68 RDI: ffff8800b66a1148 RBP: ffff8800aed33e00 R08: 0000000000000001 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000 R13: ffff8800b66a1148 R14: 0000000000000000 R15: 000077ff80000000 FS: 00007fc4f9c4e740(0000) GS:ffff8800bfa00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 00007f63f432f000 CR3: 00000000b6538000 CR4: 00000000000006f0 Stack: ffff8800b66a7240 ffff8800b66a7380 ffff8800377bd3f0 0000000000000000 00000000023302f0 ffff8800aed33e18 ffffffffa00346e2 ffff8800b66a7240 ffff8800aed33e38 ffffffffa003474d ffff8800377bd388 ffff8800377bd390 Call Trace: [<ffffffffa00346e2>] remove_vq_common+0x22/0x40 [virtio_net] [<ffffffffa003474d>] virtnet_remove+0x4d/0x90 [virtio_net] [<ffffffff813ae303>] virtio_dev_remove+0x23/0x80 [<ffffffff813f62cf>] __device_release_driver+0x7f/0xf0 [<ffffffff813f6c80>] driver_detach+0xc0/0xd0 [<ffffffff813f5f08>] bus_remove_driver+0x58/0xd0 [<ffffffff813f72cc>] driver_unregister+0x2c/0x50 [<ffffffff813ae63e>] unregister_virtio_driver+0xe/0x10 [<ffffffffa0036852>] virtio_net_driver_exit+0x10/0x7be [virtio_net] [<ffffffff810d7cf2>] SyS_delete_module+0x172/0x220 [<ffffffff810a732d>] ? trace_hardirqs_on+0xd/0x10 [<ffffffff810f5d4c>] ? __audit_syscall_entry+0x9c/0xf0 [<ffffffff81677f69>] system_call_fastpath+0x16/0x1b Code: c0 74 55 0f 1f 44 00 00 80 7b 30 00 74 7a 48 8b 50 30 4c 89 e6 48 03 73 20 48 85 d2 0f 84 bb 00 00 00 66 0f RIP [<ffffffffa00345f3>] free_unused_bufs+0xc3/0x190 [virtio_net] RSP <ffff8800aed33dd8> ---[ end trace edb570ea923cce9c ]--- Fixes: 2613af0 (virtio_net: migrate mergeable rx buffers to page frag allocators) Cc: Michael Dalton <[email protected]> Cc: Rusty Russell <[email protected]> Cc: "Michael S. Tsirkin" <[email protected]> Signed-off-by: Andrey Vagin <[email protected]> Acked-by: Michael S. Tsirkin <[email protected]> Acked-by: Jason Wang <[email protected]> Signed-off-by: David S. Miller <[email protected]>

free_netdev calls netif_napi_del too, but it's too late, because napi structures are placed on vi->rq. netif_napi_add() is called from virtnet_alloc_queues. general protection fault: 0000 [#1] SMP Dumping ftrace buffer: (ftrace buffer empty) Modules linked in: ip6table_filter ip6_tables iptable_filter ip_tables virtio_balloon pcspkr virtio_net(-) i2c_pii CPU: 1 PID: 347 Comm: rmmod Not tainted 3.13.0-rc2+ torvalds#171 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 task: ffff8800b779c420 ti: ffff8800379e0000 task.ti: ffff8800379e0000 RIP: 0010:[<ffffffff81322e19>] [<ffffffff81322e19>] __list_del_entry+0x29/0xd0 RSP: 0018:ffff8800379e1dd0 EFLAGS: 00010a83 RAX: 6b6b6b6b6b6b6b6b RBX: ffff8800379c2fd0 RCX: dead000000200200 RDX: 6b6b6b6b6b6b6b6b RSI: 0000000000000001 RDI: ffff8800379c2fd0 RBP: ffff8800379e1dd0 R08: 0000000000000001 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000001 R12: ffff8800379c2f90 R13: ffff880037839160 R14: 0000000000000000 R15: 00000000013352f0 FS: 00007f1400e34740(0000) GS:ffff8800bfb00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 00007f464124c763 CR3: 00000000b68cf000 CR4: 00000000000006e0 Stack: ffff8800379e1df0 ffffffff8155beab 6b6b6b6b6b6b6b2b ffff8800378391c0 ffff8800379e1e18 ffffffff8156499b ffff880037839be0 ffff880037839d20 ffff88003779d3f0 ffff8800379e1e38 ffffffffa003477c ffff88003779d388 Call Trace: [<ffffffff8155beab>] netif_napi_del+0x1b/0x80 [<ffffffff8156499b>] free_netdev+0x8b/0x110 [<ffffffffa003477c>] virtnet_remove+0x7c/0x90 [virtio_net] [<ffffffff813ae323>] virtio_dev_remove+0x23/0x80 [<ffffffff813f62ef>] __device_release_driver+0x7f/0xf0 [<ffffffff813f6ca0>] driver_detach+0xc0/0xd0 [<ffffffff813f5f28>] bus_remove_driver+0x58/0xd0 [<ffffffff813f72ec>] driver_unregister+0x2c/0x50 [<ffffffff813ae65e>] unregister_virtio_driver+0xe/0x10 [<ffffffffa0036942>] virtio_net_driver_exit+0x10/0x6ce [virtio_net] [<ffffffff810d7cf2>] SyS_delete_module+0x172/0x220 [<ffffffff810a732d>] ? trace_hardirqs_on+0xd/0x10 [<ffffffff810f5d4c>] ? __audit_syscall_entry+0x9c/0xf0 [<ffffffff81677f69>] system_call_fastpath+0x16/0x1b Code: 00 00 55 48 8b 17 48 b9 00 01 10 00 00 00 ad de 48 8b 47 08 48 89 e5 48 39 ca 74 29 48 b9 00 02 20 00 00 00 RIP [<ffffffff81322e19>] __list_del_entry+0x29/0xd0 RSP <ffff8800379e1dd0> ---[ end trace d5931cd3f87c9763 ]--- Fixes: 986a4f4 (virtio_net: multiqueue support) Cc: Rusty Russell <[email protected]> Cc: "Michael S. Tsirkin" <[email protected]> Signed-off-by: Andrey Vagin <[email protected]> Acked-by: Michael S. Tsirkin <[email protected]> Acked-by: Jason Wang <[email protected]> Signed-off-by: David S. Miller <[email protected]>

…is not present commit dc75925(OMAP: hwmod: Fix the missing braces) introduced missing braces, however, we just set return result if clk_get fail and we populate the error pointer in clk pointer and pass it along to clk_prepare. This is wrong. The intent seems to be retry remaining clocks if they are available and warn the ones we cant find clks for. With the current logic, we see the following crash: omap_hwmod: l3_main: cannot clk_get interface_clk emac_ick Unable to handle kernel NULL pointer dereference at virtual address 00000032 pgd = c0004000 [00000032] *pgd=00000000 Internal error: Oops: 5 [#1] SMP ARM Modules linked in: CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.13.0-rc1-00044-gcc9fd5a-dirty torvalds#19 task: ce0c3440 ti: ce0c4000 task.ti: ce0c4000 PC is at __clk_prepare+0x10/0x74 LR is at clk_prepare+0x14/0x24 <snip> [<c044d59c>] (__clk_prepare+0x10/0x74) from [<c044d9b0>] (clk_prepare+0x14/0x24) [<c044d9b0>] (clk_prepare+0x14/0x24) from [<c077d8c4>] (_init+0x24c/0x3bc) [<c077d8c4>] (_init+0x24c/0x3bc) from [<c0027328>] (omap_hwmod_for_each+0x34/0x5c) [<c0027328>] (omap_hwmod_for_each+0x34/0x5c) from [<c077dfa0>] (__omap_hwmod_setup_all+0x24/0x40) [<c077dfa0>] (__omap_hwmod_setup_all+0x24/0x40) from [<c0008928>] (do_one_initcall+0x38/0x168) [<c0008928>] (do_one_initcall+0x38/0x168) from [<c0771be8>] (kernel_init_freeable+0xfc/0x1cc) [<c0771be8>] (kernel_init_freeable+0xfc/0x1cc) from [<c0521064>] (kernel_init+0x8/0x110) [<c0521064>] (kernel_init+0x8/0x110) from [<c000e568>] (ret_from_fork+0x14/0x2c) Code: e92d4038 e2504000 01a05004 0a000005 (e5943034) So, just warn and continue instead of proceeding and crashing, with missing clock nodes/bad data, we will eventually fail, however we should now have enough information to identify the culprit. Signed-off-by: Nishanth Menon <[email protected]> Signed-off-by: Paul Walmsley <[email protected]>

Commit 2fc4802 ("dm persistent metadata: add space map threshold callback") introduced a regression to the metadata block allocation path that resulted in errors being ignored. This regression was uncovered by running the following device-mapper-test-suite test: dmtest run --suite thin-provisioning -n /exhausting_metadata_space_causes_fail_mode/ The ignored error codes in sm_metadata_new_block() could crash the kernel through use of either the dm-thin or dm-cache targets, e.g.: device-mapper: thin: 253:4: reached low water mark for metadata device: sending event. device-mapper: space map metadata: unable to allocate new metadata block general protection fault: 0000 [#1] SMP ... Workqueue: dm-thin do_worker [dm_thin_pool] task: ffff880035ce2ab0 ti: ffff88021a054000 task.ti: ffff88021a054000 RIP: 0010:[<ffffffffa0331385>] [<ffffffffa0331385>] metadata_ll_load_ie+0x15/0x30 [dm_persistent_data] RSP: 0018:ffff88021a055a68 EFLAGS: 00010202 RAX: 003fc8243d212ba0 RBX: ffff88021a780070 RCX: ffff88021a055a78 RDX: ffff88021a055a78 RSI: 0040402222a92a80 RDI: ffff88021a780070 RBP: ffff88021a055a68 R08: ffff88021a055ba4 R09: 0000000000000010 R10: 0000000000000000 R11: 00000002a02e1000 R12: ffff88021a055ad4 R13: 0000000000000598 R14: ffffffffa0338470 R15: ffff88021a055ba4 FS: 0000000000000000(0000) GS:ffff88033fca0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 00007f467c0291b8 CR3: 0000000001a0b000 CR4: 00000000000007e0 Stack: ffff88021a055ab8 ffffffffa0332020 ffff88021a055b30 0000000000000001 ffff88021a055b30 0000000000000000 ffff88021a055b18 0000000000000000 ffff88021a055ba4 ffff88021a055b98 ffff88021a055ae8 ffffffffa033304c Call Trace: [<ffffffffa0332020>] sm_ll_lookup_bitmap+0x40/0xa0 [dm_persistent_data] [<ffffffffa033304c>] sm_metadata_count_is_more_than_one+0x8c/0xc0 [dm_persistent_data] [<ffffffffa0333825>] dm_tm_shadow_block+0x65/0x110 [dm_persistent_data] [<ffffffffa0331b00>] sm_ll_mutate+0x80/0x300 [dm_persistent_data] [<ffffffffa0330e60>] ? set_ref_count+0x10/0x10 [dm_persistent_data] [<ffffffffa0331dba>] sm_ll_inc+0x1a/0x20 [dm_persistent_data] [<ffffffffa0332270>] sm_disk_new_block+0x60/0x80 [dm_persistent_data] [<ffffffff81520036>] ? down_write+0x16/0x40 [<ffffffffa001e5c4>] dm_pool_alloc_data_block+0x54/0x80 [dm_thin_pool] [<ffffffffa001b23c>] alloc_data_block+0x9c/0x130 [dm_thin_pool] [<ffffffffa001c27e>] provision_block+0x4e/0x180 [dm_thin_pool] [<ffffffffa001fe9a>] ? dm_thin_find_block+0x6a/0x110 [dm_thin_pool] [<ffffffffa001c57a>] process_bio+0x1ca/0x1f0 [dm_thin_pool] [<ffffffff8111e2ed>] ? mempool_free+0x8d/0xa0 [<ffffffffa001d755>] process_deferred_bios+0xc5/0x230 [dm_thin_pool] [<ffffffffa001d911>] do_worker+0x51/0x60 [dm_thin_pool] [<ffffffff81067872>] process_one_work+0x182/0x3b0 [<ffffffff81068c90>] worker_thread+0x120/0x3a0 [<ffffffff81068b70>] ? manage_workers+0x160/0x160 [<ffffffff8106eb2e>] kthread+0xce/0xe0 [<ffffffff8106ea60>] ? kthread_freezable_should_stop+0x70/0x70 [<ffffffff8152af6c>] ret_from_fork+0x7c/0xb0 [<ffffffff8106ea60>] ? kthread_freezable_should_stop+0x70/0x70 [<ffffffff8152af6c>] ret_from_fork+0x7c/0xb0 [<ffffffff8106ea60>] ? kthread_freezable_should_stop+0x70/0x70 Signed-off-by: Mike Snitzer <[email protected]> Acked-by: Joe Thornber <[email protected]> Cc: [email protected] # v3.10+

Dave Jones reported a use after free in UDP stack : [ 5059.434216] ========================= [ 5059.434314] [ BUG: held lock freed! ] [ 5059.434420] 3.13.0-rc3+ torvalds#9 Not tainted [ 5059.434520] ------------------------- [ 5059.434620] named/863 is freeing memory ffff88005e960000-ffff88005e96061f, with a lock still held there! [ 5059.434815] (slock-AF_INET){+.-...}, at: [<ffffffff8149bd21>] udp_queue_rcv_skb+0xd1/0x4b0 [ 5059.435012] 3 locks held by named/863: [ 5059.435086] #0: (rcu_read_lock){.+.+..}, at: [<ffffffff8143054d>] __netif_receive_skb_core+0x11d/0x940 [ 5059.435295] #1: (rcu_read_lock){.+.+..}, at: [<ffffffff81467a5e>] ip_local_deliver_finish+0x3e/0x410 [ 5059.435500] #2: (slock-AF_INET){+.-...}, at: [<ffffffff8149bd21>] udp_queue_rcv_skb+0xd1/0x4b0 [ 5059.435734] stack backtrace: [ 5059.435858] CPU: 0 PID: 863 Comm: named Not tainted 3.13.0-rc3+ torvalds#9 [loadavg: 0.21 0.06 0.06 1/115 1365] [ 5059.436052] Hardware name: /D510MO, BIOS MOPNV10J.86A.0175.2010.0308.0620 03/08/2010 [ 5059.436223] 0000000000000002 ffff88007e203ad8 ffffffff8153a372 ffff8800677130e0 [ 5059.436390] ffff88007e203b10 ffffffff8108cafa ffff88005e960000 ffff88007b00cfc0 [ 5059.436554] ffffea00017a5800 ffffffff8141c490 0000000000000246 ffff88007e203b48 [ 5059.436718] Call Trace: [ 5059.436769] <IRQ> [<ffffffff8153a372>] dump_stack+0x4d/0x66 [ 5059.436904] [<ffffffff8108cafa>] debug_check_no_locks_freed+0x15a/0x160 [ 5059.437037] [<ffffffff8141c490>] ? __sk_free+0x110/0x230 [ 5059.437147] [<ffffffff8112da2a>] kmem_cache_free+0x6a/0x150 [ 5059.437260] [<ffffffff8141c490>] __sk_free+0x110/0x230 [ 5059.437364] [<ffffffff8141c5c9>] sk_free+0x19/0x20 [ 5059.437463] [<ffffffff8141cb25>] sock_edemux+0x25/0x40 [ 5059.437567] [<ffffffff8141c181>] sock_queue_rcv_skb+0x81/0x280 [ 5059.437685] [<ffffffff8149bd21>] ? udp_queue_rcv_skb+0xd1/0x4b0 [ 5059.437805] [<ffffffff81499c82>] __udp_queue_rcv_skb+0x42/0x240 [ 5059.437925] [<ffffffff81541d25>] ? _raw_spin_lock+0x65/0x70 [ 5059.438038] [<ffffffff8149bebb>] udp_queue_rcv_skb+0x26b/0x4b0 [ 5059.438155] [<ffffffff8149c712>] __udp4_lib_rcv+0x152/0xb00 [ 5059.438269] [<ffffffff8149d7f5>] udp_rcv+0x15/0x20 [ 5059.438367] [<ffffffff81467b2f>] ip_local_deliver_finish+0x10f/0x410 [ 5059.438492] [<ffffffff81467a5e>] ? ip_local_deliver_finish+0x3e/0x410 [ 5059.438621] [<ffffffff81468653>] ip_local_deliver+0x43/0x80 [ 5059.438733] [<ffffffff81467f70>] ip_rcv_finish+0x140/0x5a0 [ 5059.438843] [<ffffffff81468926>] ip_rcv+0x296/0x3f0 [ 5059.438945] [<ffffffff81430b72>] __netif_receive_skb_core+0x742/0x940 [ 5059.439074] [<ffffffff8143054d>] ? __netif_receive_skb_core+0x11d/0x940 [ 5059.442231] [<ffffffff8108c81d>] ? trace_hardirqs_on+0xd/0x10 [ 5059.442231] [<ffffffff81430d83>] __netif_receive_skb+0x13/0x60 [ 5059.442231] [<ffffffff81431c1e>] netif_receive_skb+0x1e/0x1f0 [ 5059.442231] [<ffffffff814334e0>] napi_gro_receive+0x70/0xa0 [ 5059.442231] [<ffffffffa01de426>] rtl8169_poll+0x166/0x700 [r8169] [ 5059.442231] [<ffffffff81432bc9>] net_rx_action+0x129/0x1e0 [ 5059.442231] [<ffffffff810478cd>] __do_softirq+0xed/0x240 [ 5059.442231] [<ffffffff81047e25>] irq_exit+0x125/0x140 [ 5059.442231] [<ffffffff81004241>] do_IRQ+0x51/0xc0 [ 5059.442231] [<ffffffff81542bef>] common_interrupt+0x6f/0x6f We need to keep a reference on the socket, by using skb_steal_sock() at the right place. Note that another patch is needed to fix a race in udp_sk_rx_dst_set(), as we hold no lock protecting the dst. Fixes: 421b388 ("udp: ipv4: Add udp early demux") Reported-by: Dave Jones <[email protected]> Signed-off-by: Eric Dumazet <[email protected]> Cc: Shawn Bohrer <[email protected]> Signed-off-by: David S. Miller <[email protected]>

We run into this bug: [ 2736.063245] Unable to handle kernel paging request for data at address 0x00000000 [ 2736.063293] Faulting instruction address: 0xc00000000037efb0 [ 2736.063300] Oops: Kernel access of bad area, sig: 11 [#1] [ 2736.063303] SMP NR_CPUS=2048 NUMA pSeries [ 2736.063310] Modules linked in: sg nfsv3 rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache nf_conntrack_netbios_ns nf_conntrack_broadcast ipt_MASQUERADE ip6table_mangle ip6table_security ip6table_raw ip6t_REJECT iptable_nat nf_nat_ipv4 iptable_mangle iptable_security iptable_raw ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack ebtable_filter ebtables ip6table_filter iptable_filter ip_tables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 nf_nat nf_conntrack ip6_tables ibmveth pseries_rng nx_crypto nfsd auth_rpcgss nfs_acl lockd sunrpc binfmt_misc xfs libcrc32c dm_service_time sd_mod crc_t10dif crct10dif_common ibmvfc scsi_transport_fc scsi_tgt dm_mirror dm_region_hash dm_log dm_multipath dm_mod [ 2736.063383] CPU: 1 PID: 7128 Comm: ssh Not tainted 3.10.0-48.el7.ppc64 #1 [ 2736.063389] task: c000000131930120 ti: c0000001319a0000 task.ti: c0000001319a0000 [ 2736.063394] NIP: c00000000037efb0 LR: c0000000006c40f8 CTR: 0000000000000000 [ 2736.063399] REGS: c0000001319a3870 TRAP: 0300 Not tainted (3.10.0-48.el7.ppc64) [ 2736.063403] MSR: 8000000000009032 <SF,EE,ME,IR,DR,RI> CR: 28824242 XER: 20000000 [ 2736.063415] SOFTE: 0 [ 2736.063418] CFAR: c00000000000908c [ 2736.063421] DAR: 0000000000000000, DSISR: 40000000 [ 2736.063425] GPR00: c0000000006c40f8 c0000001319a3af0 c000000001074788 c0000001319a3bf0 GPR04: 0000000000000000 0000000000000000 0000000000000020 000000000000000a GPR08: fffffffe00000002 00000000ffff0000 0000000080000001 c000000000924888 GPR12: 0000000028824248 c000000007e00400 00001fffffa0f998 0000000000000000 GPR16: 0000000000000022 00001fffffa0f998 0000010022e92470 0000000000000000 GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 GPR24: 0000000000000000 c000000000f4a828 00003ffffe527108 0000000000000000 GPR28: c000000000f4a730 c000000000f4a828 0000000000000000 c0000001319a3bf0 [ 2736.063498] NIP [c00000000037efb0] .__list_add+0x30/0x110 [ 2736.063504] LR [c0000000006c40f8] .rwsem_down_write_failed+0x78/0x264 [ 2736.063508] PACATMSCRATCH [800000000280f032] [ 2736.063511] Call Trace: [ 2736.063516] [c0000001319a3af0] [c0000001319a3b80] 0xc0000001319a3b80 (unreliable) [ 2736.063523] [c0000001319a3b80] [c0000000006c40f8] .rwsem_down_write_failed+0x78/0x264 [ 2736.063530] [c0000001319a3c50] [c0000000006c1bb0] .down_write+0x70/0x78 [ 2736.063536] [c0000001319a3cd0] [c0000000002e5ffc] .keyctl_get_persistent+0x20c/0x320 [ 2736.063542] [c0000001319a3dc0] [c0000000002e2388] .SyS_keyctl+0x238/0x260 [ 2736.063548] [c0000001319a3e30] [c000000000009e7c] syscall_exit+0x0/0x7c [ 2736.063553] Instruction dump: [ 2736.063556] 7c0802a6 fba1ffe8 fbc1fff0 fbe1fff8 7cbd2b78 7c9e2378 7c7f1b78 f8010010 [ 2736.063566] f821ff71 e8a50008 7fa52040 40de00c0 <e8be0000> 7fbd2840 40de0094 7fbff040 [ 2736.063579] ---[ end trace 2708241785538296 ]--- It's caused by uninitialized persistent_keyring_register_sem. The bug was introduced by commit f36f8c7, two typos are in that commit: CONFIG_KEYS_KERBEROS_CACHE should be CONFIG_PERSISTENT_KEYRINGS and krb_cache_register_sem should be persistent_keyring_register_sem. Signed-off-by: Xiao Guangrong <[email protected]> Signed-off-by: David Howells <[email protected]>

…ow_temp_thresh() Since commit ec39f64 ("drm/radeon/dpm: Convert to use devm_hwmon_register_with_groups") radeon_hwmon_init() is using hwmon_device_register_with_groups(), which sets `rdev' as a device private driver_data, while hwmon_attributes_visible() and radeon_hwmon_show_temp_thresh() are still waiting for `drm_device'. Fix them by using dev_get_drvdata(), in order to avoid this oops: BUG: unable to handle kernel paging request at 0000000000001e28 IP: [<ffffffffa02ae8b4>] hwmon_attributes_visible+0x18/0x3d [radeon] PGD 15057e067 PUD 151a8e067 PMD 0 Oops: 0000 [#1] PREEMPT SMP Call Trace: internal_create_group+0x114/0x1d9 sysfs_create_group+0xe/0x10 sysfs_create_groups+0x22/0x5f device_add+0x34f/0x501 device_register+0x15/0x18 hwmon_device_register_with_groups+0xb5/0xed radeon_hwmon_init+0x56/0x7c [radeon] radeon_pm_init+0x134/0x7e5 [radeon] radeon_modeset_init+0x75f/0x8ed [radeon] radeon_driver_load_kms+0xc6/0x187 [radeon] drm_dev_register+0xf9/0x1b4 [drm] drm_get_pci_dev+0x98/0x129 [drm] radeon_pci_probe+0xa3/0xac [radeon] pci_device_probe+0x6e/0xcf driver_probe_device+0x98/0x1c4 __driver_attach+0x5c/0x7e bus_for_each_dev+0x7b/0x85 driver_attach+0x19/0x1b bus_add_driver+0x104/0x1ce driver_register+0x89/0xc5 __pci_register_driver+0x58/0x5b drm_pci_init+0x86/0xea [drm] radeon_init+0x97/0x1000 [radeon] do_one_initcall+0x7f/0x117 load_module+0x1583/0x1bb4 SyS_init_module+0xa0/0xaf Signed-off-by: Sergey Senozhatsky <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Alexander Deucher <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>

After the previous fix, there still has another ASSERT failure if turning off any type of quota while fsstress is running at the same time. Backtrace in this case: [ 50.867897] XFS: Assertion failed: XFS_IS_GQUOTA_ON(mp), file: fs/xfs/xfs_qm.c, line: 2118 [ 50.867924] ------------[ cut here ]------------ ... <snip> [ 50.867957] Kernel BUG at ffffffffa0b55a32 [verbose debug info unavailable] [ 50.867999] invalid opcode: 0000 [#1] SMP [ 50.869407] Call Trace: [ 50.869446] [<ffffffffa0bc408a>] xfs_qm_vop_create_dqattach+0x19a/0x2d0 [xfs] [ 50.869512] [<ffffffffa0b9cc45>] xfs_create+0x5c5/0x6a0 [xfs] [ 50.869564] [<ffffffffa0b5307c>] xfs_vn_mknod+0xac/0x1d0 [xfs] [ 50.869615] [<ffffffffa0b531d6>] xfs_vn_mkdir+0x16/0x20 [xfs] [ 50.869655] [<ffffffff811becd5>] vfs_mkdir+0x95/0x130 [ 50.869689] [<ffffffff811bf63a>] SyS_mkdirat+0xaa/0xe0 [ 50.869723] [<ffffffff811bf689>] SyS_mkdir+0x19/0x20 [ 50.869757] [<ffffffff8170f7dd>] system_call_fastpath+0x1a/0x1f [ 50.869793] Code: 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 <snip> [ 50.870003] RIP [<ffffffffa0b55a32>] assfail+0x22/0x30 [xfs] [ 50.870050] RSP <ffff88002941fd60> [ 50.879251] ---[ end trace c93a2b342341c65b ]--- We're hitting the ASSERT(XFS_IS_*QUOTA_ON(mp)) in xfs_qm_vop_create_dqattach(), however the assertion itself is not right IMHO. While performing quota off, we firstly clear the XFS_*QUOTA_ACTIVE bit(s) from struct xfs_mount without taking any special locks, see xfs_qm_scall_quotaoff(). Hence there is no guarantee that the desired quota is still active. Signed-off-by: Jie Liu <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Ben Myers <[email protected]> (cherry picked from commit 37eb970)

There's a possible deadlock if we flush the peers notifying work during setting mtu: [ 22.991149] ====================================================== [ 22.991173] [ INFO: possible circular locking dependency detected ] [ 22.991198] 3.10.0-54.0.1.el7.x86_64.debug #1 Not tainted [ 22.991219] ------------------------------------------------------- [ 22.991243] ip/974 is trying to acquire lock: [ 22.991261] ((&(&net_device_ctx->dwork)->work)){+.+.+.}, at: [<ffffffff8108af95>] flush_work+0x5/0x2e0 [ 22.991307] but task is already holding lock: [ 22.991330] (rtnl_mutex){+.+.+.}, at: [<ffffffff81539deb>] rtnetlink_rcv+0x1b/0x40 [ 22.991367] which lock already depends on the new lock. [ 22.991398] the existing dependency chain (in reverse order) is: [ 22.991426] -> #1 (rtnl_mutex){+.+.+.}: [ 22.991449] [<ffffffff810dfdd9>] __lock_acquire+0xb19/0x1260 [ 22.991477] [<ffffffff810e0d12>] lock_acquire+0xa2/0x1f0 [ 22.991501] [<ffffffff81673659>] mutex_lock_nested+0x89/0x4f0 [ 22.991529] [<ffffffff815392b7>] rtnl_lock+0x17/0x20 [ 22.991552] [<ffffffff815230b2>] netdev_notify_peers+0x12/0x30 [ 22.991579] [<ffffffffa0340212>] netvsc_send_garp+0x22/0x30 [hv_netvsc] [ 22.991610] [<ffffffff8108d251>] process_one_work+0x211/0x6e0 [ 22.991637] [<ffffffff8108d83b>] worker_thread+0x11b/0x3a0 [ 22.991663] [<ffffffff81095e5d>] kthread+0xed/0x100 [ 22.991686] [<ffffffff81681c6c>] ret_from_fork+0x7c/0xb0 [ 22.991715] -> #0 ((&(&net_device_ctx->dwork)->work)){+.+.+.}: [ 22.991715] [<ffffffff810de817>] check_prevs_add+0x967/0x970 [ 22.991715] [<ffffffff810dfdd9>] __lock_acquire+0xb19/0x1260 [ 22.991715] [<ffffffff810e0d12>] lock_acquire+0xa2/0x1f0 [ 22.991715] [<ffffffff8108afde>] flush_work+0x4e/0x2e0 [ 22.991715] [<ffffffff8108e1b5>] __cancel_work_timer+0x95/0x130 [ 22.991715] [<ffffffff8108e303>] cancel_delayed_work_sync+0x13/0x20 [ 22.991715] [<ffffffffa03404e4>] netvsc_change_mtu+0x84/0x200 [hv_netvsc] [ 22.991715] [<ffffffff815233d4>] dev_set_mtu+0x34/0x80 [ 22.991715] [<ffffffff8153bc2a>] do_setlink+0x23a/0xa00 [ 22.991715] [<ffffffff8153d054>] rtnl_newlink+0x394/0x5e0 [ 22.991715] [<ffffffff81539eac>] rtnetlink_rcv_msg+0x9c/0x260 [ 22.991715] [<ffffffff8155cdd9>] netlink_rcv_skb+0xa9/0xc0 [ 22.991715] [<ffffffff81539dfa>] rtnetlink_rcv+0x2a/0x40 [ 22.991715] [<ffffffff8155c41d>] netlink_unicast+0xdd/0x190 [ 22.991715] [<ffffffff8155c807>] netlink_sendmsg+0x337/0x750 [ 22.991715] [<ffffffff8150d219>] sock_sendmsg+0x99/0xd0 [ 22.991715] [<ffffffff8150d63e>] ___sys_sendmsg+0x39e/0x3b0 [ 22.991715] [<ffffffff8150eba2>] __sys_sendmsg+0x42/0x80 [ 22.991715] [<ffffffff8150ebf2>] SyS_sendmsg+0x12/0x20 [ 22.991715] [<ffffffff81681d19>] system_call_fastpath+0x16/0x1b This is because we hold the rtnl_lock() before ndo_change_mtu() and try to flush the work in netvsc_change_mtu(), in the mean time, netdev_notify_peers() may be called from worker and also trying to hold the rtnl_lock. This will lead the flush won't succeed forever. Solve this by not canceling and flushing the work, this is safe because the transmission done by NETDEV_NOTIFY_PEERS was synchronized with the netif_tx_disable() called by netvsc_change_mtu(). Reported-by: Yaju Cao <[email protected]> Tested-by: Yaju Cao <[email protected]> Cc: K. Y. Srinivasan <[email protected]> Cc: Haiyang Zhang <[email protected]> Signed-off-by: Jason Wang <[email protected]> Acked-by: Haiyang Zhang <[email protected]> Signed-off-by: David S. Miller <[email protected]>

I see the following splat with 3.13-rc1 when attempting to perform DMA: [ 253.004516] Alignment trap: not handling instruction e1902f9f at [<c0204b40>] [ 253.004583] Unhandled fault: alignment exception (0x221) at 0xdfdfdfd7 [ 253.004646] Internal error: : 221 [#1] PREEMPT SMP ARM [ 253.004691] Modules linked in: dmatest(+) [last unloaded: dmatest] [ 253.004798] CPU: 0 PID: 671 Comm: kthreadd Not tainted 3.13.0-rc1+ #2 [ 253.004864] task: df9b0900 ti: df03e000 task.ti: df03e000 [ 253.004937] PC is at dmaengine_unmap_put+0x14/0x34 [ 253.005010] LR is at pl330_tasklet+0x3c8/0x550 [ 253.005087] pc : [<c0204b44>] lr : [<c0207478>] psr: a00e0193 [ 253.005087] sp : df03fe48 ip : 00000000 fp : df03bf18 [ 253.005178] r10: bf00e108 r9 : 00000001 r8 : 00000000 [ 253.005245] r7 : df837040 r6 : dfb41800 r5 : df837048 r4 : df837000 [ 253.005316] r3 : dfdfdfcf r2 : dfb41f80 r1 : df837048 r0 : dfdfdfd7 [ 253.005384] Flags: NzCv IRQs off FIQs on Mode SVC_32 ISA ARM Segment kernel [ 253.005459] Control: 30c5387d Table: 9fb9ba80 DAC: fffffffd [ 253.005520] Process kthreadd (pid: 671, stack limit = 0xdf03e248) This is due to desc->txd.unmap containing garbage (uninitialised memory). Rather than add another dummy initialisation to _init_desc, instead ensure that the descriptors are zero-initialised during allocation and remove the dummy, per-field initialisation. Cc: Andriy Shevchenko <[email protected]> Acked-by: Jassi Brar <[email protected]> Signed-off-by: Will Deacon <[email protected]> Acked-by: Vinod Koul <[email protected]> Signed-off-by: Dan Williams <[email protected]>

…ccessfully After a successful hugetlb page migration by soft offline, the source page will either be freed into hugepage_freelists or buddy(over-commit page). If page is in buddy, page_hstate(page) will be NULL. It will hit a NULL pointer dereference in dequeue_hwpoisoned_huge_page(). BUG: unable to handle kernel NULL pointer dereference at 0000000000000058 IP: [<ffffffff81163761>] dequeue_hwpoisoned_huge_page+0x131/0x1d0 PGD c23762067 PUD c24be2067 PMD 0 Oops: 0000 [#1] SMP So check PageHuge(page) after call migrate_pages() successfully. Signed-off-by: Jianguo Wu <[email protected]> Tested-by: Naoya Horiguchi <[email protected]> Reviewed-by: Naoya Horiguchi <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>

BUG_ON(!vma) assumption is introduced by commit 0bf598d ("mbind: add BUG_ON(!vma) in new_vma_page()"), however, even if address = __vma_address(page, vma); and vma->start < address < vma->end page_address_in_vma() may still return -EFAULT because of many other conditions in it. As a result the while loop in new_vma_page() may end with vma=NULL. This patch revert the commit and also fix the potential dereference NULL pointer reported by Dan. http://marc.info/?l=linux-mm&m=137689530323257&w=2 kernel BUG at mm/mempolicy.c:1204! invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC CPU: 3 PID: 7056 Comm: trinity-child3 Not tainted 3.13.0-rc3+ #2 task: ffff8801ca5295d0 ti: ffff88005ab20000 task.ti: ffff88005ab20000 RIP: new_vma_page+0x70/0x90 RSP: 0000:ffff88005ab21db0 EFLAGS: 00010246 RAX: fffffffffffffff2 RBX: 0000000000000000 RCX: 0000000000000000 RDX: 0000000008040075 RSI: ffff8801c3d74600 RDI: ffffea00079a8b80 RBP: ffff88005ab21dc8 R08: 0000000000000004 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: fffffffffffffff2 R13: ffffea00079a8b80 R14: 0000000000400000 R15: 0000000000400000 FS: 00007ff49c6f4740(0000) GS:ffff880244e00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007ff49c68f994 CR3: 000000005a205000 CR4: 00000000001407e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Stack: ffffea00079a8b80 ffffea00079a8bc0 ffffea00079a8ba0 ffff88005ab21e50 ffffffff811adc7a 0000000000000000 ffff8801ca5295d0 0000000464e224f8 0000000000000000 0000000000000002 0000000000000000 ffff88020ce75c00 Call Trace: migrate_pages+0x12a/0x850 SYSC_mbind+0x513/0x6a0 SyS_mbind+0xe/0x10 ia32_do_call+0x13/0x13 Code: 85 c0 75 2f 4c 89 e1 48 89 da 31 f6 bf da 00 02 00 65 44 8b 04 25 08 f7 1c 00 e8 ec fd ff ff 5b 41 5c 41 5d 5d c3 0f 1f 44 00 00 <0f> 0b 66 0f 1f 44 00 00 4c 89 e6 48 89 df ba 01 00 00 00 e8 48 RIP [<ffffffff8119f200>] new_vma_page+0x70/0x90 RSP <ffff88005ab21db0> Signed-off-by: Wanpeng Li <[email protected]> Reported-by: Dave Jones <[email protected]> Reported-by: Sasha Levin <[email protected]> Reviewed-by: Naoya Horiguchi <[email protected]> Reviewed-by: Bob Liu <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>

When having nf_conntrack_timestamp enabled deleting a netns can lead to the following BUG being triggered: [63836.660000] Kernel bug detected[#1]: [63836.660000] CPU: 0 PID: 0 Comm: swapper Not tainted 3.10.18 torvalds#14 [63836.660000] task: 802d9420 ti: 802d2000 task.ti: 802d2000 [63836.660000] $ 0 : 00000000 00000000 00000000 00000000 [63836.660000] $ 4 : 00000001 00000004 00000020 00000020 [63836.660000] $ 8 : 00000000 80064910 00000000 00000000 [63836.660000] $12 : 0bff0002 00000001 00000000 0a0a0abe [63836.660000] $16 : 802e70a0 85f29d80 00000000 00000004 [63836.660000] $20 : 85fb62a 00000002 802d3bc0 85fb62a [63836.660000] $24 : 00000000 87138110 [63836.660000] $28 : 802d2000 802d3b40 00000014 871327cc [63836.660000] Hi : 000005ff [63836.660000] Lo : f2edd000 [63836.660000] epc : 87138794 __nf_ct_ext_add_length+0xe8/0x1ec [nf_conntrack] [63836.660000] Not tainted [63836.660000] ra : 871327cc nf_conntrack_in+0x31c/0x7b8 [nf_conntrack] [63836.660000] Status: 1100d403 KERNEL EXL IE [63836.660000] Cause : 00800034 [63836.660000] PrId : 0001974c (MIPS 74Kc) [63836.660000] Modules linked in: ath9k ath9k_common pppoe ppp_async iptable_nat ath9k_hw ath pppox ppp_generic nf_nat_ipv4 nf_conntrack_ipv4 mac80211 ipt_MASQUERADE cfg80211 xt_time xt_tcpudp xt_state xt_quota xt_policy xt_pkttype xt_owner xt_nat xt_multiport xt_mark xh [63836.660000] Process swapper (pid: 0, threadinfo=802d2000, task=802d9420, tls=00000000) [63836.660000] Stack : 802e70a0 871323d4 00000005 87080234 802e70a0 86d2a840 00000000 00000000 [63836.660000] Call Trace: [63836.660000] [<87138794>] __nf_ct_ext_add_length+0xe8/0x1ec [nf_conntrack] [63836.660000] [<871327cc>] nf_conntrack_in+0x31c/0x7b8 [nf_conntrack] [63836.660000] [<801ff63c>] nf_iterate+0x90/0xec [63836.660000] [<801ff730>] nf_hook_slow+0x98/0x164 [63836.660000] [<80205968>] ip_rcv+0x3e8/0x40c [63836.660000] [<801d9754>] __netif_receive_skb_core+0x624/0x6a4 [63836.660000] [<801da124>] process_backlog+0xa4/0x16c [63836.660000] [<801d9bb4>] net_rx_action+0x10c/0x1e0 [63836.660000] [<8007c5a4>] __do_softirq+0xd0/0x1bc [63836.660000] [<8007c730>] do_softirq+0x48/0x68 [63836.660000] [<8007c964>] irq_exit+0x54/0x70 [63836.660000] [<80060830>] ret_from_irq+0x0/0x4 [63836.660000] [<8006a9f8>] r4k_wait_irqoff+0x18/0x1c [63836.660000] [<8009cfb8>] cpu_startup_entry+0xa4/0x104 [63836.660000] [<802eb918>] start_kernel+0x394/0x3ac [63836.660000] [63836.660000] Code: 0082102 8c420000 2c440001 <00040336> 90440011 92350010 90560010 2485ffff 02a5a821 [63837.040000] ---[ end trace ebf660c3ce3b55e7 ]--- [63837.050000] Kernel panic - not syncing: Fatal exception in interrupt [63837.050000] Rebooting in 3 seconds.. Fix this by not unregistering the conntrack extension in the per-netns cleanup code. This bug was introduced in (73f4001 netfilter: nf_ct_tstamp: move initialization out of pernet_operations). Signed-off-by: Helmut Schaa <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>

Binding might result in a NULL device, which is dereferenced causing this BUG: [ 1317.260548] BUG: unable to handle kernel NULL pointer dereference at 000000000000097 4 [ 1317.261847] IP: [<ffffffff84225f52>] rds_ib_laddr_check+0x82/0x110 [ 1317.263315] PGD 418bcb067 PUD 3ceb21067 PMD 0 [ 1317.263502] Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC [ 1317.264179] Dumping ftrace buffer: [ 1317.264774] (ftrace buffer empty) [ 1317.265220] Modules linked in: [ 1317.265824] CPU: 4 PID: 836 Comm: trinity-child46 Tainted: G W 3.13.0-rc4- next-20131218-sasha-00013-g2cebb9b-dirty #4159 [ 1317.267415] task: ffff8803ddf33000 ti: ffff8803cd31a000 task.ti: ffff8803cd31a000 [ 1317.268399] RIP: 0010:[<ffffffff84225f52>] [<ffffffff84225f52>] rds_ib_laddr_check+ 0x82/0x110 [ 1317.269670] RSP: 0000:ffff8803cd31bdf8 EFLAGS: 00010246 [ 1317.270230] RAX: 0000000000000000 RBX: ffff88020b0dd388 RCX: 0000000000000000 [ 1317.270230] RDX: ffffffff8439822e RSI: 00000000000c000a RDI: 0000000000000286 [ 1317.270230] RBP: ffff8803cd31be38 R08: 0000000000000000 R09: 0000000000000000 [ 1317.270230] R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000 [ 1317.270230] R13: 0000000054086700 R14: 0000000000a25de0 R15: 0000000000000031 [ 1317.270230] FS: 00007ff40251d700(0000) GS:ffff88022e200000(0000) knlGS:000000000000 0000 [ 1317.270230] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [ 1317.270230] CR2: 0000000000000974 CR3: 00000003cd478000 CR4: 00000000000006e0 [ 1317.270230] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 1317.270230] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000090602 [ 1317.270230] Stack: [ 1317.270230] 0000000054086700 5408670000a25de0 5408670000000002 0000000000000000 [ 1317.270230] ffffffff84223542 00000000ea54c767 0000000000000000 ffffffff86d26160 [ 1317.270230] ffff8803cd31be68 ffffffff84223556 ffff8803cd31beb8 ffff8800c6765280 [ 1317.270230] Call Trace: [ 1317.270230] [<ffffffff84223542>] ? rds_trans_get_preferred+0x42/0xa0 [ 1317.270230] [<ffffffff84223556>] rds_trans_get_preferred+0x56/0xa0 [ 1317.270230] [<ffffffff8421c9c3>] rds_bind+0x73/0xf0 [ 1317.270230] [<ffffffff83e4ce62>] SYSC_bind+0x92/0xf0 [ 1317.270230] [<ffffffff812493f8>] ? context_tracking_user_exit+0xb8/0x1d0 [ 1317.270230] [<ffffffff8119313d>] ? trace_hardirqs_on+0xd/0x10 [ 1317.270230] [<ffffffff8107a852>] ? syscall_trace_enter+0x32/0x290 [ 1317.270230] [<ffffffff83e4cece>] SyS_bind+0xe/0x10 [ 1317.270230] [<ffffffff843a6ad0>] tracesys+0xdd/0xe2 [ 1317.270230] Code: 00 8b 45 cc 48 8d 75 d0 48 c7 45 d8 00 00 00 00 66 c7 45 d0 02 00 89 45 d4 48 89 df e8 78 49 76 ff 41 89 c4 85 c0 75 0c 48 8b 03 <80> b8 74 09 00 00 01 7 4 06 41 bc 9d ff ff ff f6 05 2a b6 c2 02 [ 1317.270230] RIP [<ffffffff84225f52>] rds_ib_laddr_check+0x82/0x110 [ 1317.270230] RSP <ffff8803cd31bdf8> [ 1317.270230] CR2: 0000000000000974 Signed-off-by: Sasha Levin <[email protected]> Signed-off-by: David S. Miller <[email protected]>

This patch fixes a crash while trying to deactivate a table that contains user chains. You can reproduce it via: % nft add table table1 % nft add chain table1 chain1 % nft-table-upd ip table1 dormant [ 253.021026] BUG: unable to handle kernel NULL pointer dereference at 0000000000000030 [ 253.021114] IP: [<ffffffff8134cebd>] nf_register_hook+0x35/0x6f [ 253.021167] PGD 30fa5067 PUD 30fa2067 PMD 0 [ 253.021208] Oops: 0000 [#1] SMP [...] [ 253.023305] Call Trace: [ 253.023331] [<ffffffffa0885020>] nf_tables_newtable+0x11c/0x258 [nf_tables] [ 253.023385] [<ffffffffa0878592>] nfnetlink_rcv_msg+0x1f4/0x226 [nfnetlink] [ 253.023438] [<ffffffffa0878418>] ? nfnetlink_rcv_msg+0x7a/0x226 [nfnetlink] [ 253.023491] [<ffffffffa087839e>] ? nfnetlink_bind+0x45/0x45 [nfnetlink] [ 253.023542] [<ffffffff8134b47e>] netlink_rcv_skb+0x3c/0x88 [ 253.023586] [<ffffffffa0878973>] nfnetlink_rcv+0x3af/0x3e4 [nfnetlink] [ 253.023638] [<ffffffff813fb0d4>] ? _raw_read_unlock+0x22/0x34 [ 253.023683] [<ffffffff8134af17>] netlink_unicast+0xe2/0x161 [ 253.023727] [<ffffffff8134b29a>] netlink_sendmsg+0x304/0x332 [ 253.023773] [<ffffffff8130d250>] __sock_sendmsg_nosec+0x25/0x27 [ 253.023820] [<ffffffff8130fb93>] sock_sendmsg+0x5a/0x7b [ 253.023861] [<ffffffff8130d5d5>] ? copy_from_user+0x2a/0x2c [ 253.023905] [<ffffffff8131066f>] ? move_addr_to_kernel+0x35/0x60 [ 253.023952] [<ffffffff813107b3>] SYSC_sendto+0x119/0x15c [ 253.023995] [<ffffffff81401107>] ? sysret_check+0x1b/0x56 [ 253.024039] [<ffffffff8108dc30>] ? trace_hardirqs_on_caller+0x140/0x1db [ 253.024090] [<ffffffff8120164e>] ? trace_hardirqs_on_thunk+0x3a/0x3f [ 253.024141] [<ffffffff81310caf>] SyS_sendto+0x9/0xb [ 253.026219] [<ffffffff814010e2>] system_call_fastpath+0x16/0x1b Reported-by: Alex Wei <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>

If pstate.current_pstate is 0 after the initial intel_pstate_get_cpu_pstates(), this means that we were unable to obtain any useful P-state information and there is no reason to continue, so free memory and return an error in that case. This fixes the following divide error occuring in a nested KVM guest: Intel P-state driver initializing. Intel pstate controlling: cpu 0 cpufreq: __cpufreq_add_dev: ->get() failed divide error: 0000 [#1] SMP Modules linked in: CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.13.0-0.rc4.git5.1.fc21.x86_64 #1 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 task: ffff88001ea20000 ti: ffff88001e9bc000 task.ti: ffff88001e9bc000 RIP: 0010:[<ffffffff815c551d>] [<ffffffff815c551d>] intel_pstate_timer_func+0x11d/0x2b0 RSP: 0000:ffff88001ee03e18 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff88001a454348 RCX: 0000000000006100 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 RBP: ffff88001ee03e38 R08: 0000000000000000 R09: 0000000000000000 R10: ffff88001ea20000 R11: 0000000000000000 R12: 00000c0a1ea20000 R13: 1ea200001ea20000 R14: ffffffff815c5400 R15: ffff88001a454348 FS: 0000000000000000(0000) GS:ffff88001ee00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 0000000000000000 CR3: 0000000001c0c000 CR4: 00000000000006f0 Stack: fffffffb1a454390 ffffffff821a4500 ffff88001a454390 0000000000000100 ffff88001ee03ea8 ffffffff81083e9a ffffffff81083e15 ffffffff82d5ed40 ffffffff8258cc60 0000000000000000 ffffffff81ac39de 0000000000000000 Call Trace: <IRQ> [<ffffffff81083e9a>] call_timer_fn+0x8a/0x310 [<ffffffff81083e15>] ? call_timer_fn+0x5/0x310 [<ffffffff815c5400>] ? pid_param_set+0x130/0x130 [<ffffffff81084354>] run_timer_softirq+0x234/0x380 [<ffffffff8107aee4>] __do_softirq+0x104/0x430 [<ffffffff8107b5fd>] irq_exit+0xcd/0xe0 [<ffffffff81770645>] smp_apic_timer_interrupt+0x45/0x60 [<ffffffff8176efb2>] apic_timer_interrupt+0x72/0x80 <EOI> [<ffffffff810e15cd>] ? vprintk_emit+0x1dd/0x5e0 [<ffffffff81757719>] printk+0x67/0x69 [<ffffffff815c1493>] __cpufreq_add_dev.isra.13+0x883/0x8d0 [<ffffffff815c14f0>] cpufreq_add_dev+0x10/0x20 [<ffffffff814a14d1>] subsys_interface_register+0xb1/0xf0 [<ffffffff815bf5cf>] cpufreq_register_driver+0x9f/0x210 [<ffffffff81fb19af>] intel_pstate_init+0x27d/0x3be [<ffffffff81761e3e>] ? mutex_unlock+0xe/0x10 [<ffffffff81fb1732>] ? cpufreq_gov_dbs_init+0x12/0x12 [<ffffffff8100214a>] do_one_initcall+0xfa/0x1b0 [<ffffffff8109dbf5>] ? parse_args+0x225/0x3f0 [<ffffffff81f64193>] kernel_init_freeable+0x1fc/0x287 [<ffffffff81f638d0>] ? do_early_param+0x88/0x88 [<ffffffff8174b530>] ? rest_init+0x150/0x150 [<ffffffff8174b53e>] kernel_init+0xe/0x130 [<ffffffff8176e27c>] ret_from_fork+0x7c/0xb0 [<ffffffff8174b530>] ? rest_init+0x150/0x150 Code: c1 e0 05 48 63 bc 03 10 01 00 00 48 63 83 d0 00 00 00 48 63 d6 48 c1 e2 08 c1 e1 08 4c 63 c2 48 c1 e0 08 48 98 48 c1 e0 08 48 99 <49> f7 f8 48 98 48 0f af f8 48 c1 ff 08 29 f9 89 ca c1 fa 1f 89 RIP [<ffffffff815c551d>] intel_pstate_timer_func+0x11d/0x2b0 RSP <ffff88001ee03e18> ---[ end trace f166110ed22cc37a ]--- Kernel panic - not syncing: Fatal exception in interrupt Reported-and-tested-by: Kashyap Chamarthy <[email protected]> Cc: Josh Boyer <[email protected]> Signed-off-by: Rafael J. Wysocki <[email protected]> Cc: All applicable <[email protected]>

…after split thp Memory failures on thp tail pages cause kernel panic like below: mce: [Hardware Error]: Machine check events logged MCE exception done on CPU 7 BUG: unable to handle kernel NULL pointer dereference at 0000000000000058 IP: [<ffffffff811b7cd1>] dequeue_hwpoisoned_huge_page+0x131/0x1e0 PGD bae42067 PUD ba47d067 PMD 0 Oops: 0000 [#1] SMP ... CPU: 7 PID: 128 Comm: kworker/7:2 Tainted: G M O 3.13.0-rc4-131217-1558-00003-g83b7df08e462 torvalds#25 ... Call Trace: me_huge_page+0x3e/0x50 memory_failure+0x4bb/0xc20 mce_process_work+0x3e/0x70 process_one_work+0x171/0x420 worker_thread+0x11b/0x3a0 ? manage_workers.isra.25+0x2b0/0x2b0 kthread+0xe4/0x100 ? kthread_create_on_node+0x190/0x190 ret_from_fork+0x7c/0xb0 ? kthread_create_on_node+0x190/0x190 ... RIP dequeue_hwpoisoned_huge_page+0x131/0x1e0 CR2: 0000000000000058 The reasoning of this problem is shown below: - when we have a memory error on a thp tail page, the memory error handler grabs a refcount of the head page to keep the thp under us. - Before unmapping the error page from processes, we split the thp, where page refcounts of both of head/tail pages don't change. - Then we call try_to_unmap() over the error page (which was a tail page before). We didn't pin the error page to handle the memory error, this error page is freed and removed from LRU list. - We never have the error page on LRU list, so the first page state check returns "unknown page," then we move to the second check with the saved page flag. - The saved page flag have PG_tail set, so the second page state check returns "hugepage." - We call me_huge_page() for freed error page, then we hit the above panic. The root cause is that we didn't move refcount from the head page to the tail page after split thp. So this patch suggests to do this. This panic was introduced by commit 524fca1 ("HWPOISON: fix misjudgement of page_action() for errors on mlocked pages"). Note that we did have the same refcount problem before this commit, but it was just ignored because we had only first page state check which returned "unknown page." The commit changed the refcount problem from "doesn't work" to "kernel panic." Signed-off-by: Naoya Horiguchi <[email protected]> Reviewed-by: Wanpeng Li <[email protected]> Cc: Andi Kleen <[email protected]> Cc: <[email protected]> [3.9+] Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

perf menuconfig #1

perf menuconfig #1

mitake commented Jun 30, 2012

perf menuconfig #1

perf menuconfig #1

Comments

mitake commented Jun 30, 2012