Possible recursive locking #61

Closed
keshonok opened this issue Feb 24, 2015 · 2 comments

@keshonok
Contributor

This message is issued by the kernel when an SKB is moved from the receive queue to the send queue as we do in the Synchronous Sockets paradigm.

[   16.946924] =============================================
[   16.946924] [ INFO: possible recursive locking detected ]
[   16.946924] 3.10.10-tempesta #8 Tainted: G           O
[   16.946924] ---------------------------------------------
[   16.946924] swapper/0/0 is trying to acquire lock:
[   16.946924]  (slock-AF_INET/1){+.-...}, at: [<ffffffffa00002a8>] ss_send+0x28/0x160 [sync_socket]
[   16.946924] 
[   16.946924] but task is already holding lock:
[   16.946924]  (slock-AF_INET/1){+.-...}, at: [<ffffffff813433bc>] tcp_v4_rcv+0x40c/0x8b0
[   16.946924] 
[   16.946924] other info that might help us debug this:
[   16.946924]  Possible unsafe locking scenario:
[   16.946924] 
[   16.946924]        CPU0
[   16.946924]        ----
[   16.946924]   lock(slock-AF_INET/1);
[   16.946924]   lock(slock-AF_INET/1);
[   16.946924] 
[   16.946924]  *** DEADLOCK ***
[   16.946924] 
[   16.946924]  May be due to missing lock nesting notation
[   16.946924] 
[   16.946924] 3 locks held by swapper/0/0:
[   16.946924]  #0:  (rcu_read_lock){.+.+..}, at: [<ffffffff812e6463>] __netif_receive_skb_core+0x63/0x590
[   16.946924]  #1:  (rcu_read_lock){.+.+..}, at: [<ffffffff8131d750>] ip_local_deliver_finish+0x40/0x150
[   16.946924]  #2:  (slock-AF_INET/1){+.-...}, at: [<ffffffff813433bc>] tcp_v4_rcv+0x40c/0x8b0
[   16.946924] 
[   16.946924] stack backtrace:
[   16.946924] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G           O 3.10.10-tempesta #8
[   16.946924]  ffffffff8155d440 ffff88003fc03828 ffffffff813bf35d ffff88003fc03898
[   16.946924]  ffffffff81073b96 ffffffff81072787 0000000000000007 00000000ffffffff
[   16.946924]  ffff880000000000 00000b005849e24f ffffffff00000000 ffffffff816f6640
[   16.946924] Call Trace:
[   16.946924]  <IRQ>  [<ffffffff813bf35d>] dump_stack+0x19/0x1b
[   16.946924]  [<ffffffff81073b96>] __lock_acquire+0xfa6/0x1240
[   16.946924]  [<ffffffff81072787>] ? trace_hardirqs_on_caller+0x1d7/0x210
[   16.946924]  [<ffffffff81074658>] lock_acquire+0x98/0xd0
[   16.946924]  [<ffffffffa00002a8>] ? ss_send+0x28/0x160 [sync_socket]
[   16.946924]  [<ffffffff813c30ca>] _raw_spin_lock_nested+0x3a/0x50
[   16.946924]  [<ffffffffa00002a8>] ? ss_send+0x28/0x160 [sync_socket]
[   16.946924]  [<ffffffffa00002a8>] ss_send+0x28/0x160 [sync_socket]
[   16.946924]  [<ffffffffa0049413>] ? tfw_sched_get_srv+0x49/0x4b [tempesta_fw]
[   16.946924]  [<ffffffffa0011c8d>] tfw_connection_send_srv+0xbf/0xc6 [tempesta_fw]
[   16.946924]  [<ffffffffa00153a2>] tfw_http_req_cache_cb+0x5d/0x5f [tempesta_fw]
[   16.946924]  [<ffffffffa000ddec>] tfw_cache_req_process+0x39/0x163 [tempesta_fw]
[   16.946924]  [<ffffffffa0015345>] ? tfw_http_adjust_resp+0x13/0x13 [tempesta_fw]
[   16.946924]  [<ffffffffa0015660>] tfw_http_req_process+0x2bc/0x32b [tempesta_fw]
[   16.946924]  [<ffffffffa0015a3b>] tfw_http_msg_process+0x43/0x5e [tempesta_fw]
[   16.946924]  [<ffffffffa0012d67>] tfw_gfsm_dispatch+0x41/0x43 [tempesta_fw]
[   16.946924]  [<ffffffffa0011e4e>] tfw_connection_recv+0x88/0x8a [tempesta_fw]
[   16.946924]  [<ffffffffa000005b>] ss_tcp_process_proto_skb+0x5b/0x90 [sync_socket]
[   16.946924]  [<ffffffffa0000167>] ss_tcp_process_skb+0xd7/0x170 [sync_socket]
[   16.946924]  [<ffffffffa0000eae>] ss_tcp_data_ready+0x10e/0x25e [sync_socket]
[   16.946924]  [<ffffffff8133425d>] tcp_data_queue+0x4dd/0xd10
[   16.946924]  [<ffffffff81337660>] tcp_rcv_established+0x560/0x620
[   16.946924]  [<ffffffff8134183c>] tcp_v4_do_rcv+0x1dc/0x4e0
[   16.946924]  [<ffffffff811eea42>] ? do_raw_spin_lock+0x82/0x140
[   16.946924]  [<ffffffff813433e7>] tcp_v4_rcv+0x437/0x8b0
[   16.946924]  [<ffffffff8131d7c5>] ip_local_deliver_finish+0xb5/0x150
[   16.946924]  [<ffffffff8131d750>] ? ip_local_deliver_finish+0x40/0x150
[   16.946924]  [<ffffffff8131df70>] ip_local_deliver+0x50/0x90
[   16.946924]  [<ffffffff8131dcde>] ip_rcv_finish+0x47e/0x580
[   16.946924]  [<ffffffff8131e26c>] ip_rcv+0x2bc/0x340
[   16.946924]  [<ffffffff812e6869>] __netif_receive_skb_core+0x469/0x590
[   16.946924]  [<ffffffff812e6463>] ? __netif_receive_skb_core+0x63/0x590
[   16.946924]  [<ffffffff812e69df>] __netif_receive_skb+0x4f/0x70
[   16.946924]  [<ffffffff812e7543>] netif_receive_skb+0xb3/0x140
[   16.946924]  [<ffffffff812d9be9>] ? __netdev_alloc_skb+0x49/0xb0
[   16.946924]  [<ffffffff81270fd4>] virtnet_poll+0x624/0x770
[   16.946924]  [<ffffffff812e8783>] net_rx_action+0xa3/0x1f0
[   16.946924]  [<ffffffff81036550>] __do_softirq+0xe0/0x1d0
[   16.946924]  [<ffffffff810367c9>] irq_exit+0x69/0xc0
[   16.946924]  [<ffffffff81003c5d>] do_IRQ+0x9d/0xc0
[   16.946924]  [<ffffffff813c3df2>] common_interrupt+0x72/0x72
[   16.946924]  <EOI>  [<ffffffff81023626>] ? native_safe_halt+0x6/0x10
[   16.946924]  [<ffffffff8100a8dd>] default_idle+0x1d/0x30
[   16.946924]  [<ffffffff8100b26a>] arch_cpu_idle+0xa/0x10
[   16.946924]  [<ffffffff810664ac>] cpu_startup_entry+0x15c/0x200
[   16.946924]  [<ffffffff813b79ad>] rest_init+0xcd/0xe0
[   16.946924]  [<ffffffff813b78e0>] ? csum_partial_copy_generic+0x170/0x170
[   16.946924]  [<ffffffff815d1da0>] start_kernel+0x34d/0x358
[   16.946924]  [<ffffffff815d185c>] ? repair_env_string+0x5e/0x5e
[   16.946924]  [<ffffffff815d1120>] ? early_idt_handlers+0x120/0x120
[   16.946924]  [<ffffffff815d155b>] x86_64_start_reservations+0x2a/0x2c
[   16.946924]  [<ffffffff815d164c>] x86_64_start_kernel+0xef/0xfc
@keshonok keshonok added the bug label Feb 24, 2015
@keshonok keshonok self-assigned this Feb 24, 2015
@keshonok
Contributor Author

keshonok commented Mar 1, 2015

My first thought after researching the problem was that this is a case of a lockdep false positive. However, this in fact appears to be a pretty serious problem within the present architecture.

The synchronous sockets code is modeled after the Linux socket code. However, in the Linux stack the sending and receiving paths are decoupled from each other, while in the sync sockets code they are not: there is no decoupling between the receiving and the sending code, and data is passed from receiving to sending in one go. First the receiving socket is locked, and then the lock on the sending socket is taken as well, without releasing the first one. The problem is that data flows both ways at the same time, and the same locks are taken on the reverse path in reverse order. The lockdep warning is correct: we have a classic case of a deadlock.
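
As a rough illustration of that pattern (this is not the actual ss_send() code; ss_forward_skb() is a hypothetical name, and only the bh_lock_sock_nested()/bh_unlock_sock() kernel primitives are real), the nesting looks like this:

/*
 * Sketch of the nested locking described above.  The receive softirq
 * already holds the receiving socket's slock (taken in tcp_v4_rcv())
 * when the SS callback runs; the send side then grabs the peer
 * socket's slock without releasing the first one.  Both locks belong
 * to the same lockdep class, slock-AF_INET, hence the report above.
 */
#include <net/sock.h>
#include <linux/skbuff.h>

static void ss_forward_skb(struct sock *peer_sk, struct sk_buff *skb)
{
    bh_lock_sock_nested(peer_sk);   /* second slock-AF_INET lock */
    /* ... entail skb on peer_sk's send queue and push it out ... */
    bh_unlock_sock(peer_sk);
}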

@vdmit11
Contributor

vdmit11 commented Mar 7, 2015

Recently, @keshonok and I had a discussion about this issue.

We came to an agreement that this is actually a serious issue.

Indeed, we have a deadlock here.
Here is the description:

Setup:

  • A typical multi-CPU environment.
  • A typical reverse-proxy situation: a client is connected to a back-end HTTP server via the Tempesta FW.
  • There are two sockets opened in the running Tempesta FW instance: cli_sk and srv_sk.
  • The network traffic is processed on different CPUs concurrently.
  • HTTP pipelining is enabled.

Because HTTP pipelining is enabled, the server and the client may send HTTP messages to each other simultaneously.
And these messages are handled on different CPUs.
So here we have two simultaneous data transfers on different CPUs:

  • cli_sk -> srv_sk
  • srv_sk -> cli_sk

The problem is that both sockets have to be locked to make a transfer.
And that is done in reverse order on different CPUs, so we have a deadlock.

Here is the scenario:

CPU1:

  1. linux: lock(cli_sk)
  2. linux: call cli_sk->sk_data_ready()
  3. tempesta: lock(srv_sk)
  4. tempesta: skb_entail(srv_sk, skb)
  5. tempesta: unlock(srv_sk)
  6. linux: unlock(cli_sk)

CPU2:

  1. linux: lock(srv_sk)
  2. linux: call srv_sk->sk_data_ready()
  3. tempesta: lock(cli_sk)
  4. tempesta: skb_entail(cli_sk, skb)
  5. tempesta: unlock(cli_sk)
  6. linux: unlock(srv_sk)

Here we have a classic deadlock case.
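
In other words, it is the textbook AB-BA lock ordering problem. A minimal stand-alone sketch of the two paths above (illustrative only; cli_lock and srv_lock merely stand in for the two sockets' slocks):

#include <linux/spinlock.h>

static DEFINE_SPINLOCK(cli_lock);   /* stands in for cli_sk's slock */
static DEFINE_SPINLOCK(srv_lock);   /* stands in for srv_sk's slock */

static void cpu1_path(void)         /* cli_sk -> srv_sk transfer */
{
    spin_lock(&cli_lock);           /* step 1: lock(cli_sk) */
    spin_lock(&srv_lock);           /* step 3: blocks forever if CPU2
                                     * already holds srv_lock */
    /* ... skb_entail(srv_sk, skb) ... */
    spin_unlock(&srv_lock);
    spin_unlock(&cli_lock);
}

static void cpu2_path(void)         /* srv_sk -> cli_sk transfer */
{
    spin_lock(&srv_lock);           /* step 1: lock(srv_sk) */
    spin_lock(&cli_lock);           /* step 3: deadlock once both CPUs
                                     * hold their first lock */
    /* ... skb_entail(cli_sk, skb) ... */
    spin_unlock(&cli_lock);
    spin_unlock(&srv_lock);
}

Once CPU1 has taken cli_lock and CPU2 has taken srv_lock, neither can make progress.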

It seems the problem is closely related to the Synchronous Sockets approach: SS callbacks are invoked while the socket is locked. Therefore, we can't lock any other socket in the context of an SS callback.

It seems there is no easy solution here. @krizhanovsky, please comment.

@vdmit11 vdmit11 added the crucial label Mar 7, 2015
@vdmit11 vdmit11 assigned krizhanovsky and unassigned keshonok Mar 7, 2015
@krizhanovsky krizhanovsky added this to the 0.3.0 Proxy Filter milestone Mar 15, 2015
keshonok added a commit that referenced this issue Mar 16, 2015
Make SS simpler by removing lots of Linux socket code from SS.
Leave that job to Linux, it does it well. When a socket leaves
the ESTABLISHED state, we release all Tempesta resources linked
to a socket, and do not take part in any work or actions that
are needed to have the socket closed. Linux does the usual job
of closing the TCP connection and related sockets in a correct way.

This DOES NOT solve the problem of a deadlock (issue #61).
@krizhanovsky krizhanovsky modified the milestones: 0.4.0 Web Server, 0.3.0 Proxy Filter May 24, 2015
keshonok added a commit that referenced this issue Jul 23, 2015
This is related to the fix for issue #61. That fix created a window
in ss_tcp_process_data() when the socket is unlocked, so it may be closed
in a parallel thread. If that happens, ss_tcp_process_data() cannot
continue as the socket's data is cleared, and skb_queue_walk_safe()
is not safe anymore. Check for that condition, and bail out if that
happens.
keshonok added a commit that referenced this issue Jul 24, 2015
And put a couple of empty lines. They are nice. :-) (#61)
keshonok added a commit that referenced this issue Jul 24, 2015
And put a couple of empty lines back. They are nice. :-) (#61)