Possible recursive locking #61

Closed
keshonok opened this issue Feb 24, 2015 · 2 comments

@keshonok
Contributor

This message is issued by the kernel when an SKB is moved from the receive queue to the send queue as we do in the Synchronous Sockets paradigm.

[   16.946924] =============================================
[   16.946924] [ INFO: possible recursive locking detected ]
[   16.946924] 3.10.10-tempesta #8 Tainted: G           O
[   16.946924] ---------------------------------------------
[   16.946924] swapper/0/0 is trying to acquire lock:
[   16.946924]  (slock-AF_INET/1){+.-...}, at: [<ffffffffa00002a8>] ss_send+0x28/0x160 [sync_socket]
[   16.946924] 
[   16.946924] but task is already holding lock:
[   16.946924]  (slock-AF_INET/1){+.-...}, at: [<ffffffff813433bc>] tcp_v4_rcv+0x40c/0x8b0
[   16.946924] 
[   16.946924] other info that might help us debug this:
[   16.946924]  Possible unsafe locking scenario:
[   16.946924] 
[   16.946924]        CPU0
[   16.946924]        ----
[   16.946924]   lock(slock-AF_INET/1);
[   16.946924]   lock(slock-AF_INET/1);
[   16.946924] 
[   16.946924]  *** DEADLOCK ***
[   16.946924] 
[   16.946924]  May be due to missing lock nesting notation
[   16.946924] 
[   16.946924] 3 locks held by swapper/0/0:
[   16.946924]  #0:  (rcu_read_lock){.+.+..}, at: [<ffffffff812e6463>] __netif_receive_skb_core+0x63/0x590
[   16.946924]  #1:  (rcu_read_lock){.+.+..}, at: [<ffffffff8131d750>] ip_local_deliver_finish+0x40/0x150
[   16.946924]  #2:  (slock-AF_INET/1){+.-...}, at: [<ffffffff813433bc>] tcp_v4_rcv+0x40c/0x8b0
[   16.946924] 
[   16.946924] stack backtrace:
[   16.946924] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G           O 3.10.10-tempesta #8
[   16.946924]  ffffffff8155d440 ffff88003fc03828 ffffffff813bf35d ffff88003fc03898
[   16.946924]  ffffffff81073b96 ffffffff81072787 0000000000000007 00000000ffffffff
[   16.946924]  ffff880000000000 00000b005849e24f ffffffff00000000 ffffffff816f6640
[   16.946924] Call Trace:
[   16.946924]  <IRQ>  [<ffffffff813bf35d>] dump_stack+0x19/0x1b
[   16.946924]  [<ffffffff81073b96>] __lock_acquire+0xfa6/0x1240
[   16.946924]  [<ffffffff81072787>] ? trace_hardirqs_on_caller+0x1d7/0x210
[   16.946924]  [<ffffffff81074658>] lock_acquire+0x98/0xd0
[   16.946924]  [<ffffffffa00002a8>] ? ss_send+0x28/0x160 [sync_socket]
[   16.946924]  [<ffffffff813c30ca>] _raw_spin_lock_nested+0x3a/0x50
[   16.946924]  [<ffffffffa00002a8>] ? ss_send+0x28/0x160 [sync_socket]
[   16.946924]  [<ffffffffa00002a8>] ss_send+0x28/0x160 [sync_socket]
[   16.946924]  [<ffffffffa0049413>] ? tfw_sched_get_srv+0x49/0x4b [tempesta_fw]
[   16.946924]  [<ffffffffa0011c8d>] tfw_connection_send_srv+0xbf/0xc6 [tempesta_fw]
[   16.946924]  [<ffffffffa00153a2>] tfw_http_req_cache_cb+0x5d/0x5f [tempesta_fw]
[   16.946924]  [<ffffffffa000ddec>] tfw_cache_req_process+0x39/0x163 [tempesta_fw]
[   16.946924]  [<ffffffffa0015345>] ? tfw_http_adjust_resp+0x13/0x13 [tempesta_fw]
[   16.946924]  [<ffffffffa0015660>] tfw_http_req_process+0x2bc/0x32b [tempesta_fw]
[   16.946924]  [<ffffffffa0015a3b>] tfw_http_msg_process+0x43/0x5e [tempesta_fw]
[   16.946924]  [<ffffffffa0012d67>] tfw_gfsm_dispatch+0x41/0x43 [tempesta_fw]
[   16.946924]  [<ffffffffa0011e4e>] tfw_connection_recv+0x88/0x8a [tempesta_fw]
[   16.946924]  [<ffffffffa000005b>] ss_tcp_process_proto_skb+0x5b/0x90 [sync_socket]
[   16.946924]  [<ffffffffa0000167>] ss_tcp_process_skb+0xd7/0x170 [sync_socket]
[   16.946924]  [<ffffffffa0000eae>] ss_tcp_data_ready+0x10e/0x25e [sync_socket]
[   16.946924]  [<ffffffff8133425d>] tcp_data_queue+0x4dd/0xd10
[   16.946924]  [<ffffffff81337660>] tcp_rcv_established+0x560/0x620
[   16.946924]  [<ffffffff8134183c>] tcp_v4_do_rcv+0x1dc/0x4e0
[   16.946924]  [<ffffffff811eea42>] ? do_raw_spin_lock+0x82/0x140
[   16.946924]  [<ffffffff813433e7>] tcp_v4_rcv+0x437/0x8b0
[   16.946924]  [<ffffffff8131d7c5>] ip_local_deliver_finish+0xb5/0x150
[   16.946924]  [<ffffffff8131d750>] ? ip_local_deliver_finish+0x40/0x150
[   16.946924]  [<ffffffff8131df70>] ip_local_deliver+0x50/0x90
[   16.946924]  [<ffffffff8131dcde>] ip_rcv_finish+0x47e/0x580
[   16.946924]  [<ffffffff8131e26c>] ip_rcv+0x2bc/0x340
[   16.946924]  [<ffffffff812e6869>] __netif_receive_skb_core+0x469/0x590
[   16.946924]  [<ffffffff812e6463>] ? __netif_receive_skb_core+0x63/0x590
[   16.946924]  [<ffffffff812e69df>] __netif_receive_skb+0x4f/0x70
[   16.946924]  [<ffffffff812e7543>] netif_receive_skb+0xb3/0x140
[   16.946924]  [<ffffffff812d9be9>] ? __netdev_alloc_skb+0x49/0xb0
[   16.946924]  [<ffffffff81270fd4>] virtnet_poll+0x624/0x770
[   16.946924]  [<ffffffff812e8783>] net_rx_action+0xa3/0x1f0
[   16.946924]  [<ffffffff81036550>] __do_softirq+0xe0/0x1d0
[   16.946924]  [<ffffffff810367c9>] irq_exit+0x69/0xc0
[   16.946924]  [<ffffffff81003c5d>] do_IRQ+0x9d/0xc0
[   16.946924]  [<ffffffff813c3df2>] common_interrupt+0x72/0x72
[   16.946924]  <EOI>  [<ffffffff81023626>] ? native_safe_halt+0x6/0x10
[   16.946924]  [<ffffffff8100a8dd>] default_idle+0x1d/0x30
[   16.946924]  [<ffffffff8100b26a>] arch_cpu_idle+0xa/0x10
[   16.946924]  [<ffffffff810664ac>] cpu_startup_entry+0x15c/0x200
[   16.946924]  [<ffffffff813b79ad>] rest_init+0xcd/0xe0
[   16.946924]  [<ffffffff813b78e0>] ? csum_partial_copy_generic+0x170/0x170
[   16.946924]  [<ffffffff815d1da0>] start_kernel+0x34d/0x358
[   16.946924]  [<ffffffff815d185c>] ? repair_env_string+0x5e/0x5e
[   16.946924]  [<ffffffff815d1120>] ? early_idt_handlers+0x120/0x120
[   16.946924]  [<ffffffff815d155b>] x86_64_start_reservations+0x2a/0x2c
[   16.946924]  [<ffffffff815d164c>] x86_64_start_kernel+0xef/0xfc
@keshonok keshonok added the bug label Feb 24, 2015
@keshonok keshonok self-assigned this Feb 24, 2015
@keshonok
Contributor Author

keshonok commented Mar 1, 2015

My first thought after researching the problem was that this is a case of a lockdep false positive. However, this in fact appears to be a pretty serious problem within the present architecture.

The synchronous sockets code is modeled after the Linux socket code. However, in the Linux stack the sending and receiving paths are decoupled from each other, while in the sync sockets code they are not: there is no decoupling between the receiving and the sending code, and data is passed from receiving to sending in one go. First the receiving socket is locked, and then the lock on the sending socket is taken as well, without releasing the first one. The problem is that data flows both ways at the same time, and the same locks are taken on the reverse path in reverse order. The lockdep warning is correct: we have a classic case of a deadlock.
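
As a rough illustration of that pattern (this is not the actual ss_send() code; ss_forward_skb() is a hypothetical name, and only the bh_lock_sock_nested()/bh_unlock_sock() kernel primitives are real), the nesting looks like this:

/*
 * Sketch of the nested locking described above.  The receive softirq
 * already holds the receiving socket's slock (taken in tcp_v4_rcv())
 * when the SS callback runs; the send side then grabs the peer
 * socket's slock without releasing the first one.  Both locks belong
 * to the same lockdep class, slock-AF_INET, hence the report above.
 */
#include <net/sock.h>
#include <linux/skbuff.h>

static void ss_forward_skb(struct sock *peer_sk, struct sk_buff *skb)
{
    bh_lock_sock_nested(peer_sk);   /* second slock-AF_INET lock */
    /* ... entail skb on peer_sk's send queue and push it out ... */
    bh_unlock_sock(peer_sk);
}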

@vdmit11
Contributor

vdmit11 commented Mar 7, 2015

Recently, @keshonok and I had a discussion about this issue.

We came to an agreement that this is actually a serious issue.

Indeed, we have a deadlock here.
Here is the description:

Setup:

  • A typical multi-CPU environment.
  • A typical reverse-proxy situation: a client is connected to a back-end HTTP server via the Tempesta FW.
  • There are two sockets opened in the running Tempesta FW instance: cli_sk and srv_sk.
  • The network traffic is processed on different CPUs concurrently.
  • HTTP pipelining is enabled.

Because HTTP pipelining is enabled, the server and the client may send HTTP messages to each other simultaneously.
And these messages are handled on different CPUs.
So here we have two simultaneous data transfers on different CPUs:

  • cli_sk -> srv_sk
  • srv_sk -> cli_sk

The problem is that both sockets have to be locked to make a transfer.
And that is done in reverse order on different CPUs, so we have a deadlock.

Here is the scenario:

CPU1:

  1. linux: lock(cli_sk)
  2. linux: call cli_sk->sk_data_ready()
  3. tempesta: lock(srv_sk)
  4. tempesta: skb_entail(srv_sk, skb)
  5. tempesta: unlock(srv_sk)
  6. linux: unlock(cli_sk)

CPU2:

  1. linux: lock(srv_sk)
  2. linux: call srv_sk->sk_data_ready()
  3. tempesta: lock(cli_sk)
  4. tempesta: skb_entail(cli_sk, skb)
  5. tempesta: unlock(cli_sk)
  6. linux: unlock(srv_sk)

Here we have a classic deadlock case.
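
In other words, it is the textbook AB-BA lock ordering problem. A minimal stand-alone sketch of the two paths above (illustrative only; cli_lock and srv_lock merely stand in for the two sockets' slocks):

#include <linux/spinlock.h>

static DEFINE_SPINLOCK(cli_lock);   /* stands in for cli_sk's slock */
static DEFINE_SPINLOCK(srv_lock);   /* stands in for srv_sk's slock */

static void cpu1_path(void)         /* cli_sk -> srv_sk transfer */
{
    spin_lock(&cli_lock);           /* step 1: lock(cli_sk) */
    spin_lock(&srv_lock);           /* step 3: blocks forever if CPU2
                                     * already holds srv_lock */
    /* ... skb_entail(srv_sk, skb) ... */
    spin_unlock(&srv_lock);
    spin_unlock(&cli_lock);
}

static void cpu2_path(void)         /* srv_sk -> cli_sk transfer */
{
    spin_lock(&srv_lock);           /* step 1: lock(srv_sk) */
    spin_lock(&cli_lock);           /* step 3: deadlock once both CPUs
                                     * hold their first lock */
    /* ... skb_entail(cli_sk, skb) ... */
    spin_unlock(&cli_lock);
    spin_unlock(&srv_lock);
}

Once CPU1 has taken cli_lock and CPU2 has taken srv_lock, neither can make progress.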

It seems the problem is closely related to the Synchronous Sockets approach: SS callbacks are invoked while the socket is locked. Therefore, we can't lock any other socket in the context of an SS callback.

It seems there is no easy solution here. @krizhanovsky, please comment.

@vdmit11 vdmit11 added the crucial label Mar 7, 2015
@vdmit11 vdmit11 assigned krizhanovsky and unassigned keshonok Mar 7, 2015
@krizhanovsky krizhanovsky added this to the 0.3.0 Proxy Filter milestone Mar 15, 2015
keshonok added a commit that referenced this issue Mar 16, 2015
Make SS simpler by removing lots of Linux socket code from SS.
Leave that job to Linux, it does it well. When a socket leaves
the ESTABLISHED state, we release all Tempesta resources linked
to a socket, and do not take part in any work or actions that
are needed to have the socket closed. Linux does the usual job
of closing the TCP connection and related sockets in a correct way.

This DOES NOT solve the problem of a deadlock (issue #61).
@krizhanovsky krizhanovsky modified the milestones: 0.4.0 Web Server, 0.3.0 Proxy Filter May 24, 2015
keshonok added a commit that referenced this issue Jul 23, 2015
This is related to the fix for issue #61. That fix created a window
in ss_tcp_process_data() when the socket is unlocked, so it may be closed
in a parallel thread. If that happens, ss_tcp_process_data() cannot
continue as the socket's data is cleared, and skb_queue_walk_safe()
is not safe anymore. Check for that condition, and bail out if that
happens.
keshonok added a commit that referenced this issue Jul 24, 2015
And put a couple of empty lines. They are nice. :-) (#61)
keshonok added a commit that referenced this issue Jul 24, 2015
And put a couple of empty lines back. They are nice. :-) (#61)