Server failovering may cause crashes under load or during getting of perfstat #692
Comments
Please note that this crash occurred in Tempesta with the code base in the state before #670 was merged into master.
|
I used to hit the issue on every other run, but at some point it stopped, and I have not been able to reproduce it for a long time. I have tried many different kernels, TempestaFW revisions and configurations, and the issue never happened again. Nobody else had the same issue, and it happened only in my virtual machines. Maybe there was some side effect from the host machine; I'm not sure.
|
I got a different Oops on the same test scenario except that I ran
at host 1 through SSH session. Also I ran wrk with higher concurrency
and used Apache HTTPD with
|
The issue is basically hard to reproduce. Linux 4.8.15-tfw works smoothly under a 1h workload; however, I got several Oopses when Tempesta is loaded, so I assume that the original issue was caused by #697, while now the problem is in Tempesta's code. I received the following Oopses, which are different from the above and which basically mean dirty socket destruction.
There is also a lockdep issue on socket work queue overrun (UPD: this one is fixed in 649eca9):
|
UPD: Fixed in ed6ae06. I got one more oops when restarting Tempesta under heavy wrk workload. There is a deadlock on
|
I reproduced the issue once with following kernel patch:
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 1eade37..1d51186 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -3191,6 +3191,10 @@ static int tcp_clean_rtx_queue(struct sock *sk, int prior_fackets,
 		if (!fully_acked)
 			break;

+		if (!skb->next) {
+			pr_err("AK_DBG peer=%x:%x\n", sk->sk_daddr, sk->sk_dport);
+			continue;
+		}
 		tcp_unlink_write_queue(skb, sk);
 		sk_wmem_free_skb(sk, skb);
 		if (unlikely(skb == tp->retransmit_skb_hint))
The patch gave me many lines like
It means that the problem occurs in communications between Tempesta and local HTTPD (I use Apache HTTPD). During other runs I saw different crashes in different places, including earlier code in Since all skb's sent from Tempesta to an upstream are copied by Also
|
…et_qlock and the socket lock, the first one is absolutely harmless
One more lockdep warning on Tempesta restart under the wrk load (UPD: fixed in 805cca7):
|
…xt, so disable softirq on fwd_qlock locking
One more interesting trace got by Tempesta restart under heavy wrk load (
|
There was a bug in the FPU context switch, fixed with https://github.com/tempesta-tech/linux-4.8.15-tfw/commit/9cea1ec0156145217ac6320c40189b40b8160780 . The problem is that
However, with the patch I got the following oops under workload
UPD: Created #752 for the crash, so it's out of scope for this issue.
|
…ant comments - that is OK to close a socket several times in unlikely case
The crashes are gone in my tests in a VM after I replace
diff --git a/tempesta_fw/sock.c b/tempesta_fw/sock.c
index fca11c4..4e09d7e 100644
--- a/tempesta_fw/sock.c
+++ b/tempesta_fw/sock.c
@@ -324,7 +324,7 @@ ss_send(struct sock *sk, SsSkbList *skb_list, int flags)
 	ss_skb_queue_head_init(&sw.skb_list);
 	for (skb = ss_skb_peek(skb_list); skb; skb = ss_skb_next(skb)) {
 		/* tcp_transmit_skb() will clone the skb. */
-		twin_skb = pskb_copy_for_clone(skb, GFP_ATOMIC);
+		twin_skb = pskb_copy(skb, GFP_ATOMIC);
 		if (!twin_skb) {
 			SS_WARN("Unable to copy an egress SKB.\n");
 			r = -ENOMEM;
Both functions eventually call
static inline struct sk_buff *pskb_copy(struct sk_buff *skb,
					gfp_t gfp_mask)
{
	return __pskb_copy(skb, skb_headroom(skb), gfp_mask);
}

static inline struct sk_buff *__pskb_copy(struct sk_buff *skb, int headroom,
					  gfp_t gfp_mask)
{
	return __pskb_copy_fclone(skb, headroom, gfp_mask, false);
}

static inline struct sk_buff *pskb_copy_for_clone(struct sk_buff *skb,
						  gfp_t gfp_mask)
{
	return __pskb_copy_fclone(skb, skb_headroom(skb), gfp_mask, true);
}

struct sk_buff *__pskb_copy_fclone(struct sk_buff *skb, int headroom,
				   gfp_t gfp_mask, bool fclone)
{
	unsigned int size = skb_headlen(skb) + headroom;
	int flags = skb_alloc_rx_flag(skb) | (fclone ? SKB_ALLOC_FCLONE : 0);
	struct sk_buff *n = __alloc_skb(size, gfp_mask, flags, NUMA_NO_NODE);
	/* THE REST IS CUT OUT. */
}
This leads to a thought that something is not right in the implementation of Tempesta's
|
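As the kernel excerpt above shows, the two helpers differ only in the final boolean, which decides whether SKB_ALLOC_FCLONE is OR-ed into the allocation flags. Here is a small userspace sketch of just that flag computation. It is not kernel code: every `model_*` / `MODEL_*` name is hypothetical, and the flag values are assumed to mirror the kernel's SKB_ALLOC_* bits purely for illustration.

```c
#include <stdbool.h>

/* Assumed to mirror the kernel's SKB_ALLOC_* bits; illustration only. */
#define MODEL_SKB_ALLOC_FCLONE	0x01
#define MODEL_SKB_ALLOC_RX	0x02

/* Hypothetical stand-in for the flag computation in __pskb_copy_fclone(). */
static int model_copy_flags(bool rx_flag, bool fclone)
{
	int flags = rx_flag ? MODEL_SKB_ALLOC_RX : 0;

	return flags | (fclone ? MODEL_SKB_ALLOC_FCLONE : 0);
}

/* pskb_copy() path: fclone == false. */
static int model_pskb_copy_flags(bool rx_flag)
{
	return model_copy_flags(rx_flag, false);
}

/* pskb_copy_for_clone() path: fclone == true. */
static int model_pskb_copy_for_clone_flags(bool rx_flag)
{
	return model_copy_flags(rx_flag, true);
}
```

If this reading is right, the one-line change in ss_send() does not alter what is copied, only which allocation path `__alloc_skb()` takes for the twin skb, which may be why it hides the crash rather than explains it.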
One more Oops during debugging. It seems that all the crashes occur on receiving an ACK from the local HTTP server (Apache HTTPD in my case): sometimes the socket write queue is broken, sometimes we're freeing the socket with a non-empty write queue.
|
After the last changes in #757, I still get the following crash. I noticed that Apache HTTPDs
|
For completeness, the issue is still present in the combination of the new kernel 4.9.35 at https://github.com/tempesta-tech/linux-4.9.35-tfw/commit/1d9fd7abef7457b64cbb200479c1b2c7b65d8f6a and the latest changes in ak-692 branch of Tempesta at c2b5c53.
|
After a 1-hour test I caught the crash below, which is essentially #693. Master after the merge of #771 was used.
|
I can confirm that after the merge of #771 similar crashes still occur in the neighbourhood of
|
Fix memory corruption in __copy_ip_header(): don't write IP header after reserved, and only allocated, skb room; Use native Linux skb_entail(); Add assertions to debug #692; Don't use virt_to_head_page() if we're unsure that the page is heading.
Fix #691: free() skbs in ss_tx_action() if a connection is dead; Fix memory corruption in __copy_ip_header(): don't write IP header after reserved, and only allocated, skb room; Use native Linux skb_entail(); Add assertions to debug #692; Don't use virt_to_head_page() if we're unsure that the page is heading.
For the last couple of days I have had rather frequent crashes on the latest kernel (https://github.com/tempesta-tech/linux-4.9.35-tfw/commit/5dd6e7d8bc48838763cf1c7fcdced1a4f8e17358) and the latest TempestaFW a66752a.
Steps to reproduce:
Crash log:
|
The latest issue is caused by the recent addition to Line 79 in a66752a
skb->next pointer, and the list is attached to skb_shinfo(skb)->frag_list. In this process the skb->next pointer is not NULLed, hence the crash on the new BUG_ON().
Yes, it's possible to rework the code, not use the convenient When an skb gets into Tempesta, it is removed from the receive queue of a socket, and then orphaned from the socket. The only exception to this in regards to the value of So, while an skb is inside Tempesta, We can just remove the I must add a note that there's an skb dumping function |
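The mechanism described above (a buffer is taken off a list, but its next pointer is not NULLed, so a later "must be detached" check fires) can be sketched in userspace. This is only an illustration with hypothetical `model_*` names; it is not Tempesta or kernel code and says nothing about which fix is right.

```c
#include <stddef.h>
#include <stdbool.h>

/* Hypothetical miniature of an skb: only the list linkage matters here. */
struct model_skb {
	struct model_skb *next;
};

/* Pop the head without clearing its next pointer (the stale-pointer case). */
static struct model_skb *model_pop_stale(struct model_skb **head)
{
	struct model_skb *skb = *head;

	if (skb)
		*head = skb->next;	/* skb->next still points into the list */
	return skb;
}

/* Pop the head and detach it fully. */
static struct model_skb *model_pop_clean(struct model_skb **head)
{
	struct model_skb *skb = model_pop_stale(head);

	if (skb)
		skb->next = NULL;
	return skb;
}

/* The kind of condition a BUG_ON(skb->next) style check would test. */
static bool model_is_detached(const struct model_skb *skb)
{
	return skb->next == NULL;
}
```

A check like `model_is_detached()` is only meaningful if every path that removes a buffer from a list also clears the pointer; otherwise a non-NULL value is harmless leftover state, and the assertion reports a problem that is not there.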
BUG_ON is not needed here. This was found by @keshonok, see #692 (comment) for details.
@keshonok I can confirm that I no longer get the crash described above in #692 (comment) after removing the BUG_ON statement. Thank you for the investigation!
|
The reason for the bug is really the
I spent an incredible amount of time arriving at the instrumentation patch in the attachment. The idea of the patch is to make all skbs in the kernel be allocated in separate pages with the new
692_instr.txt (this is a .gz file renamed to .txt so that GitHub would attach it).
|
2. ss_skb_init_for_xmit() initializes forwarded skbs;
3. Protect tcp_v4_connect() -> tcp_connect() by the socket lock to let it finish its operation before getting a response (e.g. ACK) from the peer (crucial for loopback connections, which send packets just by function calls);
4. Add assertions to guarantee that the socket lock is acquired by the current CPU;
5. Many replacements of BUG() assertions by WARN_ON_ONCE() for better reliability;
6. Define TFW_CLASSIFIER_ACCSZ depending on CONFIG_DEBUG_LOCK_ALLOC, and several other minor fixes and cleanups.
I reviewed UDP/datagram code as well and the patch introduces several fixes applicable to datagrams (e.g. skb_morph()), so I believe #615 can be safely closed.
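The trade-off behind item 5 can be illustrated in userspace: a BUG()-style assertion stops execution, while a WARN_ON_ONCE()-style check reports the first violation at a call site and lets the caller handle the error. The macro below is a hypothetical imitation of those semantics, not the kernel's definition; unlike the kernel macro, it is a statement and does not return the condition.

```c
#include <stdbool.h>
#include <stdio.h>

/* Number of warnings actually printed; kept only to make the effect visible. */
static int model_warn_count;

/* Report a violated condition the first time a given call site sees it. */
static void model_warn_once(bool cond, bool *warned, const char *expr,
			    const char *file, int line)
{
	if (cond && !*warned) {
		*warned = true;
		model_warn_count++;
		fprintf(stderr, "WARNING: %s at %s:%d\n", expr, file, line);
	}
}

/* Hypothetical macro: one static "already warned" flag per call site. */
#define MODEL_WARN_ON_ONCE(cond)					\
	do {								\
		static bool warned_;					\
		model_warn_once((cond), &warned_, #cond,		\
				__FILE__, __LINE__);			\
	} while (0)

/* Example caller: tolerates a violated invariant instead of aborting. */
static int model_process(int lock_held)
{
	MODEL_WARN_ON_ONCE(!lock_held);
	return lock_held ? 0 : -1;
}
```

Calling `model_process(0)` repeatedly prints a single warning and returns an error each time, so a hot path with a broken invariant neither halts nor floods the log.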
Hard to reproduce.
Ways to reproduce:
In this case I have an 80-95% probability of a crash during the wrk job or when reading statistics. Running all the programs on the same host does not cause a crash. If nginx's keep-alive requests limit is set to some big value (e.g. 1M; nginx does not allow switching the limit off), the crash does not happen.
Here are my configs:
Nginx:
Tempesta:
Script to reproduce, run it on host 2.
Crash dump:
Tempesta rev: 7672713