SS: properly implement ss_drain_accept_queue() #145

Closed
krizhanovsky opened this issue Jul 10, 2015 · 1 comment

@krizhanovsky

Rework ss_drain_accept_queue(); see the TODO and FIXME comments in the function.

Somewhat related to #100, in the sense that the Linux kernel socket API must be patched and/or partially reimplemented in Tempesta SS.

@krizhanovsky

This becomes a bug after fixing #116 and #254. To reproduce the issue, shut down the upstream server so that multiple errors are sent to clients (I used wrk -c 1000 -t 20 -d 300s http://172.16.0.5/ to generate client load). This causes many client reconnections, and with enough load we get the following crash on Tempesta restart:

    [10888.819577] [tempesta] Stopping all modules...
    [10888.821087] BUG: unable to handle kernel paging request at 0000010000000198
    [10888.822001] IP: [<ffffffff810c072c>] __lock_acquire+0x9c/0x1620
    [10888.822001] PGD 0 
    [10888.822001] Oops: 0002 [#1] SMP
    [10888.822001] Modules linked in: tfw_sched_rr(O) tfw_sched_http(O) tfw_sched_hash(O) tempesta_fw(O) tempesta_db(O) tempesta_tls(O) nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 xt_CHECKSUM iptable_mangle bridge stp llc ip6table_filter ip6_tables iptable_filter ip_tables ebtable_nat ebtables xfs libcrc32c crc32c_intel ghash_clmulni_intel aesni_intel ppdev aes_x86_64 glue_helper lrw gf128mul ablk_helper cryptd parport_pc parport i2c_piix4 input_leds led_class pcspkr dm_mirror dm_region_hash acpi_cpufreq i2c_core serio_raw dm_log dm_mod uinput btrfs xor raid6_pq ata_generic pata_acpi e1000 ata_piix floppy ipv6 crc_ccitt autofs4 [last unloaded: tempesta_tls]
    [10888.822001] CPU: 2 PID: 15123 Comm: sysctl Tainted: G           O    4.8.15-tfw #16
    [10888.822001] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.1-1.fc24 04/01/2014
    [10888.822001] task: ffff88011824e440 task.stack: ffff8800b39a8000
    [10888.822001] RIP: 0010:[<ffffffff810c072c>]  [<ffffffff810c072c>] __lock_acquire+0x9c/0x1620
    [10888.822001] RSP: 0018:ffff8800b39abb70  EFLAGS: 00010006
    [10888.822001] RAX: 0000010000000000 RBX: ffff88010454f560 RCX: 0000000000000000
    [10888.822001] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
    [10888.822001] RBP: ffff8800b39abc08 R08: 0000000000000001 R09: 0000000000000000
    [10888.822001] R10: ffff88011824e440 R11: 0000000000000001 R12: 0000000000000000
    [10888.822001] R13: 0000000000000000 R14: 0000000000000001 R15: 0000000000000000
    [10888.822001] FS:  00007ff5c3827740(0000) GS:ffff88013fd00000(0000) knlGS:0000000000000000
    [10888.822001] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [10888.822001] CR2: 0000010000000198 CR3: 00000000ab3e1000 CR4: 00000000003406e0
    [10888.822001] Stack:
    [10888.822001]  ffffffff82df7150 0000000000000003 ffff88011824e440 0000000000000003
    [10888.822001]  ffff88011824e440 ffff880118247000 ffff8800b39abbd0 0000000000000003
    [10888.822001]  ffff88011824e440 0000000000000003 ffff88011824e440 0000000000000003
    [10888.822001] Call Trace:
    [10888.822001]  [<ffffffff810c20ae>] lock_acquire+0xbe/0x1f0
    [10888.822001]  [<ffffffff815bae78>] ? inet_csk_listen_stop+0xe8/0x2a0
    [10888.822001]  [<ffffffff81652e08>] _raw_spin_lock+0x38/0x50
    [10888.822001]  [<ffffffff815bae78>] ? inet_csk_listen_stop+0xe8/0x2a0
    [10888.822001]  [<ffffffff815bae78>] inet_csk_listen_stop+0xe8/0x2a0
    [10888.822001]  [<ffffffff815bdf45>] tcp_close+0x45/0x460
    [10888.822001]  [<ffffffffa076fbc9>] ss_release+0x19/0x20 [tempesta_fw]
    [10888.822001]  [<ffffffffa0771c77>] tfw_sock_clnt_stop_all+0x47/0xd0 [tempesta_fw]
    [10888.822001]  [<ffffffffa0759d02>] tfw_cfg_stop+0x32/0x90 [tempesta_fw]
    [10888.822001]  [<ffffffffa076e015>] handle_sysctl_state_io+0x185/0x1b0 [tempesta_fw]
    [10888.822001]  [<ffffffffa076dec2>] ? handle_sysctl_state_io+0x32/0x1b0 [tempesta_fw]
    [10888.822001]  [<ffffffff8127a5e5>] proc_sys_call_handler+0xc5/0xe0
    [10888.822001]  [<ffffffff8127a614>] proc_sys_write+0x14/0x20
    [10888.822001]  [<ffffffff811f9f08>] __vfs_write+0x28/0x120
    [10888.822001]  [<ffffffff810bcbdc>] ? percpu_down_read+0x5c/0x90
    [10888.822001]  [<ffffffff811fd8d1>] ? __sb_start_write+0xd1/0xf0
    [10888.822001]  [<ffffffff811fd8d1>] ? __sb_start_write+0xd1/0xf0
    [10888.822001]  [<ffffffff811fa602>] vfs_write+0xb2/0x1b0
    [10888.822001]  [<ffffffff811fb959>] SyS_write+0x49/0xa0
    [10888.822001]  [<ffffffff810029c8>] do_syscall_64+0x58/0x110
    [10888.822001]  [<ffffffff8165385a>] entry_SYSCALL64_slow_path+0x25/0x25
    [10888.822001] Code: 10 44 89 44 24 1c 89 4c 24 20 e8 c0 da ff ff 48 85 c0 8b 4c 24 20 44 8b 44 24 1c 44 8b 4c 24 10 4c 8b 54 24 08 0f 84 d7 07 00 00 <f0> ff 80 98 01 00 00 8b 35 3f 19 21 02 45 8b aa 18 08 00 00 85 
    [10888.822001] RIP  [<ffffffff810c072c>] __lock_acquire+0x9c/0x1620
    [10888.822001]  RSP <ffff8800b39abb70>
    [10888.822001] CR2: 0000010000000198
    [10888.822001] ---[ end trace fb09fa3d4b91c406 ]---
    [10888.822001] Kernel panic - not syncing: Fatal exception in interrupt
    [10888.822001] Kernel Offset: disabled
    [10888.822001] Rebooting in 10 seconds..

The reason for the crash is that the request socket which inet_csk_listen_stop() tries to lock has already been closed and freed, because we removed the wrong socket in ss_drain_accept_queue().
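
To illustrate the failure mode, here is a minimal userspace sketch, not the Tempesta or kernel code. All names (toy_sock, toy_req, toy_queue and the helper functions) are hypothetical, and the freed flag only models a closed and freed socket rather than actually releasing memory. The point it tries to show: when a child socket goes away, the queue entry that refers to that exact socket must be unlinked, otherwise a later walk over the queue (as inet_csk_listen_stop() does on shutdown) reaches a stale entry.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical model of a child socket; 'freed' stands in for a real free. */
struct toy_sock {
	int freed;
};

/* Hypothetical accept-queue entry pointing at its child socket. */
struct toy_req {
	struct toy_sock *child;
	struct toy_req *next;
};

struct toy_queue {
	struct toy_req *head;
};

/* Push a new entry for sk at the head of the queue. */
static void toy_queue_add(struct toy_queue *q, struct toy_sock *sk)
{
	struct toy_req *r = malloc(sizeof(*r));

	assert(r);
	r->child = sk;
	r->next = q->head;
	q->head = r;
}

/*
 * Unlink and release the entry whose child is exactly sk.
 * Returns 1 if such an entry was found, 0 otherwise.
 * Removing any other entry (e.g. simply the head) would leave
 * the entry for sk in the queue after sk itself is gone.
 */
static int toy_queue_remove(struct toy_queue *q, struct toy_sock *sk)
{
	struct toy_req **pp;

	for (pp = &q->head; *pp; pp = &(*pp)->next) {
		if ((*pp)->child == sk) {
			struct toy_req *victim = *pp;

			*pp = victim->next;
			free(victim);
			return 1;
		}
	}
	return 0;
}

/*
 * Model of the shutdown walk: count queued entries whose child
 * is already marked freed. Any non-zero result corresponds to
 * the situation that crashes in the trace above.
 */
static int toy_count_stale(const struct toy_queue *q)
{
	const struct toy_req *r;
	int stale = 0;

	for (r = q->head; r; r = r->next)
		stale += r->child->freed;
	return stale;
}
```

In this model, calling toy_queue_remove() with the socket being closed keeps toy_count_stale() at zero. The kernel's real accept queue has different structures and locking, so this only sketches the invariant, not the fix.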

krizhanovsky added a commit that referenced this issue Feb 13, 2017
Fix #145: do not use socket accept queue at all