Warnings on dirty socket destruction during regression.test_stress_pipeline.Pipeline test #984
Comments
Probably the same problem as in https://github.com/tempesta-tech/linux-4.9.35-tfw/commit/60883ec7aeec1000af98b69e62ec173ff5c1b988: socket memory is not updated after the skb size is adjusted.
Just hit the bug on Tempesta FW proxying a 5000-byte index.html serviced by Nginx running on the same VM as Tempesta FW:
Looks very close to #926; it seems socket buffers weren't updated correctly in all the places.
Just got plenty of |
* Encrypt hash for server finished (missed functionality).
* Multiple fixes in handling scatter lists.
* Multiple fixes for IV handling in encryption and decryption code.
* Fix TLS record header and tag allocation in skb (linked with #391.11).
* Many cleanups and nicer debug and error reporting.

Kernel:

* Fix TLS skb type handling to call sk_write_xmit() callback.
* Reserve room for TLS header in skb headroom.
* Reset TCP connection if we can not encrypt data on it instead of retransmitting it in plaintext. This leads to a warning similar to #984 - leave as TODO for now.
The assertions fail on client sockets on Tempesta FW's side and seem to be caused by incorrect data writes to the client sockets. I am able to reproduce the issue with just wrk -d 10 -c 2 -t 1 http://192.168.100.4/index.html. A single backend connection is enough:
However, a larger number of backend connections doesn't affect the issue. There are no connection resets from the Apache backend. One more concurrency scenario that reproduces the issue is a single connection made directly to a backend and one more made through Tempesta:
Again, both connections must request the big file; if only one of them requests the big file while the other requests a small file, the warnings don't appear.
1. Accurately fix skb->truesize and TCP write memory in kernel by tcp_skb_unclone().
2. __split_pgfrag_del(): if we just move pointers, then we do not free TCP write memory, so do not change skb->truesize.
3. ss_skb_unroll(): truesize and data_len/len are completely different counters, so do not mix them in ss_skb_adjust_data_len(). By the way, during the tests I saw crazy skb overheads: truesize can be larger than len by tens of kilobytes. The explanation for such overheads is various kinds of fragment stealing (e.g. our __split_pgfrag_del) and cloning.
4. Cleanup: move ss_skb coalescing functions closer to their calls.
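To illustrate why points 2 and 3 matter, here is a minimal user-space sketch of the accounting, not kernel code. All names (toy_skb, toy_sock, toy_leftover and the toy_trim_* helpers) are hypothetical, and the 8192-byte truesize is an assumed figure; only the 5000-byte payload comes from this thread. It assumes a simplified model in which a socket is charged skb->truesize when the buffer is queued and uncharged by whatever truesize holds when the buffer is freed, so any unpaired change to truesize in between leaves the socket "dirty" at destruction.

```c
#include <assert.h>

/* Toy stand-ins for the kernel structures; NOT the real sk_buff/sock. */
struct toy_skb {
    unsigned int len;       /* payload bytes currently visible */
    unsigned int truesize;  /* memory charged for this buffer */
};

struct toy_sock {
    int wmem_queued;        /* bytes charged to the socket's write queue */
};

/* Queueing charges the buffer's truesize to the socket. */
static void toy_queue(struct toy_sock *sk, const struct toy_skb *skb)
{
    sk->wmem_queued += (int)skb->truesize;
}

/* Freeing uncharges whatever truesize says at that moment. */
static void toy_free(struct toy_sock *sk, const struct toy_skb *skb)
{
    sk->wmem_queued -= (int)skb->truesize;
}

/* Drop n payload bytes by moving pointers only: no memory is released,
 * so only len changes and truesize stays as it was. */
static void toy_trim_pointers(struct toy_skb *skb, unsigned int n)
{
    assert(n <= skb->len);
    skb->len -= n;
}

/* Buggy variant: also shrinks truesize without telling the socket. */
static void toy_trim_buggy(struct toy_skb *skb, unsigned int n)
{
    assert(n <= skb->len);
    skb->len -= n;
    skb->truesize -= n;
}

/* Variant for when memory really is released: adjust both sides together. */
static void toy_trim_release(struct toy_sock *sk, struct toy_skb *skb,
                             unsigned int n)
{
    assert(n <= skb->len);
    skb->len -= n;
    skb->truesize -= n;
    sk->wmem_queued -= (int)n;
}

/* Runs queue -> trim -> free and returns the charge left on the socket.
 * mode 0: pointer move, mode 1: buggy trim, anything else: paired trim.
 * A return value of 0 means the socket would be destroyed clean. */
int toy_leftover(int mode)
{
    struct toy_sock sk = { 0 };
    struct toy_skb skb = { .len = 5000, .truesize = 8192 };

    toy_queue(&sk, &skb);
    if (mode == 0)
        toy_trim_pointers(&skb, 100);
    else if (mode == 1)
        toy_trim_buggy(&skb, 100);
    else
        toy_trim_release(&sk, &skb, 100);
    toy_free(&sk, &skb);

    return sk.wmem_queued;
}
```

In this model toy_leftover(0) and toy_leftover(2) return 0, while toy_leftover(1) returns 100: the bytes removed from truesize were never uncharged from the socket. The real kernel accounting has more counters than this, so treat the sketch only as a picture of why len and truesize must be adjusted independently.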
Tempesta is at 7ee5ded, Linux kernel is at the 4.9.35-tfw6 tag (latest release). Simply run the regression.test_stress_pipeline.Pipeline functional tests and multiple oopses will happen on wrk shutdown (client disconnects). At least 100 concurrent connections were required to reproduce the issue.
Test log:
Kernel log (a part of it since it was flooded and overflowed):
It seems that the same oopses were seen in #692.