-
Notifications
You must be signed in to change notification settings - Fork 105
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove chunked encoding in http/1 responses #1418
Conversation
7d56a5c
to
ee0c325
Compare
ee0c325
to
57aa2ed
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We had the discussion in the chat and I wrote my thought in the comments.
Also there was an assumption that the PR might fix #1438 , so please check this and if it actually fixes the problem, then a scenario for the bug is required.
45f7203
to
0407428
Compare
0407428
to
cd7f001
Compare
bb61ad6
to
fbd667a
Compare
For stripping chunked parts at body parsing stage we flag chunks that should be cut, then cut chunked parts from body during HTTP1 to HTTP2 message transformation. For cached responses we always store into cache HTTP2 friendly version. During caching we not modify message, only copy data from body excluding chunks. Modification take place only during HTTP2 framing. Besides body modification we also remove If message doesn't have |
I get errors which are not in master:
dmesg
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
94b540d
to
d87b195
Compare
4001b02
to
6954b69
Compare
28a7203
to
022e523
Compare
For this purposes was extended Transfer-Encoding parser, while we don't have complex parser logic we can use single parser function for both headers. For h2 used separate function
Got rid of unnecessary error print during request body less method check
and Content-Encoding at the same time Such responses look suspicious, because Transfer-Encoding other then chunked used very rare especially with Content-Encoding in same request.
to be used in the same response
Responses to HTTP2 clients must not have connection-specific headers such as Transfer-Encoding, therefore we drop Transfer-Encoding from response by marking it "hop-by-hop" and copy encodings except "chunked" from Transfer-Encoding to new header Content-Encoding. During caching we skip Transfer-Encoding and save Content-Encoding to cache with value from Transfer-Encoding except "chunked", i.e data in cache store in HTTP2 friendly fashion. That implies in response from cache we will have Content-Encoding for both protocols HTTP2 and HTTP1. Also now we add "Content-Length" header instead of chunking body when response dosn't have framing information and conneсtion was terminated by server.
Choosed another strategy to strip chunks from body. Now we don't fill *cut* pool as previously, just flag each chunk and then remove it during caching and http2 framing. Cut pool not used. Chunked message for http1 clients not strip, just forwards as is.
cutting data and parsig Content-Encoding Also has been removed old test file.
Previous value was the same as TFW_STR_FULL_INDEX that leads to confusing tfw_hpack_encode function. As result all headers from trailer part was considered by tfw_hpack_encode as hpack full indexable. Also has been fixed inappropriate usage of TFW_STR_FULL_INDEX in http2 unit tests
The purpose of TFW_STR_CUT is flag body chunks which must be stripped from chunked body. Previously for this purposes was used TFW_STR_NAME which is confusing.
…splitting Added comments and TODO. Clarified existing comments.
What has been done: -tfw_cache_h2_add_hdr: Added index bound check in BUG_ON. -__cache_entry_size: Prettified using of assingment operator. -tfw_http_resp_copy_encodings: Changed return code on error. -tfw_http_resp_terminate: Now we try to cache response here. -__parse_check_encodings: Moved before using in macro. -__body_fixup_append: Rmoved unused argument. -tfw_perfstat_seq_show: Now socket buffers count not only for DEBUG. -__tfw_h2_make_frames has been reworked to function. -tfw_http_resp_copy_encodings now works with splitted TfwStr chunk. e.g ["gz", "ip"] will be copied as ["gzip"]
022e523
to
df3b5ba
Compare
Chunked encoding is stripped from the response while it's
being parsed, if the response is forwarded to h1 client, the
encoding is applied back for correct message framing.
Since HTTP/2 is first citizen, we don't worry about performance
degradation for http1 clients, and this solution allows us to
avoid any further worries about unneeded chunked encoding in other
subsystems like cache.
Obsoletes #1410 Fixes #1379
Instead of stripping skb fragments during parsing, another approach is used:
a new
TfwStr cut
member was added into http parser object to track all the fragments to be removed from the message. It's generic enough to allow removal of any fragments fromboth response and request.
Once parser has finished its job (on both TFW_POSTPONE TFW_PASS return codes), the
cut
storage is stripped from the message. Since http message is processed skb-by-skb, thecut
function will be called for every skb only once. This approach doesn't modifyskb
in parser internals, thus there is no need to pass modified skb parts to lower levels, which complicates code and makes it hardly tangled.Since
cut
may (and will!) grow together with usual strings not supposed to be removed, arealloc
-race is possible: both strings will grow together, which will block zero-copy realloc operations. To avoid that I've created a new TfwPool for parser needs. It's created for every new server connection. Once message is parsed and wholecut
object was stripped, the memory required forcut
object is released. This is not usual, since we normally never free objects from pools manually, but it allows to have only one pool for all messages in the single connections, and it shouldn't be exhausted.Thus during parsing, the body from h1 responses is split into two parts:
resp->body
andparser->cut
. First contains only data, the second - only chunk descriptors. Current realisation uses the same code for request and response parsing, and makes it effectively: usuallyMOVE_fixup
macroses creates a lot of chunks with 1 or two characters, which is not suitable for large bodies. Instead chunks are combined together by their role, i.e. adjastment TfwStr chunks with chunk descriptors are glued together. According to my observations, it allows to have 3-6 times less TfwStr chunks to store single chunk descriptor.Unfortunately the same approach can't be used to remove
chunked
token from theTransfer-Encoding
header. Server is allowed to apply chunked as not final encoding, sochunked
token must be stripped only if it's final encoding. We can't guarantee this during parsing, so thechunked
token is marked with flag, and removed later manually.UPD: See comment below