-
Notifications
You must be signed in to change notification settings - Fork 105
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Don't cache and don't forward hop-by-hop headers #650
Conversation
msg->parser.hdr.flags |= TFW_STR_HBH_HDR; | ||
|
||
/** | ||
* Mark raaw header @hdr as hop-by-hop if its name was listed in Connection |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
raaw -> raw
if (!ht) | ||
return; | ||
|
||
for (i = 0; i < ht->off; ++i) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does RFC allow setting headers, mentioned in Connection, before Connection header? The code is dangerous, just consider the example:
GET / HTTP/1.1\r\n
Connection: keep-alive, Host, User-Agent\r\n
Host: foo.com\r\n
User-Agent: malicious client\r\n
\r\n
So server will receive malformed request.
We should run the loop at least from TFW_HTTP_HDR_RAW. Probably there is some limitation which headers can be mentioned in Connection header... The other way (I think the safest one) is to remove Keep-Alive only and ignore any other mentioned headers in Connection. However, we still use current mechanism for the headers removal such that we'll be able to remove other headers which are known to be mentioned in Connection. I argue against blind headers removal.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- RFC does not have any restrictions where headers listed in Connection header must be placed. So both variants are ok: before or after Connection header. More over i tried to set up custom hop-by-hop headers in ngingx and apache, both allow setting custom headers before and after the header. Here example from RFC 6455 with hop-by-hop header before Connection header:
GET /chat HTTP/1.1
Host: server.example.com
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==
Origin: http://example.com
Sec-WebSocket-Protocol: chat, superchat
Sec-WebSocket-Version: 13
- Table you point to stores names for only raw headers. I agree, it could be confusing, but i used data structure which suit me perfectly. In situation you show user-agent and every other spec header listed in connection header (except keep-alive) will not be removed
Obsoleted now, in new revision this array is statically allocated
TfwStr *hdr, *append; | ||
int r; | ||
TfwHttpHdrTbl *ht = hm->parser.hbh_parser.raw; | ||
if (!ht) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This isn't mentioned in our Coding Style, but FreeBSD KNF requires split variables definitions and code. It's bit hard to read when code immediately follows variable definitions immediately.
} | ||
|
||
/** | ||
* Add header name listed in Connection header to rable of raw headers. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rable -> table
if (hid == ht->size) | ||
if (__tfw_http_msg_grow_hdr_tbl(&hm->parser.hbh_parser.raw, | ||
hm->pool)) | ||
return -ENOMEM; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is true and powerful headers parser. Do we really allow that Connection mentioning more than 16 header? I believe it has sense to block the header. Meantime it will make the code faster and easier and that's good.
@@ -917,8 +1069,24 @@ __parse_connection(TfwHttpMsg *hm, unsigned char *data, size_t len) | |||
* it could be names of any headers, including custom headers. | |||
*/ | |||
__FSM_STATE(I_ConnOther) { | |||
__FSM_I_MATCH_MOVE(token, I_ConnOther); | |||
/* Unwind __FSM_I_MATCH_MOVE(token, I_ConnOther): */ | |||
__fsm_n = __data_remain(p); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is MATCH_MOVE macroses which embody the code.
case 'k': | ||
if (likely(__data_available(p, 11) | ||
&& C4_INT_LCM(p + 1, 'e', 'e', 'p', '-') | ||
&& C4_INT_LCM(p + 5, 'a', 'l', 'i', 'v') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The discussed bug to fix
@@ -877,6 +877,219 @@ TEST(http_parser, cookie) | |||
"\r\n"); | |||
} | |||
|
|||
TEST(http_parser, req_hop_by_hop) | |||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add test case described above for "Connection: Host\r\n' to mangled message.
@@ -128,7 +128,7 @@ module_exit(ttls_exit); | |||
|
|||
MODULE_AUTHOR("Tempesta Technologies"); | |||
MODULE_VERSION("2.3.0"); | |||
MODULE_LICENSE("GPLv2"); | |||
MODULE_LICENSE("GPL v2"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't it taints kernel? The change is fine if not.
Please address the comment #409 (comment) or move the requirement to #634 |
c05dcd7
to
8bf9ac6
Compare
Changes since last review:
The feature have no special requirements for #634 . (Headers listed there are not hop-by-hop) I tried to compare behaviour with nginx, but it support hop-by-hop headers in terms of obsoleted RFC 2616, not 7230 |
} else { | ||
--ce->hdr_num; | ||
continue; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now that n
is not used as an array index, perhaps it's best to postpone its calculation until the else if
case.
if () {
} else if (field - resp->h_tbl->tbl < TFW_HTTP_HDR_RAW) {
} else {
}
else if (n < TFW_HTTP_HDR_RAW) | ||
h = ∅ | ||
else | ||
continue; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please see the comment above regarding the use of n
variable. In this case, n
is not used anywhere else.
* Http headers table. | ||
* | ||
* Singular headers (in terms of RFC 7230 3.2.2) go first to protect header | ||
* repetition attacks. See __hdr_is_singular() and don't forget to | ||
* update the static headers array when add a new singular header here. | ||
* If new header can be hop-by-hop header update of __parse_connection() is | ||
* also needed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The whole comment should be reworked to clearly state which arrays and where must be updated when adding a certain type of an HTTP header field.
* group. | ||
* | ||
* @spec - bit array for special headers. Hop-by-hop special header is | ||
* stored as (0x1 << tfw_http_hdr_t[hid]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps a corresponding proper BUILD_BUG_ON()
is justified to ensure that the type used for the structure member can accommodate the necessary number of headers (has the required number of bits).
@@ -262,6 +288,7 @@ typedef struct { | |||
* @content_length - the value of Content-Length header field; | |||
* @conn - connection which the message was received on; | |||
* @jtstamp - time the message has been received, in jiffies; | |||
* @keep_alive - keep-alive timeout |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The timeout specified in the Keep-Alive
header field? Or something else?
memmove(&hbh->raw[i], &hbh->raw[i + 1], | ||
sizeof(TfwStr) * (TFW_HTTP_HDR_NUM - i -1)); | ||
TFW_STR_INIT(&hbh->raw[TFW_HTTP_HDR_NUM - 1]); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you want to exclude this header from further comparisons, then why not use the same TFW_STR_HBH_HDR
flag on headers in hbh->raw[]
and skip those headers in the comparison loop? That is easier and faster than doing a memmove()
.
@@ -619,6 +739,7 @@ __FSM_STATE(st_curr) { \ | |||
/* The header value is fully parsed, move forward. */ \ | |||
if (saveval) \ | |||
__msg_hdr_chunk_fixup(p, __fsm_n); \ | |||
MARK_SPEC_HBH(id); \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe this should be made an inline function to complement mark_raw_hbh()
.
hdr = &hbh->raw[i]; | ||
break; | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe that there should be the current index into the array - analogous to ht->off
. You don't have to search for the current entry.
|
||
h = &ht->tbl[hid]; | ||
TFW_STR_FOR_EACH_DUP(d, h, end) | ||
d->flags |= TFW_STR_HBH_HDR; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we really need to mark each duplicate as HBH? I don't think so, but maybe there's a case to prove otherwise?
* name only is enough in @hdr. | ||
*/ | ||
void | ||
tfw_http_msg_mark_hbh_hdr(TfwHttpMsg *hm, TfwStr *hdr) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe that this function (under a better name) belongs in http_parser.c
.
If I understand correctly, the reason for placing it here was that __hdr_lookup()
was located here, and it's a local (static) function to this sub-module. In my opinion though it should be the other way around: __hdr_lookup()
should be made public (and given a proper name for that) and used from http_parser.c
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me. Only minor changes requested.
@@ -2002,7 +2080,7 @@ tfw_http_parse_req(void *req_data, unsigned char *data, size_t len) | |||
TfwHttpReq *req = (TfwHttpReq *)req_data; | |||
__FSM_DECLARE_VARS(req); | |||
|
|||
__hbh_parser_init(&parser->hbh_parser, true); | |||
__hbh_parser_init_req(req); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be called once per a message. However here it's called each time the parsing function is called. If a message is transferred one byte at a time, then this will be executed for each byte. That's unnecessary. I would suggest to call it in Req_0
state, and make another state Req_0_CRLF
for skipping empty lines at the start of a message. What do you think?
@@ -3452,7 +3530,7 @@ tfw_http_parse_resp(void *resp_data, unsigned char *data, size_t len) | |||
TfwHttpResp *resp = (TfwHttpResp *)resp_data; | |||
__FSM_DECLARE_VARS(resp); | |||
|
|||
__hbh_parser_init(&parser->hbh_parser, false); | |||
__hbh_parser_init_resp(resp); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please see the comment above.
unsigned int id; | ||
|
||
for (id = 0; id < TFW_HTTP_HDR_RAW; ++id) { | ||
hdr = &ht->tbl[id]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TfwStr *hdr = &ht->tbl[id];
Or perhaps even
TfwStr *hdr = &hm->h_tbl->tbl[id];
@@ -495,29 +516,52 @@ static void | |||
mark_raw_hbh(TfwHttpMsg *hm, TfwStr *hdr) | |||
{ | |||
TfwHttpHbhHdrs *hbh = &hm->parser.hbh_parser; | |||
TfwStr * hbh_name; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Extra space.
TfwStr *hbh_name;
* corresponding hop-by-hop header was found. | ||
*/ | ||
for (i = 0; i < hbh->off; ++i) { | ||
hbh_name = &hbh->raw[i]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TfwStr *hbh_name = &hbh->raw[i];
for (i = 0; i < hbh->off; ++i) { | ||
hbh_name = &hbh->raw[i]; | ||
if ((hbh_name->flags & TFW_STR_HBH_HDR) | ||
&& !(tfw_stricmpspn(&hbh->raw[i], hdr, ':'))) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The body of this if
statement is better visible in this case when the opening bracket is on a separate line.
/* Add spec header of id @hid and @name to list of spec hop-by-hop headers */ | ||
#define TRY_HBH_SPEC_HDR_LAMBDA(name, hid, lambda) \ | ||
TRY_HBH_TOKEN(name, { \ | ||
BUG_ON(hid > sizeof(parser->hbh_parser.spec) * CHAR_BIT);\ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be an one-time operation per whole Tempesta run. Please take a look at BUILD_BUG_ON()
macro which is a compile-time check.
{ | ||
TfwHttpHbhHdrs *hbh_hdrs = &req->parser.hbh_parser; | ||
/* Connection is hop-by-hop header by RFC 7230 6.1 */ | ||
hbh_hdrs->spec |= 0x1 << TFW_HTTP_HDR_CONNECTION; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As this should be performed only once per message, shouldn't this be a direct assignment, not an OR operation?
*/ | ||
hbh_hdrs->spec |= 0x1 << TFW_HTTP_HDR_CONNECTION; | ||
if (!req) | ||
hbh_hdrs->spec |= 0x1 << TFW_HTTP_HDR_SERVER; | ||
hbh_hdrs->spec |= 0x1 << TFW_HTTP_HDR_SERVER; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please see the comment above regarding direct assignment vs OR operation.
__hdr_is_singular() claims to be slow, so no reason to call it multiple times.
replace duplicated code with macro
Applied only to headers parsed by TFW_HTTP_PARSE_RAWHDR_VAL macro
3f101a6
to
386f212
Compare
@@ -1348,6 +1348,7 @@ __parse_transfer_encoding(TfwHttpMsg *hm, unsigned char *data, size_t len) | |||
/* Main (parent) HTTP request processing states. */ | |||
enum { | |||
Req_0, | |||
Req_CRLF, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In my opinion, this may appear too generic. Something like Req_0_CRLF may visibly clarify that this is one of the initial states.
__FSM_STATE(Req_0) { | ||
__hbh_parser_init_req(req); | ||
/* fall through */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suspect that it's not correct to move from one parser state to another by just falling through. Generally speaking, let's suppose there are no more bytes to process. The parser still believes it's in the state where it came from, while it should be in the new state already, and it should return to the new state when more bytes are available.
All moves from one FSM state to another must be done via a macro that moves the parser to the new state. Please check a few other occurrences of this fall through use.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have checked out all the places in code, where fall through used instead of FSM_MOVE
directives. No errors there and fall through can be replaced with __FSM_JMP
. Since desired state is the next one fall through is used in order not to make jump. I have no idea if jump could be optimized out, so fall through is the best option here.
__FSM_STATE
saves current parser state to __fsm_const_state
: https://github.com/tempesta-tech/tempesta/blob/master/tempesta_fw/http_parser.c#L120. In all the places in parser fall through is used only when we do not move our current position in message and we want to work with the same data in next stage or stages.
Any flavour of FSM_MOVE*
is not an option here since parser->to_go
will be set to 0 for situation you are describing ( https://github.com/tempesta-tech/tempesta/blob/master/tempesta_fw/http_parser.c#L137 ). And next stage will take next character, not the same one.
Add unit test with messages with leading CRLF
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me.
@@ -466,6 +470,186 @@ parse_int_hex(unsigned char *data, size_t len, unsigned long *acc) | |||
return CSTR_POSTPONE; | |||
} | |||
|
|||
/** | |||
* Add spec header indexes to list of hop-by-hop headers |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need to issue a special fix now, but comments are written in English, so they should obey English rules: start from capital letter and finish with dot. Exception is same-line comments which are continuation of C operators, i.e.
a += b; /* this is same-line comment */
* adjusting as well as saves cache storage. | ||
* | ||
* Headers unconditionaly treated as hop-by-hop must be listed in | ||
* __hbh_parser_init() function and must be members of Special headers |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems the comment is outdated since there is no function with name __hbh_parser_init()
.
} | ||
|
||
/** | ||
* Mark existing spec headers of http message @hm as hop-by-hop if they was |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"they was" -> "they were"
static int | ||
__hbh_parser_add_data(TfwHttpMsg *hm, char *data, unsigned long len, bool last) | ||
{ | ||
TfwStr *hdr = NULL, *append; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need to initialize hdr
.
parser->hbh_parser.spec |= 0x1 << hid; \ | ||
if (!TFW_STR_EMPTY(&msg->h_tbl->tbl[hid])) \ | ||
msg->h_tbl->tbl[hid].flags |= TFW_STR_HBH_HDR; \ | ||
}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The macro is used in only one place, so the code would be more readable if you place the macro as inline code at the appropriate place.
msg->h_tbl->tbl[hid].flags |= TFW_STR_HBH_HDR; \ | ||
}) | ||
|
||
#define TRY_SPEC_HBH_HDR(name, hid) TRY_SPEC_HBH_HDR_LAMBDA(name, hid, {}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The macro is not used at all.
__FSM_STATE(Req_0) { | ||
__hbh_parser_init_req(req); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why did you create a new parser state for the initialization routine? It seems you can call _hbh_parser_init*() functions in tfw_http_msg_alloc()
where base parser is initialized.
__FSM_STATE(Resp_0) { | ||
__hbh_parser_init_resp(resp); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The same as for request
|
||
/* Connection header lists end-to-end spec headers */ | ||
FOR_REQ(REQ_HBH_START | ||
"Connection: Host, Content-Length, Content-Type, Connection," |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe the request must be blocked since there is special header in Connection
Fix issue #409
TBD:
When response contains
Connection: keep-alive
,Keep-alive
header will be marked as hop-by-hop header and will not be saved in cache (that was the requirement in #409 ). RFC 7230 6.1 also say:Currently we have no realised way to set default
Keep-Alive
header (not found in issues but here is corresponding TODO: https://github.com/tempesta-tech/tempesta/blob/master/tempesta_fw/http.c#L639 ). So question is: deleteKeep-Alive
header in forwarded messages or not? But messages served from cache will have no such header. Currently i remove it from both forwarded messages: response and request, disabling that is one-line patch.