-
Notifications
You must be signed in to change notification settings - Fork 105
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Enforce the correct order of responses. Handle non-idempotent requests. #660
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have couple of questions needed to be clarified.
My IDE showed a bunch of errors in the patch, like as comparing signed to unsigned, shortening 64 bit types to 32 and dropping const qualifier. I marked that lines, but there still can be more of them.
@@ -81,7 +81,8 @@ static struct { | |||
|
|||
TfwHttpReq *req; | |||
TfwHttpResp *resp; | |||
TfwConnection connection; | |||
TfwConnection conn_req; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alignment with spaces instead of tabs
tempesta_fw/sched/tfw_sched_hash.c
Outdated
if (unlikely(tfw_connection_restricted(ch->conn)) | ||
|| unlikely(tfw_server_queue_full(ch->conn)) | ||
|| unlikely(!tfw_connection_live(ch->conn))) | ||
continue; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tfw_sched_hash_get_srv_conn()
can schedule message to connections with non-idempotent requests. Is tfw_connection_hasnip()
check missed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The presence of non-idempotent requests in a forwarding queue does not prohibit sending though the connection. It's just it may be a little while before this particular message is sent out, as Tempesta has to wait for responses to non-idempotent requests in the queue before forwarding more requests.
The goal for a hash scheduler is to always send over the same hash-bound connection. So how do you suggest we skip connections with non-idempotent requests in them? Do you want the algorithm to "wait out" looping in hopes that the resulting connection gets available?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, i got it, you right. Speeding up by skipping connections with non-idempotent requests in them is not a priority for hash scheduler. Didn't think in that way.
tempesta_fw/sched/tfw_sched_rr.c
Outdated
continue; | ||
if (skipnip && tfw_connection_hasnip(conn)) { | ||
if (likely(tfw_connection_live(conn))) | ||
nipconn++; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not just save first connection with non-idempotent requests and return it if better candidate is not found? That won't require to rerun the whole cycle.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The good question and the answer should be commented for the function. RR scheduler must be the fastest one. Usually the optimistic approach gives us the fastest solution: we're optimistic in assumption that there are not too many non-idempotent request or there are enough server connections. So we can save two CMPXCHG instructions on getting and putting the saved connection in quick path.
tempesta_fw/sched/tfw_sched_rr.c
Outdated
int c, s, i; | ||
TfwConnection *conn; | ||
unsigned long idx; | ||
int c, s, skipnip = 1, nipconn = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a bug here due to limitation of possible values, but c
and s
should be type of size_t
, since they are compared to conn_n
and srv_n
.
BUG_ON(!loc); | ||
BUG_ON(!arg); | ||
|
||
for (i = 0; i < loc->nipdef_sz; ++i) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
-Wsing-compare: warning: comparison of integers of different signs: 'int' and 'unsigned int'
tempesta_fw/vhost.c
Outdated
}; | ||
|
||
/* Mappings for HTTP request methods. */ | ||
static const TfwCfgEnum const __read_mostly tfw_method_enum[] = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Duplicated const
qualifiers.
tempesta_fw/vhost.c
Outdated
@@ -734,7 +971,14 @@ static TfwCfgSpec tfw_location_specs[] = { | |||
.allow_repeat = true, | |||
.cleanup = tfw_cleanup_locache | |||
}, | |||
{} | |||
{ | |||
"nonidempotent", NULL, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
New configuration option is added, but readme and configuration example do not mention it
tempesta_fw/vhost.c
Outdated
@@ -734,7 +971,14 @@ static TfwCfgSpec tfw_location_specs[] = { | |||
.allow_repeat = true, | |||
.cleanup = tfw_cleanup_locache | |||
}, | |||
{} | |||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alignment with spaces, not tabs
tempesta_fw/vhost.c
Outdated
|
||
/* The method: one of GET, PUT, POST, etc. in form of a bitmask. */ | ||
in_method = (char *)ce->vals[0]; | ||
ret = tfw_cfg_map_enum(tfw_method_enum, in_method, &method); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By the RFC 7231 4.2:
Of the request methods defined by this specification, the GET, HEAD, OPTIONS, and
TRACE methods are defined to be safe.
...
Of the request methods defined by this specification, PUT, DELETE, and safe request
methods are idempotent.
Maybe we should add new enum with methods that could be non-idempotent?
tempesta_fw/http.c
Outdated
* Forwarding to a server is considered to be on hold after | ||
* a non-idempotent request is forwarded to the server. The hold | ||
* is removed when the holding non-idempotent request is followed | ||
* by another request from the same client, which enables pipelining. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is that suits RFC 7230 6.3.2?
A user agent SHOULD NOT pipeline requests after a non-idempotent method,
until the final response status code for that method has been received, unless
the user agent has a means to detect and recover from partial failure conditions
involving the pipelined sequence.
As i got it, we cannot remove hold untill request was answered
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not really. As it's said in the cited excerpt from the RFC, a user agent should not pipeline after a non-idempotent method. However, if a user agent has a means to detect and recover from from partial failure conditions involving the pipelined sequence, then it may still pipeline requests after a non-idempotent method. So if another message is received from a client before a response comes to a non-idempotent request, that means that the client knows what it's doing and can recover from failure if necessary. Effectively, that removes the requirement of waiting for a response to a non-idempotent request and makes it possible to continue the pipelining.
tempesta_fw/sock_srv.c
Outdated
.allow_repeat = false, | ||
.cleanup = tfw_clean_srv_groups, | ||
}, | ||
{ 0 } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add description of the options to README or better create a server configuration page at the Wiki and put description of the fields there.
tempesta_fw/vhost.c
Outdated
.allow_none = true, | ||
.allow_repeat = true, | ||
.cleanup = tfw_cleanup_locache | ||
}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also please add description of the configuration option to README/Wiki
tempesta_fw/sock_srv.c
Outdated
* FIXME: The limit on the number of reconnect attempts is used | ||
* to re-schedule requests that would never be forwarded otherwise. | ||
* Still, attempts to reconnect may be continued in hopes that the | ||
* connection will be established sooner or later. Otherwise thei |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"thei" -> "the"?
tempesta_fw/sock_srv.c
Outdated
*/ | ||
static const unsigned long timeouts[] = { 1, 10, 100, 250, 500, 1000 }; | ||
|
||
if (unlikely(srv_conn->max_attempts |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a big deal, but '0' treated as 'unlimited' is useful for a user, but it's better for computer to initialize max_attempts
to ~0
such that you don't need extra check for zero.
tempesta_fw/connection.h
Outdated
struct timer_list timer; | ||
TfwMsg *msg; | ||
TfwMsg *msg_sent; /*srv*/ | ||
TfwMsg *msg_resent; /*srv*/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As long as you have to comment the members as srv
, the better place for them in TfwSrvConnection. Otherwise you just break classes hierarchy: TfwConnection is generic class handling members common for server and client connections. Also keep in mind that we can handle millions of client connections, so it's very undesired to extend TfwConnection, while we hare relatively small number of server connections.
tempesta_fw/sched/tfw_sched_rr.c
Outdated
*/ | ||
static TfwConnection * | ||
tfw_sched_rr_get_srv_conn(TfwMsg *msg, TfwSrvGroup *sg) | ||
{ | ||
int c, s, i; | ||
TfwConnection *conn; | ||
unsigned long idx; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please don't rename simple and widespread name to something more complex and long. i
is widespread identifier for index
and is frequently used as a loop iterator.
tempesta_fw/sched/tfw_sched_rr.c
Outdated
continue; | ||
if (skipnip && tfw_connection_hasnip(conn)) { | ||
if (likely(tfw_connection_live(conn))) | ||
nipconn++; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The good question and the answer should be commented for the function. RR scheduler must be the fastest one. Usually the optimistic approach gives us the fastest solution: we're optimistic in assumption that there are not too many non-idempotent request or there are enough server connections. So we can save two CMPXCHG instructions on getting and putting the saved connection in quick path.
tempesta_fw/sched/tfw_sched_rr.c
Outdated
* Initially, connections with non-idempotent requests are also skipped | ||
* in attempt to increase throughput. However, if all live connections | ||
* contain non-idempotent requests, then re-run the algorithm and get | ||
* the first live connection as it is usually done. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is non trivial thing. Please add this to README/Wiki as unique property of RR scheduler.
tempesta_fw/sock.c
Outdated
@@ -433,7 +433,7 @@ ss_droplink(struct sock *sk) | |||
int | |||
__ss_close(struct sock *sk, int flags) | |||
{ | |||
if (unlikely(!sk)) | |||
if (unlikely(!(sk && ss_sock_live(sk)))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why can't we close for example half-open or half-closed sockets? I believe we should be able to close such sockets.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The goal here is to avoid calling close()
on opposite courses. One is a close() that comes from Tempesta, and the other is a close() that comes from the other side of the connection. Both would call connection_drop()
, and that should not be happening. So if the socket that we want to close is not in established state, then it must be that it's being closed already (apparently on a close request from the other side).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TCP socket can normally operate in half-closed state for relatively long time. Meantime, the issue of concurrent closing from both sides should be resolved by proper synchronization and reference counting. I'm working on it in context of #116 and #254 (https://github.com/tempesta-tech/tempesta/tree/ak-shutdown).
tempesta_fw/sock_srv.c
Outdated
TFW_ERR("conn_init() hook returned error\n"); | ||
return r; | ||
} | ||
|
||
/* Let schedulers use the connection hereafter. */ | ||
tfw_connection_revive(conn); | ||
|
||
/* Repair the connection is necessary. */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"is" -> "if"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add the QoS logic as in the comment.
tempesta_fw/http.c
Outdated
* scheduling work flow. When the first request in a session is | ||
* scheduled by the generic logic, TfwSession->srv_conn must be | ||
* initialized to point at the appropriate TfwConnection, so that | ||
* all subsequent session hits are scheduled much faster. | ||
*/ | ||
srv_conn = tfw_sched_get_srv_conn((TfwMsg *)req); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please consider this article about QoS. It has sense to add new configuration option for server group which specifies the queue size. If all the server connections are busy (tfw_server_queue_full()
returns true) we should put a request to the new queue of configured size. If srv_conn->qsize
decrements, then we can schedule next request from head of the new queue. If the new queue is full, then just do as currently - send error response to a client (however, it should be 50x response instead of 40x response.
Also please add sizes of introduced queues (and maybe some existing queues) to /proc/tempesta/perfstat
. The queue sizes are very important performance characteristics.
tempesta_fw/connection.h
Outdated
* @seq_queue - queue of client's messages in the order they came; | ||
* @fwd_qlock - lock for accessing @fwd_queue and @nip_queue; | ||
* @seq_qlock - lock for accessing @seq_queue; | ||
* @ret_qlock - lock for accessing @ret_queue; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems outdated comment: there is no ret_queue
.
tempesta_fw/http.c
Outdated
tfw_http_send_500(req); | ||
TFW_INC_STAT_BH(clnt.msgs_otherr); | ||
return; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd say that code complexity grows significantly with the patch. Maybe we win something from the optimization for the one message case, but lets rather keep the code simpler. There are a lot of lists and locks and I think we should pay more attention to optimization of normal work flow: now it seems very complex and probably we find a way to optimize it. Hopefully it'd be easier with smaller and simpler code.
tempesta_fw/http.c
Outdated
* from the client in the @seq_queue will be handled at | ||
* the time the client connection is released. | ||
*/ | ||
if (!tfw_connection_live(cli_conn)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's better to do this at later state https://github.com/tempesta-tech/tempesta/blob/ak-shutdown/tempesta_fw/sock.c#L282 , leaving more probability that the socket is alive before we enqueue it in ss work queue.
tempesta_fw/sock.c
Outdated
@@ -433,7 +433,7 @@ ss_droplink(struct sock *sk) | |||
int | |||
__ss_close(struct sock *sk, int flags) | |||
{ | |||
if (unlikely(!sk)) | |||
if (unlikely(!(sk && ss_sock_live(sk)))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TCP socket can normally operate in half-closed state for relatively long time. Meantime, the issue of concurrent closing from both sides should be resolved by proper synchronization and reference counting. I'm working on it in context of #116 and #254 (https://github.com/tempesta-tech/tempesta/tree/ak-shutdown).
505f1b6
to
8afe6ee
Compare
Limit the number of reconnect attempts, and use the initial timeout between the attempts. The timeout still grows exponentially as the number of attempts increases.
The "on hold" state is now derived on the fly as it can be dynamic. If another request comes from a client after an non-idempotent request, then the client knows what it is doing, and these requests can be pipelined. In that case remove the flag of non-idempotency from the preceding request.
Non-idempotent requests make up an internal @nip_queue within the server's fwd_queue. @nipcnt keeps the count of those requests in @nip_queue that can be used by schedulers when making decision. Special care is taken for the case where a non-idempotent request is followed by another requests, which re-enables pipelining of those messages.
This commit implements the logic of re-sending requests that were in the server connection's queue when the connection failed. When the connection is restored, it's not scheduled until all requests in the forwarding queue are re-sent or sent to the server.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please DO EXHAUSTIVE STRESS TESTING AT THE TESTBED, in particlular #383 must be tested with different configurations and load generators.
@@ -528,14 +528,14 @@ static inline void | |||
__tfw_apm_state_next(TfwPcntRanges *rng, TfwApmRBEState *st) | |||
{ | |||
int i = st->i, r, b; | |||
unsigned short rtime; | |||
unsigned short rtt; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"rtt" means "round trip time"? :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, indeed it's "round trip time".
tempesta_fw/cache.c
Outdated
@@ -935,24 +935,24 @@ tfw_cache_purge_method(TfwHttpReq *req) | |||
|
|||
/* Deny PURGE requests by default. */ | |||
if (!(cache_cfg.cache && vhost->cache_purge && vhost->cache_purge_acl)) | |||
return tfw_http_send_403(req, "unconfigured purge request"); | |||
return tfw_http_send_403(req); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like an inaccurate merge. We need the reason for requests blocking, see #658
tempesta_fw/cfg.c
Outdated
@@ -88,6 +88,7 @@ | |||
#include <linux/kernel.h> | |||
#include <linux/moduleparam.h> | |||
#include <linux/vmalloc.h> | |||
#include <net/net_namespace.h> /* for sysctl */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like inaccurate merge - now sysctls are in main.c. I also noticed that cfg.c has several comment with sysctl
- they are also good to delete, but I think it's better to do in #51.
tempesta_fw/connection.h
Outdated
#define tfw_conn_hook_call(proto, c, f, ...) \ | ||
conn_hooks[proto]->f ? conn_hooks[proto]->f(c, ## __VA_ARGS__) : 0 | ||
#define TFW_CONN_HOOK_CALL(c, f...) \ | ||
tfw_conn_hook_call(TFW_CONN_TYPE2IDX(TFW_CONN_TYPE(c)), c, f) | ||
|
||
/* | ||
* Tell if a server connection connection is restricted. A restricted |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Double "connection"
tempesta_fw/connection.h
Outdated
tfw_cli_conn_get(TfwCliConn *cli_conn) | ||
{ | ||
tfw_connection_get((TfwConn *)cli_conn); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missed empty line
if (atomic_read(&conn->refcnt) == TFW_CONN_DEATHCNT) | ||
tfw_connection_release(conn); | ||
else | ||
ret = ss_close_sync(conn->sk, true); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, looks good for me.
etc/tempesta_fw.conf
Outdated
@@ -4,25 +4,33 @@ | |||
|
|||
# TAG: sched. | |||
# | |||
# Specifies the scheduler used to distribute load among servers | |||
# Specifies the scheduler used to distribute the load among servers within | |||
# a group. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Space at the end of the line.
tempesta_fw/connection.h
Outdated
* @refcnt - number of users of the connection structure instance; | ||
* @timer - The keep-alive/retry timer for the connection; | ||
* @msg - message that is currently being processed; | ||
* @peer - TfwClient or TfwServer handler; | ||
* @sk - an appropriate sock handler; | ||
* @destructor - called when a connection is destroyed; | ||
* @forward - called when a request is forwarded to server; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no @forward
member in the definition.
tempesta_fw/http.c
Outdated
} | ||
|
||
/* | ||
* Put @req on the list of non-idempotent requests in @srv_conn. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Space at the end of the line
tempesta_fw/server.h
Outdated
*/ | ||
static inline bool | ||
tfw_srv_conn_need_resched(TfwSrvConn *srv_conn) | ||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Many spaces at the end of line
tempesta_fw/http.c
Outdated
"an available back end server"; | ||
static const char *s_reason_req_common = "request dropped: processing error"; | ||
static const char *s_reason_resp_common = "response dropped: processing error"; | ||
static const char *s_reason_resp_filter = "response dropped: filtered out"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We define constants in capital case. Secondly, do we really need the constants? Many of them are used only once and basically there is no strict requirements to print exactly the same error messages. Using the constants without visible need just makes the code harder to read since now we have to jump from tfw_http_send_*()
to definitions of the strings.
tempesta_fw/http.c
Outdated
@@ -523,7 +540,7 @@ tfw_http_req_move2equeue(TfwSrvConn *srv_conn, TfwHttpReq *req, | |||
* request and then sent to the client in proper seq order. | |||
*/ | |||
static void | |||
tfw_http_req_zap_error(struct list_head *equeue) | |||
tfw_http_req_zap_error(struct list_head *equeue, const char *source) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The function is always called with s_source_proxy
as source
. We've already discussed that at this iteration we do not propagate the error strings to upper layers. Typically it's better to show bit vague error message, rather than mangle whole the code with additional function arguments. The alternate approach is to keep source
in some global state variable. I.e. if we're entering cache workflow, then we assign "proxy cache" to it, but it we forward a message, then we use "proxy forward". Thus please back to original version of the code, when tfw_http_send_*()
calls only are aware about error strings.
tempesta_fw/http.c
Outdated
tfw_http_req_init_ss_flags(TfwSrvConn *srv_conn, TfwHttpReq *req) | ||
{ | ||
TfwSrvGroup *sg = ((TfwServer *)(srv_conn->peer))->sg; | ||
if (tfw_cache_msg_cacheable(req) || (req->retries < sg->max_refwd)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Eliminate the "source" part of an error message. Not needed now. - Eliminate static global constant strings. Use strings in place. - Rename tfw_http_req_move2equeue() to a shorter tfw_http_req_error(). - tfw_http_req_init_ss_flags() now set to copy all SKBs. Make a TODO. - Better comment to tfw_http_conn_evict_timeout().
Good to merge |
No description provided.