Large number of failures in ab benchmark tests #383
Comments
In addition to making Tempesta restore connections to back end servers faster, there are back end server configuration options that help to keep the connections active for longer periods of time. While it's not recommended to set these options to very high values for traditional HTTP server operation, in our case it's justified. Nginx: there's no way to specify an unlimited number of requests per keep-alive connection (the keepalive_requests directive). There's a recommendation from one of Apache …
Use a smaller initial retry interval for faster reconnects. (#383)
Actually, we need to try the next server if the scheduled server is dead; see Nginx's proxy_next_upstream_tries. However, we shouldn't retry all the servers: if we have a request which crashes an upstream server, the whole back end server farm would fail one by one.
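A minimal sketch of that bounded retry idea, assuming hypothetical helper names (sched_next_srv(), srv_is_alive(), fwd_to_srv() and send_error_response() are illustrative, not Tempesta's actual API): forward the request to the scheduled server and, on failure, try at most a fixed number of other servers rather than the whole farm.

```c
#include <stdbool.h>

/* Illustrative sketch only; all names below are hypothetical, not Tempesta's API. */
struct request;
struct server;

struct server *sched_next_srv(struct request *req);
bool srv_is_alive(struct server *srv);
int fwd_to_srv(struct server *srv, struct request *req);
int send_error_response(struct request *req, int status);

#define MAX_FWD_TRIES	3	/* analogous to Nginx's proxy_next_upstream_tries */

int
forward_with_retries(struct request *req)
{
	for (int tries = 0; tries < MAX_FWD_TRIES; tries++) {
		struct server *srv = sched_next_srv(req);

		if (!srv)
			break;			/* no more servers to try */
		if (!srv_is_alive(srv))
			continue;		/* skip dead servers; still counts as a try */
		if (!fwd_to_srv(srv, req))
			return 0;		/* forwarded successfully */
		/*
		 * Forwarding failed: try the next scheduled server, but never
		 * walk the whole farm in case the request itself kills servers.
		 */
	}
	return send_error_response(req, 502);	/* all tries exhausted */
}
```

Capping the number of tries keeps one poisonous request from taking down every upstream in turn, which is exactly the failure mode the comment warns about.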
There also must be one more new configuration option: a length limit and a timeout for the message queue. To avoid a bufferbloat problem we have to evict requests that are too old from the head of the queue and send a 504 error response to the client, and likewise send an error response to the client if the queue is full even though none of the requests have timed out yet.
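A minimal sketch of that eviction policy, under the assumption of hypothetical names (the queue_* helpers and the QUEUE_MAX_LEN / QUEUE_TIMEOUT_MS knobs are illustrative, not existing Tempesta options): stale requests are dropped from the head with a 504, and a full queue rejects the newcomer.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative sketch; all names are hypothetical, not real Tempesta options. */
#define QUEUE_MAX_LEN		1000	/* proposed queue length limit option */
#define QUEUE_TIMEOUT_MS	5000	/* proposed queue timeout option */

struct request;

/* Assumed helpers for a FIFO of requests waiting for a back end connection. */
struct request *queue_head(void);
struct request *queue_pop_head(void);
size_t queue_len(void);
void queue_push_tail(struct request *req);
uint64_t req_enqueue_time_ms(struct request *req);
uint64_t now_ms(void);
void send_error_response(struct request *req, int status);

/*
 * Enqueue a request to be forwarded once a back end connection is available.
 * Requests that waited too long are evicted from the head with a 504 so the
 * queue can't turn into a bufferbloat-style backlog; a full queue rejects
 * the new request instead.
 */
void
queue_request(struct request *req)
{
	/* Evict requests that have waited longer than the queue timeout. */
	while (queue_len()) {
		struct request *old = queue_head();

		if (now_ms() - req_enqueue_time_ms(old) < QUEUE_TIMEOUT_MS)
			break;
		send_error_response(queue_pop_head(), 504);
	}

	/* Queue is still full and nothing is stale: reject the new request.
	 * The status code here is a placeholder; the comment above only says
	 * "error response". */
	if (queue_len() >= QUEUE_MAX_LEN) {
		send_error_response(req, 503);
		return;
	}

	queue_push_tail(req);
}
```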
Note that Nginx provides …
When running Tempesta under benchmark tests such as Apache's ab utility, the result is a very large number of failures. All of those failures are non-2xx responses. Tempesta generates error responses on internal errors, but in this case the error in question is a 404 that is generated when a back end server is not available.

The issue closely correlates with how fast Tempesta restores connections to back end servers when those connections are closed. The current timeouts for re-establishing connections with back end servers are too long to work well under high load. A different reconnect timeout algorithm is needed: one that allows multiple reconnect attempts in a short time frame, and only after that gradually increases the delay between attempts.
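A minimal sketch of such a reconnect schedule, with hypothetical names and constants (next_reconnect_delay_ms() and the concrete intervals are assumptions chosen for the example, not the algorithm Tempesta actually adopted): the first few attempts happen almost immediately, after which the delay grows exponentially up to a cap.

```c
/*
 * Illustrative sketch; the function name and the concrete intervals are
 * assumptions for the example, not Tempesta's actual implementation.
 */
#define RECONN_FAST_TRIES	3	/* attempts made with (almost) no delay */
#define RECONN_BASE_MS		100	/* delay before the first slow attempt */
#define RECONN_MAX_MS		10000	/* cap on the delay between attempts */

/* Delay before reconnect attempt number @attempt (0-based) to a back end. */
static unsigned long
next_reconnect_delay_ms(unsigned int attempt)
{
	unsigned int shift;

	/* Retry several times in a short time frame right after the failure. */
	if (attempt < RECONN_FAST_TRIES)
		return 0;

	/* Then back off exponentially: 100ms, 200ms, 400ms, ... up to the cap. */
	shift = attempt - RECONN_FAST_TRIES;
	if (shift >= 7 || (RECONN_BASE_MS << shift) > RECONN_MAX_MS)
		return RECONN_MAX_MS;
	return RECONN_BASE_MS << shift;
}
```

The connection code would feed its per-server attempt counter into this function when scheduling the next connect attempt, and reset the counter once a connection is successfully established.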