Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[possible bug?] worker_max_concurrency, requests queue and greaceful shutdown #5614

Closed
mrVrAlex opened this issue Dec 12, 2024 · 13 comments
Closed

Comments

@mrVrAlex
Copy link

Lets say we have max_concurrency > worker_max_concurrency settings like in example below:

$http = new Swoole\Http\Server("0.0.0.0", 9501, SWOOLE_PROCESS);
$http->set([
    'worker_num' => 1,
    'enable_coroutine' => true,
    'hook_flags' => SWOOLE_HOOK_ALL,
    'enable_preemptive_scheduler' => true,
    'max_concurrency' => 3,
    'worker_max_concurrency' => 1,
    'reload_async' => true,
    'max_wait_time' => 30,
    'dispatch_mode' => 1,
    'discard_timeout_request' => false,
]);

If we make 2 LONG request (response time will be 10sec for each) to server, then 1 request will be handled onRequest() callback immediately, but second will wait in some queue backlog - it is expected and worked fine.
If after these 2 request we will start shutdown process (for example send SIGTERM to master process), then swoole will run

  • onBeforeShutdown callback
  • onWorkerExit 2 times (BTW: why?)
    then first request will be done and send response to client
    BUT second request will get
* Empty reply from server
* Closing connection

Questions:

  1. Why second request (all connections in backlog queue) closing only after all dispatched requests to workers will be finished?
  2. Maybe exists some options handle all requests backlog (send to workers) and then shutdown?
  3. how can close backlog connections early (as soon as we already know about shutdown server)? I tried in onBeforeShutdown get info about connection (getClientInfo) - but there no information about which status request/connection have - so I see both requests here and can not detect which one already dispatched to worker and which not.
@NathanFreeman
Copy link
Member

Because you have enabled reload_async, during a restart, the worker processes will wait for asynchronous events to complete before exiting, while not processing new requests. If the worker process does not exit at this time, onWorkerExit will continue to be triggered.

@mrVrAlex
Copy link
Author

I tried change reload_async => it is not changed behaviour above.
Full example script:

$http->on('request', function ($request, $response) use ($http) {
    $startTime = round(microtime(true),6);
    $reqStartTime = $request->server['request_time_float'];
    $info = $http->getClientInfo($response->fd);
    echo "Worker {$http->worker_id} received request {$startTime} \n";
    echo "Request CLIENT_STARTED_TIME: {$request->get['time']}\n";
    echo "Request SWOOLE_STARTED_TIME: {$reqStartTime}\n";
    echo "Request SWOOLE_LAST_TIME: {$info['last_recv_time']}\n";
    echo "Request SWOOLE_LAST_SEND_TIME: {$info['last_dispatch_time']}\n";

    sleep((int) $request->get['sleep'] ?? 1);

    $response->end("<h1>Hello Swoole. #".rand(1000, 9999)."</h1>");
    $endTime = round(microtime(true),6);
    echo "Worker {$http->worker_id} finished request $endTime\n";
});
$http->on('start', function ($server) {
    echo "Swoole http server is started at http://{$server->host}:{$server->port}\n";
    swoole_set_process_name("swoole_http_server_master");
});

$http->on('workerStart', function ($server, $workerId) {
    swoole_set_process_name("swoole_http_server_worker");
    echo "Worker {$workerId} started\n";
});

$http->on('workerStop', function ($server, $workerId) {
    echo "Worker {$workerId} stopped\n";
});

$http->on('workerError', function ($server, $workerId, $workerPid, $exitCode, $signal) {
    echo "Worker {$workerId} error\n";
});

$http->on('shutdown', function ($server) {
    echo "Swoole http server is shutdown\n";
});

$http->on('beforeShutdown', function ($server) {
    echo "Before shutdown\n";
    foreach ($server->connections as $fd) {
      var_dump($fd);
    }
});

$http->on('managerStart', function ($server) {
    swoole_set_process_name("swoole_http_server_manager");
    echo "Manager started\n";
});

$http->on('managerStop', function ($server) {
    echo "Manager stopped\n";
});

$http->on('workerExit', function ($server, $workerId) {
    echo "Worker {$workerId} exit\n";
});

$http->on('beforeReload', function ($server) {
    echo "Before reload\n";
});

$http->on('afterReload', function ($server) {
    echo "After reload\n";
});

$http->start();

Then run 1.php

Make 2 curl requests (in parallel shell tabs):

curl "http://localhost:9501/?sleep=10&time=$(date +%s.%6N)"

curl "http://localhost:9501/?sleep=2&time=$(date +%s.%6N)"

First request pass to worker and sleep on 10sec (IO wait)
Second request NOT pass to worker because of 'worker_max_concurrency' => 1
But swoole accepted second request as well and wait

In this moment we stop server

kill -15 $(pgrep -f swoole_http_server_master)

And here final result in server console:

php 1.php
Manager started
Swoole http server is started at http://0.0.0.0:9501
Worker 0 started
Worker 0 received request 1734012413.0269 
Request CLIENT_STARTED_TIME: 1734012413.021169
Request SWOOLE_STARTED_TIME: 1734012413.0268
Request SWOOLE_LAST_TIME: 1734012413.0267
Request SWOOLE_LAST_SEND_TIME: 1734012413.0267
Before shutdown
int(1)
int(2)
Worker 0 exit
Worker 0 exit
Worker 0 finished request 1734012428.0354
Worker 0 stopped
Manager stopped
Swoole http server is shutdown

As we see second request not going to worker after Before shutdown event. And we see here both connections.

Second curl request fail.

* Empty reply from server
* Closing connection

@NathanFreeman
Copy link
Member

What is the version of Swoole?

@mrVrAlex
Copy link
Author

mrVrAlex commented Dec 12, 2024

I checked on Swoole v5.1.5 and also v6.0.0-rc1 (build on latest PHP 8.3)

Video - https://imgur.com/a/yA3RH8v

@mrVrAlex
Copy link
Author

mrVrAlex commented Dec 12, 2024

I'm not sure
I see methods swoole_http_server_onBeforeRequest and also in swoole_http_server_onAfterResponse (https://github.com/swoole/swoole-src/blob/master/ext-src/swoole_http_server.cc#L393)
have same checking sw_worker()->is_shutdown()
in first push request context to queue queued_http_contexts.push(ctx); after check worker_max_concurrency)
in second we pop from queue. But maybe on moment onAfterResponse for second request we have is_shutdown() == true??

Anyway I made trace log by situation in shared video.
https://gist.github.com/mrVrAlex/acab5f214b1d0d43e351ab10b0d89385

Looks on fd=10 (first request) and fd=11 (second request) which not handled by swoole properly...

@mrVrAlex mrVrAlex changed the title [question] worker_max_concurrency, backlog and greaceful shutdown [possible bug?] worker_max_concurrency, backlog and greaceful shutdown Dec 12, 2024
@mrVrAlex mrVrAlex changed the title [possible bug?] worker_max_concurrency, backlog and greaceful shutdown [possible bug?] worker_max_concurrency, requests queue and greaceful shutdown Dec 12, 2024
@NathanFreeman
Copy link
Member

When the process restarts, it attempts to complete the current request within a specified time, which is controlled by max_wait_time. After completing the current request, it will not execute any requests that are queued. If those requests also need to be executed, the process may take a long time to restart.

@mrVrAlex
Copy link
Author

mrVrAlex commented Dec 16, 2024

After completing the current request, it will not execute any requests that are queued. If those requests also need to be executed, the process may take a long time to restart.

If queued requests will remain in max_wait_time term why swoole should drop it?
I think it unexpected because we do not know which requests was only queued and which executing now, but we set async_reload and big max_wait_time and expect process ALL request during reload/shutdown process.
Now it even not normally response (do not send http status, do no close connection, etc), just cleaning FD in reactor...

If we will not set limit 'worker_max_concurrency' then worker will just start all requests in coroutines and restart/graceful shutdown will works as expected. But in my case I use worker_max_concurrency == to DB pool size - no sense take much request because of limit on DB pool size, so internal swoole request queue - great opportunity

@mrVrAlex
Copy link
Author

mrVrAlex commented Dec 16, 2024

I wrote test to demonstrate what I'm saying.

If you change const WORKER_MAX_CONCURRENCY to 1 then this test will be failed.

BTW: @matyhtf what you think about it?

--TEST--
swoole_http_server: swoole should same behavior with any number of worker_max_concurrency
--SKIPIF--
<?php require __DIR__ . '/../include/skipif.inc'; ?>
--FILE--
<?php
require __DIR__ . '/../include/bootstrap.php';
const WORKER_MAX_CONCURRENCY = 2; // value with "1" produce test fail!
$pm = new ProcessManager;
$pm->parentFunc = function () use ($pm) {
    go(function () use ($pm) {
        $list[] = go (function () use ($pm) {
          echo httpGetBody("http://127.0.0.1:{$pm->getFreePort()}/test?id=1") . PHP_EOL;
        });
        usleep(100_000); // first request should by start first
        $list[] = go (function () use ($pm) {
            echo httpGetBody("http://127.0.0.1:{$pm->getFreePort()}/test?id=2") . PHP_EOL;
        });
        usleep(100_000); // second request should start after first request
        $pm->kill(); // then shutdown server until all request are done
        \Swoole\Coroutine::join($list);
    });
};

$pm->childFunc = function () use ($pm) {
    $server = new Swoole\Http\Server('0.0.0.0', $pm->getFreePort(), SWOOLE_PROCESS);
    $server->set([
        'log_file' => '/dev/null',
        'worker_num' => 1,
        'enable_coroutine' => true,
        'hook_flags' => SWOOLE_HOOK_ALL,
        'enable_preemptive_scheduler' => true,
        'max_concurrency' => 10,
        'worker_max_concurrency' => WORKER_MAX_CONCURRENCY,
        'reload_async' => true,
        'max_wait_time' => 30,
        'dispatch_mode' => 1,
        'discard_timeout_request' => false,
    ]);
    $server->on('start', function () use ($pm) {
        $pm->wakeup();
    });
    $server->on('request', function (Swoole\Http\Request $request, Swoole\Http\Response $response) use ($pm) {
        usleep(300_000); // we should wait processing until server will not start shutdown process
        $response->end('OK'.$request->get['id'] ?? 0);
    });
    $server->start();
};
$pm->childFirst();
$pm->run();
?>
--EXPECT--
OK1
OK2

@NathanFreeman
Copy link
Member

When the process is restarted, the requests in the queue will be discarded and not executed; otherwise, the process would take a long time to restart successfully.

@mrVrAlex
Copy link
Author

So you think it is normal just drop silently all requests in the queue?
If I will have worker_max_concurrency == max_concurrency - then all requests will send to process and will slow down restart as much as longest request. when worker_max_concurrency < max_concurrency - it's same requests just process by batches with worker_max_concurrency value. We already have max_wait_time option for manage how long restart will take. maybe should give user choose mode: want send all requests from queue to process or just drop it. Anyway swoole now even not close http connection correctly (answer like 502 bad gateway) in this situation in test above...

@NathanFreeman
Copy link
Member

You're right; it would be better to return an error code to the client before rejecting these requests. I'll see how to modify it.

@matyhtf
Copy link
Member

matyhtf commented Jan 9, 2025

@mrVrAlex Your perspective is correct; when a worker process shuts down, the current service can be considered unavailable. Requests queued within the process should receive a 503 error code. Upon receiving a 503, the client can be assured that the request has not been processed, allowing for a safe retry.

@matyhtf
Copy link
Member

matyhtf commented Jan 9, 2025

Fixed 4cb03b4

@matyhtf matyhtf closed this as completed Jan 9, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants