-
Notifications
You must be signed in to change notification settings - Fork 30.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
test: fix test-async-wrap-getasyncid flakyness #14329
test: fix test-async-wrap-getasyncid flakyness #14329
Conversation
The test used to assume that if the client successfully writes data to the server and closes the connection, then the server must have received that data and ran its connection callback wrapped by common.mustCall. However, it is not necessarily the case, and as a result the test was flaky. With this change, the server is closed only once the server's connection callback has been called, which guarantees that the server can accept connections until it actually accepted the single connection that this test requires to accept, making the test not flaky. Fixes: nodejs#13020
As mentioned in #13020:
and:
|
LGTM |
/cc @bnoordhuis, @indutny, @nodejs/streams
Isn't that a reasonable assumption? |
To be more precise: [addition] [2nd addition] |
That "something" is the kernel, not necessarily any user process. The kernel can acknowledge connections that have not been "accepted" yet. It can use receive buffers to buffer any incoming data until that data is read by any user process. These buffers can have their size changed by user processes, but their default size can be larger than the size of the message sent in that test, thus allowing the client to perform the full connect/send/close chain of operations without a user process needing to accept the connection and read from it. Does that clarify things? |
I assumed it the kernel that currently acknowledges the connection, thus fulfilling the obligations of TCP. What I'm asking is — is that enough? Can't we give this guarantee at the application layer as well? Re buffer sizes, we choose a "large" message ( So my question still stands: Isn't it a bug that a client can INIT => SEND 2MB => FIN+ACK without the server (at the application layer, i.e.
|
The client and the server are operating asynchronously from each other. There is no guarantee from the OS that the fd polling mechanism will identify the server socket as readable (or "acceptable") on the next poll after a successful "connect" call from the client. The same can be said for any other interaction between the client and the server. So if the client doesn't wait for data from the server and destroys the socket that the server is listening on when it's done, then there can be no guarantee that the server will be able to perform any operation on that socket before it's closed, including accepting a connection and thus emitting a 'connection' event.
Unless I'm missing something, it seems to me that the test sends 200KB of data, not 2MB. It is quite possible that increasing the amount of data sent would reduce the number of test failures, but there would still be a chance of failures. The test would still rely on luck to succeed and would still be incorrect. |
@misterdjules, funny thing, googling this brought up nodejs/node-v0.x-archive#15443 (comment) You are right it's just 200K, but still I would imagine that should fill the receive buffer... I think I understand what is my wrong assumption, we are not using the So, I have no objections to landings this. |
The buffer is larger than 200K by default on SmartOS, its size is 1MB:
It seems on OSX (at least on the version I'm using), the TCP receive buffer might be 512KiB:
which could explain why #13020 was also reproduced on OSX. Thank you for all the questions and the review, I'll land this PR asap now. |
CI: https://ci.nodejs.org/job/node-test-pull-request/9330/. |
New CI test after CI platform failure: https://ci.nodejs.org/job/node-test-pull-request/9335/. |
Jobs failures seem to be unrelated to the actual change, and the stress test has run ~7500 tests so far on SmartOS without any failure, so I'll land this change shortly. |
Landed in 2da1af0. |
The test used to assume that if the client successfully writes data to the server and closes the connection, then the server must have received that data and ran its connection callback wrapped by common.mustCall. However, it is not necessarily the case, and as a result the test was flaky. With this change, the server is closed only once the server's connection callback has been called, which guarantees that the server can accept connections until it actually accepted the single connection that this test requires to accept, making the test not flaky. PR-URL: #14329 Fixes: #13020 Reviewed-By: Colin Ihrig <[email protected]> Reviewed-By: James M Snell <[email protected]> Reviewed-By: Refael Ackermann <[email protected]> Reviewed-By: Anna Henningsen <[email protected]> Reviewed-By: Wyatt Preul <[email protected]>
The test used to assume that if the client successfully writes data to the server and closes the connection, then the server must have received that data and ran its connection callback wrapped by common.mustCall. However, it is not necessarily the case, and as a result the test was flaky. With this change, the server is closed only once the server's connection callback has been called, which guarantees that the server can accept connections until it actually accepted the single connection that this test requires to accept, making the test not flaky. PR-URL: #14329 Fixes: #13020 Reviewed-By: Colin Ihrig <[email protected]> Reviewed-By: James M Snell <[email protected]> Reviewed-By: Refael Ackermann <[email protected]> Reviewed-By: Anna Henningsen <[email protected]> Reviewed-By: Wyatt Preul <[email protected]>
The test used to assume that if the client successfully writes data to
the server and closes the connection, then the server must have received
that data and ran its connection callback wrapped by common.mustCall.
However, it is not necessarily the case, and as a result the test was
flaky.
With this change, the server is closed only once the server's connection
callback has been called, which guarantees that the server can accept
connections until it actually accepted the single connection that this
test requires to accept, making the test not flaky.
Fixes: #13020
Checklist
make -j4 test
(UNIX), orvcbuild test
(Windows) passes[ ] tests and/or benchmarks are includednot relevant in this case[ ] documentation is changed or addednot relevant in this caseAffected core subsystem(s)
test