shutdown ENOTCONN on TLS.Socket._final #26315
Comments
I've run a git bisect between v10.15.0 and v10.15.1. c84b420 seems to be the cause of the problem. |
@addaleax any thoughts on this? |
@tcolgate I don’t suppose you have an easy reproduction available? Does this also occur for you with v11.x (v11.4.0 and above)? |
I don't have a cut down test at the moment. But I suspect the problem is with idle TLS connections in the knex pool. It's also awkward to test as it shows up after "about 5 minutes" in a lightly loaded app. I'll try and reduce the problem down a bit though. |
We're running into the same issue. I don't have a fully runnable repro that I can share, but it reproduces when the connection is closed while the query is still in flight. It looks roughly like this:
There is just [email protected], no pooling. |
@tcolgate @syrnick Would either of you be able to try a patch like this?
diff --git a/lib/net.js b/lib/net.js
index b649c6779a94..71feba0cfd10 100644
--- a/lib/net.js
+++ b/lib/net.js
@@ -36,7 +36,8 @@ const {
 const assert = require('internal/assert');
 const {
   UV_EADDRINUSE,
-  UV_EINVAL
+  UV_EINVAL,
+  UV_ENOTCONN
 } = internalBinding('uv');
 
 const { Buffer } = require('buffer');
@@ -359,7 +360,7 @@ Socket.prototype._final = function(cb) {
   req.callback = cb;
   var err = this._handle.shutdown(req);
 
-  if (err === 1) // synchronous finish
+  if (err === 1 || err === UV_ENOTCONN) // synchronous finish
     return afterShutdown.call(req, 0);
   else if (err !== 0)
     return this.destroy(errnoException(err, 'shutdown'));
That seems like a cleaner solution to me, but either way, I’m a bit wary of adding or re-adding code that we do not test for and that should not be triggered in the first place… |
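To make the effect of the patch easier to see, here is a minimal standalone sketch of the branch it changes. It is not the code in lib/net.js: `classifyShutdownResult` is a hypothetical helper, and deriving `UV_ENOTCONN` by negating `os.constants.errno.ENOTCONN` is an assumption that only holds on POSIX platforms.

```javascript
const os = require('os');

// Assumption: on POSIX, libuv error codes are negated errno values.
const UV_ENOTCONN = -os.constants.errno.ENOTCONN;

// Hypothetical model of the patched branch in Socket.prototype._final.
// `err` stands in for the return value of this._handle.shutdown(req).
function classifyShutdownResult(err) {
  if (err === 1 || err === UV_ENOTCONN) {
    // Treated as a synchronous finish: the final callback runs without error.
    return 'finished';
  }
  if (err !== 0) {
    // Any other non-zero code still destroys the socket with an error.
    return 'destroy';
  }
  // Zero means the shutdown request is pending and completes asynchronously.
  return 'pending';
}

console.log(classifyShutdownResult(UV_ENOTCONN));
```

Under this model, a socket that is already disconnected finishes quietly instead of raising `shutdown ENOTCONN`, while other shutdown failures are reported as before.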
I'll try some time this week. It seems like it's either
|
We're also seeing this using We cannot update to node v12 at the moment so it would be great if this bug can be looked into. |
We are experiencing this issue as well when using |
We are also facing this issue using knex and node 10.16. Any update on this? Adding log trace:
|
I am also experiencing this issue on ghost 2.22.0 and node 10.16.0 with Amazon RDS:
|
Also seeing this issue (with AWS Lambda 10.x, RDS and [email protected]):
|
This error is killing me. I was forced to return to Node version 10.15.0 just to keep things working. I would really like to move to 12.9.1, but I am now stuck and cannot take advantage of improvements and new features until this is fixed. We tried with 12.9.1 and the bug is still there. |
Just to reiterate, anybody who can either a) provide a reproduction or b) confirm that the patch in #26315 (comment) solves the issue is likely to get this resolved really, really fast. |
I ran into this issue as well. I went ahead and applied the patch mentioned in the comment, and built node from master (+ the patch). As expected, it looks like the patch resolved the error. If I switch back to node 12.x, the issue returns immediately. Let me know if you want me to throw a project together to attempt reproduction. Some notes: |
Now that Richard has verified the fix, can we get this in? There have already been two releases since Richard verified. Can we please get it in the next release? |
We have also been experiencing this issue in our app. Downgrading to 10.15.0 resolved the issue. We confirmed the bug exists in all newer versions. Any update on getting this fixed in latest release? |
While it is not entirely clear why this condition is being triggered, it does resolve a reported bug. Fixes: nodejs#26315
I’ve opened #29912 with the patch above. Fwiw, anybody can open PRs here, you don’t need to wait for somebody else to do it, especially when it has been confirmed to resolve an issue. |
@addaleax Thanks for raising that PR. Quite a few people were having issues with AWS Lambda and |
While it is not entirely clear why this condition is being triggered, it does resolve a reported bug. Fixes: nodejs#26315 PR-URL: nodejs#29912 Reviewed-By: James M Snell <[email protected]> Reviewed-By: Luigi Pinca <[email protected]>
@anthonynovatsis It should get picked up automatically; our rules generally say that something must have lived in a current release for 2 weeks before it is backported to LTS, but I’ve commented on #29875 so that maybe we can skip that part for the patch, and opened a backport PR @ #29968. |
Thanks for that @addaleax. Much appreciated..! |
While it is not entirely clear why this condition is being triggered, it does resolve a reported bug. Fixes: #26315 Backport-PR-URL: #29968 PR-URL: #29912 Reviewed-By: James M Snell <[email protected]> Reviewed-By: Luigi Pinca <[email protected]>
How can we know which versions will have this fix? Is there an overall bug # to build # lookup table? |
@intervalia You can check the releases tab to verify which commits are included in which releases. This particular patch was landed in v10.17.0 and in the latest v13 releases. Thank you @addaleax for pushing this through! |
What's the process for getting it into the v12.x LTS branch? I checked, and https://github.com/nodejs/node/blob/v12.x/lib/net.js is unpatched. |
While it is not entirely clear why this condition is being triggered, it does resolve a reported bug. Fixes: #26315 PR-URL: #29912 Reviewed-By: James M Snell <[email protected]> Reviewed-By: Luigi Pinca <[email protected]>
@syrnick Yeah, thanks for pointing that out … I think this is unfortunate timing because it was merged so close to the Node.js 12 LTS release. I’ve added it to the |
When using v10.15.1 we are seeing processes exiting with:
This appears in services connecting to Amazon RDS instances via TLS. We are using
https://registry.npmjs.org/mysql/-/mysql-2.15.0.tgz
and https://registry.npmjs.org/knex/-/knex-0.14.4.tgz for connection pooling.
We do not see these errors on earlier Node images (I have tested v10.15.0, v10.14.2 and v10.0.0).
Could this be impacted by #24290 and related commits?
The errors could possibly be better handled higher up (by knex or the mysql client), but this does seem to be a significant change in behaviour.
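As a rough illustration of what handling the error higher up could look like, the sketch below attaches an 'error' listener to an event emitter so that an emitted error is recorded instead of being thrown. It assumes the client exposes its connection as an event emitter; `guardConnection` and `fakeConnection` are hypothetical names, and no real knex or mysql API is used here.

```javascript
const { EventEmitter } = require('events');

// Hypothetical helper: route 'error' events to a callback.
// An emitter with no 'error' listener throws when 'error' is emitted,
// which is what ends the process if nothing catches it.
function guardConnection(conn, onError) {
  conn.on('error', (err) => onError(err));
  return conn;
}

// Stand-in for a database connection; a real client object would go here.
const fakeConnection = new EventEmitter();
const seenCodes = [];
guardConnection(fakeConnection, (err) => seenCodes.push(err.code));

// Simulate the failure reported in this issue.
const shutdownError = new Error('shutdown ENOTCONN');
shutdownError.code = 'ENOTCONN';
fakeConnection.emit('error', shutdownError);

console.log(seenCodes);
```

This only keeps the process alive. Whether the pool should then discard and replace the connection is a decision for the client library.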