-
Notifications
You must be signed in to change notification settings - Fork 30.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Closing fs streams early could call close during or before I/O #2006
Comments
I think you're right and it's a rather insidious data corruption bug. I put together a test case that demonstrates the race condition: 'use strict';
const assert = require('assert');
const fs = require('fs');
const buf = Buffer('.');
pummel();
function pummel() {
const a = fs.createWriteStream('a.txt');
// Assumes UV_THREADPOOL_SIZE=4
for (var i = 0; i < 3; i += 1) a._write(buf, 'buffer', function() {});
a.on('open', function() {
const afd = a.fd;
a.close();
for (;;) {
const bfd = fs.openSync('b.txt', 'a');
if (afd === bfd) break;
fs.closeSync(bfd);
}
setTimeout(function() {
fs.closeSync(afd);
assert.equal(fs.statSync('b.txt').size, 0);
pummel();
}, 50);
});
} It's easier to hit the assert when you attach strace to the process, otherwise file operations complete just too darn fast. The issue is that:
I could have sworn As a workaround, you can set |
The queue is now part of the inner workings of writable streams – only one write (or "end") is outstanding at a time in streams2+. The streams problem is not the queue, it's that another actor is killing the resource out from under the stream (and that the other actor has no way of knowing whether the stream is in the middle of a write – otherwise you could |
Hi @bnoordhuis , Why you using _write? for (var i = 0; i < 3; i += 1) a._write(buf, 'buffer', function() {}); I tried the above test after replacing _write with write and after a lag i got below error: events.js:141
throw er; // Unhandled 'error' event
^
Error: EBADF: bad file descriptor, write
at Error (native) |
@saquibkhan |
@bnoordhuis Well, going through the streams machinery introduces a work queue which avoids the race condition. You can write() a million times and then end(), and it'll only close the fd when it's done writing. |
Was my last comment unclear? No snark, honest question. The race condition still exists with |
Oh I see. I was missing the context of your test. It looks like close should be put in the same queue as end and write, yes. That method was left out of the streams API proper because it differs so much between streams. If you want an eventual close, then the method to call is |
Is the solution to basically diff --git a/lib/fs.js b/lib/fs.js
index 58704e5..9631094 100644
--- a/lib/fs.js
+++ b/lib/fs.js
@@ -1869,7 +1869,11 @@ WriteStream.prototype._write = function(data, encoding, cb) {
WriteStream.prototype.destroy = ReadStream.prototype.destroy;
-WriteStream.prototype.close = ReadStream.prototype.close;
+
+WriteStream.prototype.close = function(cb) {
+ this.once('end', ReadStream.prototype.close.bind(this, cb));
+ return this.end();
+};
// There is no shutdown() for files.
WriteStream.prototype.destroySoon = WriteStream.prototype.end; By the way, I think |
@bnoordhuis @isaacs |
I think the solution here might be for @bnoordhuis In your patch, it should be What's essentially happening is that there's one or more writes in progress, and the In most cases, you shouldn't even be calling Can we zoom out a bit on this issue? Why are you calling |
The reason that I used The reason that I want to abort a write stream early is to handle the case that some other part of my program has an error, and I want to stop and clean everything up. As a user, I'm okay if the answer is to just use It does seem like this is a common error. For example, the pump package makes this mistake despite having a knowledgeable author: https://github.com/mafintosh/pump/blob/dc0a3c33ac51a37f2ac3551d1a292620fdc5ad91/index.js#L39 Also, during a conversation on a previous issue about
and indutny responded:
|
Yeah, it's a common enough error that we should handle an in-process _read or _write in close(). The only question is whether it should drop any pending (but not yet started) writes. And, this gets into an area where we probably don't want |
As of now, most streams actually do implement For instance:
The idea to add |
@bnoordhuis @isaacs ... any further thoughts on this one? Suggestions on how to proceed? |
Someone from @nodejs/streams should probably take a look. |
I agree with @isaacs:
In the next meeting of @nodejs/streams we will start discussing just that. @calvinmetcalf @mafintosh we should probably also cover |
Also a question if taking "close" in: is it possible for other stream-interaction events to happen after As of now, |
👍 on adding |
@mcollina Did anything notable happen with this? |
We implemented destroy(err), and it is standardized across all of core. |
So this issue can be closed? Or not quite? |
I would need to look into this with a bit more detail to be certain. |
IMHO we should deprecate
I can take care of the change, but I do not know how write a unit test to reproduce this issue just using the stream API, as #2006 (comment) uses all the internals. @bnoordhuis can you help? This change will likely be semver-major. |
ping @mcollina and @bnoordhuis |
A tentative fix in #15407. |
Refactor close() to use destroy() and not vice versa in ReadStream. Avoid races between WriteStream.close and WriteStream.write, by aliasing close to end(). Fixes: nodejs#2006
Refactor close() to use destroy() and not vice versa in ReadStream. Avoid races between WriteStream.close and WriteStream.write, by aliasing close to end(). Fixes: nodejs/node#2006 PR-URL: nodejs/node#15407 Reviewed-By: James M Snell <[email protected]> Reviewed-By: Refael Ackermann <[email protected]> Reviewed-By: Colin Ihrig <[email protected]>
Refactor close() to use destroy() and not vice versa in ReadStream. Avoid races between WriteStream.close and WriteStream.write, by aliasing close to end(). Fixes: nodejs/node#2006 PR-URL: nodejs/node#15407 Reviewed-By: James M Snell <[email protected]> Reviewed-By: Refael Ackermann <[email protected]> Reviewed-By: Colin Ihrig <[email protected]>
Refactor close() to use destroy() and not vice versa in ReadStream. Avoid races between WriteStream.close and WriteStream.write, by aliasing close to end(). Fixes: nodejs#2006 PR-URL: nodejs#15407 Reviewed-By: James M Snell <[email protected]> Reviewed-By: Refael Ackermann <[email protected]> Reviewed-By: Colin Ihrig <[email protected]>
Suppose you
fs.createWriteStream
, pipe something into it, and then need to close the stream early because of an error somewhere else.Calling
close
on the write stream in this case could causeclose
to be called on the underlying file descriptor while a write operation is still pending. Or, if more than one worker thread is being used, it's possible for theclose
to happen before the write begins.Specifically,
WriteStream.close
does not check whether afs.write
operation is pending before callingfs.close
:https://github.com/nodejs/io.js/blob/41951d45b6df789d7e9cf134f0029b0e791706c4/lib/fs.js#L1770
It seems like this makes it impossible to safely close a write stream early. I've never seen bad behavior from this in practice though, so maybe I'm misunderstanding something.
Are we instead supposed to call
Writable.end
and should never useWriteStream.close
?The text was updated successfully, but these errors were encountered: