Skip to content
This repository has been archived by the owner on Apr 22, 2023. It is now read-only.

file stream resume() can cause duplicate data emitted, leading to crash #3258

Closed
davepacheco opened this issue May 12, 2012 · 2 comments
Closed
Labels

Comments

@davepacheco
Copy link

Here's my test program:

var mod_fs = require('fs');

var file = mod_fs.createReadStream('tmpfile');

file.on('open', function () {
    console.log('file opened');
    process.nextTick(function () {
            console.log('pausing for 500ms (bufsize = %d)',
                file.buffer ? file.buffer.length : 0); 
            file.pause();
            setTimeout(function () {
                    console.log('ending delay (bufsize = %d)',
                        file.buffer ? file.buffer.length : 0); 
                    file.resume();
            }, 500);
    }); 
});

file.on('data', function (chunk) {
    console.log('data (length %s) (buffer = %d)', chunk.length,
        file.buffer ? file.buffer.length : 0); 
    file.resume();
});

Run it like this (be sure to create "tmpfile" first -- which can be a tiny file):

$ echo line1 > tmpfile
$ node resume-test.js 
file opened
pausing for 500ms (bufsize = 0)
ending delay (bufsize = 6)
data (length 6) (buffer = 6)
data (length 6) (buffer = 6)
data (length 6) (buffer = 6)
...
data (length 6) (buffer = 6)

timers.js:96
                if (!process.listeners('uncaughtException').length) throw e;
                                                                      ^    RangeError: Maximum call stack size exceeded

As you can see, the program enters an infinite loop, but one where it keeps calling new functions and eventually runs out of stack space and crashes.

When I first saw this, not knowing a good way to debug it, I modified V8 to dump core when an uncaught exception was thrown. (Importantly, it must dump core when the exception is thrown. Doing it on the uncaughtException event doesn't work because the stack has been unwound by that point. The change to do this is pretty trivial: just add a call to abort(3C) in Isolate::DoThrow, around here: https://github.com/joyent/node/blob/master/deps/v8/src/isolate.cc#L1156. To reproduce this, though, you need use something other than setTimeout or process.nextTick in the above example because those catch and rethrow, so you lose the stack by the time you abort.) Then I used mdb_v8 to see what the stack was, and found that my call to resume() was emitting the buffered data here: https://github.com/joyent/node/blob/master/lib/fs.js#L1288.

The problem is that it emits the buffered data, then clears this.buffer. But if the event itself triggers another resume() (as it does here), then it ends up emitting the data again (which is wrong) and then entering the infinite loop until it runs out of stack space and crashes.

Obviously this program is somewhat contrived, but I this is the simplest case I could come up with. I initially hit this bug in a more complex program (node-bunyan), where I was calling resume() very indirectly from the on('data') handler, which seems to me like a totally reasonable thing to do.

@koichik
Copy link

koichik commented May 12, 2012

@davepacheco - Thanks for investigating.

Can someone review koichik/node@5f9ffa1 (for v0.6)?

isaacs added a commit that referenced this issue May 14, 2012
* windows: skip GetFileAttributes call when opening a file (Bert Belder)

* crypto: add PKCS12/PFX support (Sambasiva Suda)

* #3240: child_process: delete NODE_CHANNEL_FD from env in spawn (Ben Noordhuis)

* windows: add test for path.normalize with UNC paths (Bert Belder)

* windows: make path.normalize convert all slashes to backslashes (Bert Belder)

* fs: Automatically close FSWatcher on error (Bert Belder)

* #3258: fs.ReadStream.pause() emits duplicate data event (koichik)

* pipe_wrap: don't assert() on pipe accept errors (Ben Noordhuis)

* Better exception output for module load and process.nextTick (Felix Geisendörfer)

* zlib: fix error reporting (Ben Noordhuis)

* http: Don't destroy on timeout (isaacs)

* #3231: http: Don't try to emit error on a null'ed req object (isaacs)

* #3236: http: Refactor ClientRequest.onSocket (isaacs)
isaacs added a commit that referenced this issue May 15, 2012
* windows: skip GetFileAttributes call when opening a file (Bert Belder)

* crypto: add PKCS12/PFX support (Sambasiva Suda)

* #3240: child_process: delete NODE_CHANNEL_FD from env in spawn (Ben Noordhuis)

* windows: add test for path.normalize with UNC paths (Bert Belder)

* windows: make path.normalize convert all slashes to backslashes (Bert Belder)

* fs: Automatically close FSWatcher on error (Bert Belder)

* #3258: fs.ReadStream.pause() emits duplicate data event (koichik)

* pipe_wrap: don't assert() on pipe accept errors (Ben Noordhuis)

* Better exception output for module load and process.nextTick (Felix Geisendörfer)

* zlib: fix error reporting (Ben Noordhuis)

* http: Don't destroy on timeout (isaacs)

* #3231: http: Don't try to emit error on a null'ed req object (isaacs)

* #3236: http: Refactor ClientRequest.onSocket (isaacs)
@craigyk
Copy link

craigyk commented May 18, 2012

Did this bug creep in recently? I have been noticing that a resumable file uploader I wrote to handle large uploads has recently been saving corrupted data under heavy load, and I don't remember this happening before. I'll upgrade to .18 and see if it is still happening.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

No branches or pull requests

3 participants