Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

stream: fix sync callback leak #31765

Closed
wants to merge 1 commit into from

Conversation

ronag
Copy link
Member

@ronag ronag commented Feb 13, 2020

This PR currently is mostly a suggestion to discuss related to #31756. It's not properly cleaned up and is mostly to illustrate the problem and possible solution.

Today when working with callbacks and events we need to keep in mind ordering and whether or not a callback is invoked synchronously or asynchronously.

The current pattern for this is something like:

Stream.write = function (chunk) {
  state.sync = true;
  this._write(chunk, this._onWriteComplete);
  state.sync = false;
}

Which mostly works fine. However, as in #31756, if the callback is leaked outside it can still be invoked in the same tick breaking the assumption that when state.sync === false a tick has elapsed.

i.e.

const s = new Stream();
let cb;
s._write = (chunk, callback) => {
  cb = callback();
};
s.write('asd');
cb(); // uhoh! sync but the `Stream` thinks it's async.
s.on('someevent', common.mustCall()); // not called

This PR tries to resolve this by keeping track of which tick we are currently in and making sure that state.sync === false always is a different tick.

Checklist
  • make -j4 test (UNIX), or vcbuild test (Windows) passes
  • tests and/or benchmarks are included
  • documentation is changed or added
  • commit message follows commit guidelines

@ronag ronag added wip Issues and PRs that are still a work in progress. stream Issues and PRs related to the stream subsystem. process Issues and PRs related to the process subsystem. labels Feb 13, 2020
@ronag ronag requested review from mcollina and jasnell February 13, 2020 10:25
Copy link
Member

@mcollina mcollina left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately, this approach would make adding this feature extremely hard in readable-stream. I’m -1.

@ronag
Copy link
Member Author

ronag commented Feb 13, 2020

Unfortunately, this approach would make adding this feature extremely hard in readable-stream. I’m -1.

The only other way I can think of is to fallback to always using nextTick which would have significant performance implications.

Should we go the documentation route then? i.e. explicitly explain that the callback must be called inside _write (sync) or in a future tick (async).

@addaleax
Copy link
Member

@mcollina Your concern is definitely valid, but do you have other suggestions, other than just documenting the current state? I do feel like this is a bug worth fixing…

@mcollina
Copy link
Member

I think the solution in 4d7eddc is perfectly feasible and solves the problem at a cost of an additional nextTick in case a write errors.

I'm open to other fixes, but I'm strongly -1 in adding a feature that cannot be ported to readable-stream easily.

Copy link
Contributor

@mscdex mscdex left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also this solution incurs some significant performance regressions:

                                                                                confidence improvement accuracy (*)   (**)  (***)
streams/writable-manywrites.js callback='no' writev='no' sync='no' n=2000000           ***    -28.95 %       ±1.50% ±1.99% ±2.59%
streams/writable-manywrites.js callback='no' writev='no' sync='yes' n=2000000          ***    -30.11 %       ±3.86% ±5.18% ±6.82%
streams/writable-manywrites.js callback='no' writev='yes' sync='no' n=2000000          ***    -16.49 %       ±3.07% ±4.10% ±5.39%
streams/writable-manywrites.js callback='no' writev='yes' sync='yes' n=2000000         ***    -31.43 %       ±2.28% ±3.04% ±3.99%
streams/writable-manywrites.js callback='yes' writev='no' sync='no' n=2000000          ***    -28.67 %       ±1.74% ±2.31% ±3.01%
streams/writable-manywrites.js callback='yes' writev='no' sync='yes' n=2000000         ***    -32.39 %       ±2.74% ±3.65% ±4.76%
streams/writable-manywrites.js callback='yes' writev='yes' sync='no' n=2000000         ***    -16.59 %       ±3.40% ±4.56% ±6.00%
streams/writable-manywrites.js callback='yes' writev='yes' sync='yes' n=2000000        ***    -30.27 %       ±2.81% ±3.75% ±4.92%

@ronag
Copy link
Member Author

ronag commented Feb 13, 2020

Also this solution incurs some significant performance regressions:

Yes, that's expected. We should inline the sync accessor. When that is done I don't expect any significant performance impact (I hope), i.e. I think this can be fixed.

@ronag
Copy link
Member Author

ronag commented Feb 13, 2020

I think the solution in 4d7eddc is perfectly feasible and solves the problem at a cost of an additional nextTick in case a write errors.

This doesn't actually fix the root problem though. It just fixes one of the edge cases with the 'error' event. The same problem occurs (I think) with other events such as 'drain' and 'final' and possibly other less obvious edge cases.

@@ -70,7 +71,7 @@ function processTicksAndRejections() {
let tock;
do {
while (tock = queue.shift()) {
const asyncId = tock[async_id_symbol];
const asyncId = tickId = tock[async_id_symbol];
Copy link
Member Author

@ronag ronag Feb 13, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might not be exactly correct since it will only run if there is a nextTick scheduled and won't take other possible async entry points into account such as IO and what not. This is not my area of expertise.

Maybe would require some help from @addaleax if this approach is deemed worth looking further into.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I’m happy to investigate this in more detail… It looks like you’re essentially going for an id for each individual synchronous block of JS execution, which should be doable, but I agree that it makes sense to get a green light for this approach first :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

btw, fwiw I've been mulling over the details of a proposal to TC-39 that does precisely that: assign execution and trigger IDs for every JS execution at the language level. It would make several things significantly easier.

I'm not convinced that this PR is the way to go forward tho and definitely need to give it a bit more thought.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm yeah … I’m not sure if this would work fully when using async_hooks because we’d basically always want the lowest async id on the stack, not the highest. But maybe we could expose + use that? That would also solve the readable-stream issue for all supported versions of Node.js (although obviously not browsers…)

ronag added a commit to nxtedition/node that referenced this pull request Feb 15, 2020
Clarifies a userland invariant until a better
solution can be found.

Also moves a misplaced sentence from _write to
write.

Refs: nodejs#31756
Refs: nodejs#31765
@ronag
Copy link
Member Author

ronag commented Feb 15, 2020

Maybe @lpinca has some input/ideas on this?

ObjectDefineProperties(WritableState.prototype, {
sync: {
get() {
return this.tick === process.tickId;
Copy link
Member Author

@ronag ronag Feb 15, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mcollina: one way to get around the readable-stream problem could be:

process.tickId === undefined || this.tick === process.tickId

The downside would be that it would be slower for sync streams (but probably correct?) on old node versions, i.e . it would always think it is "sync" and use nextTick even though it is not strictly necessary.

We could wait with backporting this to readable-stream until v14 becomes LTS?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

readable-stream has to work on old version of Node.js as well, as long as browsers. I would prefer to avoid this completely.

@addaleax addaleax mentioned this pull request Feb 18, 2020
4 tasks
ronag added a commit that referenced this pull request Feb 18, 2020
Clarifies a userland invariant until a better
solution can be found.

Also moves a misplaced sentence from _write to
write.

Refs: #31756
Refs: #31765

PR-URL: #31812
Reviewed-By: Matteo Collina <[email protected]>
Reviewed-By: Luigi Pinca <[email protected]>
Reviewed-By: James M Snell <[email protected]>
@ronag
Copy link
Member Author

ronag commented Feb 25, 2020

I think the consensus was to document the limitation since this cannot be easily ported to readable-stream. Closing.

@ronag ronag closed this Feb 25, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
process Issues and PRs related to the process subsystem. stream Issues and PRs related to the stream subsystem. wip Issues and PRs that are still a work in progress.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants