Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

lib,src: eagerly exit process on unhanded promise rejections #12734

Closed

Conversation

Fishrock123
Copy link
Contributor

@Fishrock123 Fishrock123 commented Apr 28, 2017

This PR is an alternate form of #12010, which takes a different approach to providing a good default for unhandled promise rejections.

This makes unhandled promise rejections exit the process though the usual exit handler if a promise is not handled by the time the 'unhandledRejection' event occurs AND no event handler existed for 'unhandledRejection'.

This is as per the current deprecation message.

This no longer waits, or attempts to wait for GC. The reasons for which may be summed up in more detail in @chrisdickinson's lengthy post in the previous thread.

The change is also partially due to GC handlers having potential issues with emitting the process 'exit' event back to JavaScript for regular exception cleanup. This approach avoids that problem.

In addition, this publicly exposes (docs & naming pending) process.promiseFatal(), allowing custom promise implementations to implement identical behavior when a promise should exit the process.

Edit: as a note I am not opposed to keeping the warnings behavior as an option, e.g. with a flag.

Refs: #12010
Refs: #5292
Refs: nodejs/promises#26
Refs: #6355
PR-URL: #6375

Once again, cc @nodejs/ctc, @chrisdickinson, & @benjamingr.

CI: https://ci.nodejs.org/job/node-test-pull-request/7738/

Checklist
  • make -j4 test (UNIX), or vcbuild test (Windows) passes
  • tests and/or benchmarks are included
  • documentation is changed or added
  • commit message follows commit guidelines
Affected core subsystem(s)

lib, src, promise

@Fishrock123 Fishrock123 added c++ Issues and PRs that require attention from people who are familiar with C++. ctc-agenda lib / src Issues and PRs related to general changes in the lib or src directory. process Issues and PRs related to the process subsystem. promises Issues and PRs related to ECMAScript promises. semver-major PRs that contain breaking changes and should be released in the next major version. labels Apr 28, 2017
@Fishrock123 Fishrock123 requested review from jasnell and addaleax April 28, 2017 22:34
@nodejs-github-bot nodejs-github-bot added c++ Issues and PRs that require attention from people who are familiar with C++. lib / src Issues and PRs related to general changes in the lib or src directory. labels Apr 28, 2017
@addaleax addaleax added the notable-change PRs with changes that should be highlighted in changelogs. label Apr 29, 2017

Local<Promise> promise = args[0].As<Promise>();

CHECK(promise->State() == Promise::PromiseState::kRejected);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tiny nit: CHECK_EQ? :)

void InternalFatalException(v8::Isolate* isolate,
v8::Local<v8::Value> error,
v8::Local<v8::Message> message,
bool from_promise);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn’t used outside of node.cc, right?

gc();
gc();
gc();
/* eslint-enable no-undef */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment as in the other PR(s?): Do you need the linter comments if you use global.gc() instead?

@addaleax
Copy link
Member

Edit: as a note I am not opposed to keeping the warnings behavior as an option, e.g. with a flag.

I think I would like that, yes. It doesn’t seem like it would be hard to edit this PR for that?

@jasnell
Copy link
Member

jasnell commented Apr 29, 2017

I'm liking this approach better but I'll have to dig through the details a bit later. In general, +1 on this tho.

@domenic
Copy link
Contributor

domenic commented Apr 29, 2017

I don't want to get re-involved in this whole promises-in-node thing very much, but I hope I can drop some words here and maybe influence some people who will continue this discussion.

I am very against this, as it misunderstands the idea that promise exception handling can happen asynchronously. There is no good analogy here with sync exception handling. Of course it's reasonable to abort the process if a sync exception isn't handled synchronously; that's the only way it could be handled. But it's not reasonable to abort the process if an async exception isn't handled synchronously.

Collection on GC seemed like a great way to go for me, since that's when you actually know something is not handled. The objections against it are not very strong, IMO. You can make it opt-in if there's a performance concern. The GC thing doesn't work for other promise libraries, but neither does this PR, so I don't understand that argument at all. (It just provides a utility function for them, but that can be done in both cases.)

The last thing I want to draw attention to is that this essentially breaks any attempts at doing parallelism with async/await, as illustrated in #12010 (comment). This seems pretty terrible to me, as that coding pattern is a great one that people will expect to work across all environments and will use increasingly as async/await gains ground.

I think this crash thing might be fine as an opt-in, so people who really want it can use it instead of installing an npm package that gives you the right hook. But not as a default; it just breaks the model of promises too much.

@Jamesernator
Copy link

Jamesernator commented Apr 30, 2017

I agree with @domenic, exiting because a handler wasn't added synchronously doesn't really make sense, there's lots of times you want to handle asynchronously for example:

const request = require('request-promise')

async function concurrentRequest(urls, concurrency=5) {
    // Start of the first lot of requests
    // we don't care about errors yet
    const urlQueue = [...urls]
    const promises = urlQueue.splice(0, concurrency).map(request)

    const results = []
    for (const url of urls) {
        try {
            // wait for a promise, now is when we actually care if it failed
            // or not
            const data = await promises.shift()
            results.push({
                url,
                data,
                isError: false
            })
        } catch (error) {
            results.push({
                url,
                error,
                isError: true
            })
        }
        // Put another promise on the queue
        if (urlQueue.length) {
            promises.push(request(urlQueue.shift()))
        }
    }
    return results
}

However with this change I'd need to add empty .catch handlers everywhere I create a Promise for absolutely no good reason:

const request = require('request-promise')

const pointlessNoopOnlyForNode = _ => {}

async function concurrentRequest(urls, concurrency=5) {
    // Start of the first lot of requests
    // we don't care about errors yet
    const urlQueue = [...urls]
    const promises = urlQueue.splice(0, concurrency).map(request)
        .map(promise => promise.catch(pointlessNoopOnlyForNode))

    const results = []
    for (const url of urls) {
        try {
            // wait for a promise, now is when we actually care if it failed
            // or not
            const data = await promises.shift()
            results.push({
                url,
                data,
                isError: false
            })
        } catch (error) {
            results.push({
                url,
                error,
                isError: true
            })
        }
        // Put another promise on the queue
        if (urlQueue.length) {
            promises.push(
                request(urlQueue.shift())
                .catch(pointlessNoopOnlyForNode)
            )
        }
    }
    return results
}

And personally I think it's even worse than that, by adding at a noop handler prematurely you might actually mask bugs that could've been prevented by the GC approach, because if someone adds a noop handler and then forgets for real to actually handle it then it'll just be swallowed anyway whereas if we rely on garbage collection to check for it then if they don't actually handle it for real then it will still be an error.


Another point I'd like to make is that checking synchronously after Promise rejection happens makes no sense because it means someone could still add a handler asynchronously as long as they do it before the rejection happens.

For example this will fail only ~50% of the time whereas based on this solution I'd expect it to fail 100% of the time, essentially all this behavior does is add bizarre race conditions that aren't intuitive, predictable or useful.

function delay(time) {
    return new Promise(resolve => setTimeout(resolve, time))
}


async function example() {
    const p = new Promise((_, reject) => {
        setTimeout(reject, Math.random()*2000)
    })
    await delay(1000)
    // This will definitely cause people bugs given that sometimes asynchronous handling will work just
    // perfectly fine, while this isn't a real example this sort've race condition will definitely happen in
    // real code
    p.catch(_ => {})
    console.log("Reached")
}

example()

@ljharb
Copy link
Member

ljharb commented Apr 30, 2017

This will break the ecosystem. Unhandled rejections are normal and should not exit the process.

@benjamingr
Copy link
Member

@ljharb note that this is only on GC of an unhandled rejection.

@addaleax
Copy link
Member

@benjamingr That is the case for #12010, this PR is the nextTick one ;)

@benjamingr
Copy link
Member

@addaleax Oh, then I agree with @domenic, exiting by default on unhandled rejections is not the way to go, although admittedly not doing so would be worse for post-mortem reasons.

I think that we should give the ecosystem some time to stabilize now that async/await is out, focus on making our own APIs work with async/await (like util.promisify) and see how common/uncommon unhandled-in-a-microtick rejections can be.

If people complain a lot about concurrency, we might even have to relax detection - but I assume we won't.

The only positive thing about this PR is that users can easily opt out of it with a handler added. It would probably work for my code base (I don't buy the particular examples above since no one uses .catch inside an async function like that) - I think it won't for some others though.

@Jamesernator
Copy link

@benjamingr Can you explain what you mean when you say it would be good for post-mortem reasons? Because in the case of GC you definitely know the value wasn't used.

Regardless of if you use async functions or not you can essentially imagine Promises having an implied stack trace (there's even ideas on how to formalize here), however what's important to realize is that the way values propagate in very different ways, in a synchronous exception a error propagates up the stack tree until either it's handled or it reaches the top of the stack (in the case of Node this causes a process exit because it clearly wasn't handled).

However because with Promises you can listen to a value anytime you want, and this is often useful, while my final example in the above post isn't realistic in async functions sure however the first one is (I just wanted to give the absolute minimal example to demonstrate that you'll get races conditions that you're actually racing the runtime not even your own logic) .

The first example I provided is mostly real, it's a simplified version of a piece of code from a scraper I wrote which is perfectly sound and reasonable, yet this pull request would make that not so, sure I can just add .catch(pointlessNoopToSatisfyNode) to prevent that behavior but it doesn't really make sense to me as to why I'd need to add that in lots of places to get concurrency. Now if one of those requests were accidentally skipped somehow then it would actually be helpful to know they had simply vanished (which did happen in initial versions, the logic wasn't quite a zip so the last few requests were accidentally skipped).

The thing is Promises (and async functions) aren't just about turning callbacks into synchronous looking code despite people often saying "Now you can write async code like sync code" (I'm not saying anyone here is saying that but there's certainly a mindset I've seen arising of it), if that were true then it'd be safe to actually block for real on a Promise, within some async functions that might actually be fine, however for the most part you probably want to use them in an environment where lots of tasks are running concurrently, in such cases it's important that consumers of Promises can decide to consume them when they're ready.


The worst bit for me about this pull request is that I actually thought the Deprecation warning was always referring to garbage collection given that the error message states promise rejections that are not handled will terminate the Node.js process which I thought meant that I simply had to add a handler at some point, I only learnt today that there's even another warning if you add the handler after it's already been rejected because I've never actually hit that (for example in the case of the web scraper the extra work it does is a lot faster than the time taken for a request to finish so it successfully so I managed to successfully race the warning Promise rejection was handled asynchronously).

@benjamingr
Copy link
Member

@Jamesernator

@benjamingr Can you explain what you mean when you say it would be good for post-mortem reasons? Because in the case of GC you definitely know the value wasn't used.

Good question, and sure.

When you exit after a microtick (rather than wait for GC) then I/O has no chance to come in and "corrupt" the heap. When you wait for GC, an arbitrary amount of modifications can happen in the meantime - so the core dump you might get at the end is a lot less useful.

That said, I am not in favor of breaking the abstraction in order to do this.

@benjamingr
Copy link
Member

Also, I've been meaning to suggest changing the deprecation error myself, the naming of it is very poor, "unhandled rejections" aren't "deprecated". This is my mistake twice, once for not arguing for a better message when the error was introduced in 6.6 and one for not fixing it yet even though @domenic has pointed out it's a problematic message (which I agree with).

Copy link
Member

@mcollina mcollina left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really LGTM.

@Fishrock123
Copy link
Contributor Author

@benjamingr So are you saying your position has changed since #12010 (comment)? (It's fine if it has)... or else I am misunderstanding something.

@chrisdickinson
Copy link
Contributor

This is the approach I'd like to see Node take with regards to promises.

Every time I've had an unhandled rejection log in one of our services, it represented a programmer error. The service ends up in a bad state — my desire would be for Node to crash (within a tick) with a log of the reason so that upstart can bring the process back up. I do not need the stack trace intact in the core (or really, the core itself!)

It is always possible to write code using promises such that it doesn't trip the unhandled rejection handler, even in async functions of high concurrency. Either the author can explicitly mark the promise as handled, by creating a no-op derived promise via .catch(() => {}), or the author can await Promise.all(<all concurrent operations>). Aesthetics aside, this approach is feasible and does not reduce the expressiveness of Promises or async functions. This approach doesn't misunderstand that Promises can be asynchronously handled, rather, it accepts that and asks that authors take the necessary steps to indicate which promises they expect to handle at a later time.

@ljharb
Copy link
Member

ljharb commented May 1, 2017

However, that's not a step that is required in the JavaScript language, nor in browsers - so this is asking everybody who wants to write interoperable code - even non-node users - to complete extra steps. This will also mean that code written with the browser in mind that already works in node, might suddenly not work in node without modification. Please don't undersell the interoperability and maintenance burden that this change will impose, even if node ends up deciding to do it.

@ChALkeR
Copy link
Member

ChALkeR commented May 7, 2017

@jasnell To further explain my point that I mentioned earlier:

Let's label noticing an unhandled rejection on next tick as TICK and noticing a garbage collection of an unhandled rejection as GC.

I think that, in the current situation:

  1. The current UnhandledPromiseRejectionWarning warning (and the corresponding event) should be moved from TICK to GC.
  2. There should be a flag that makes the abovementioned GC warning a crash (process exit) — either on or off by default (no specific opinion on that).
  3. It might be reasonable to introduce a separately named event for TICK, for those people who absolutely wish to crash/debug there. This is likely to get broken by thirdparty modules in this setup, though.
  4. Under a separate flag, preserve the required info for core dumps on TICK and use that on GC to produce meaningful core dumps and debug info.
  5. It might make sense to trigger a gc() on TICK — that would make the process exit faster in some situations. It is not going to be reliable, though, and might require significant logic, e.g. ratelimiting. It could improve the average situation a bit, though — but that requires testing.

I am aware of the concerns about delaying to GC and that Chromium displays a warning immediately (like TICK), but the concerns here are that:

  1. Per the spec, that is the allowed behaviour. There were mentioned usecases that are affected by crashing/warnings on TICK. I am aware that there are workarounds for those, but neverthenless — we are going to break the spec if we crash on TICK, and that is very very bad.
  2. In Chromium, only developers are going to see those warnings, and the page is going to still be usable, and that is true no matter how exactly those warnings would look like as long as they are prited only to the developers console. This significantly differs to users seing those warnings in the console (which is going to happen with Node.js).
  3. In Chromium, those warnings don't stop any execution — nothing would change in the program logic or observed behaviour if those warnings weren't printed to the console. We are considering crashing the whole process, which is a very different thing.
  4. That affects portable code which is not expecting Node.js Promise implementation to differ from the spec.
  5. Not only the top-level programs are going to be affected — undesired behaviour (that is correct by spec but will trigger a crash) could be in some third-party library, e.g. a dependency from npm or even some library that was originally written for browsers and is operating, for example, with jsdom.
    Btw, this is the reason why even a separate event on TICK is controversial — the end developer could have no direct control of all the modules that could be triggering it in their valid code, so there could always be a chance that the situation is bening.

That said, I see an alternative path forward, given the current actions on the Chromium side.

An alternative way would be to somehow change the spec to directly forbid that behaviour.
Then:

  1. Cooperate with v8/Chromium devs to push a spec change.
  2. Leave the current default behaviour as is until the spec is changed, possibly adding an opt-in flag for crashing on unhandled rejections on TICK.
  3. Once the spec change lands, reverse the flag to an opt-out in the next major version, making the process creash on TICK by default.

This would perhaps be the better choice for the Node.js core than the first one (less logic, less changes, less flags, better debugging), but it requires a spec change.

/cc @ljharb, @domenic , perhaps?


Shortly put: I think that violating the spec is worse than all the other concerns against delaying to GC.
I would be in favor of crashing on TICK if there is going to be a path forward to a spec change.

I have no idea atm how exactly should that spec change look like to cover both the browsers and Node.js behaviour without an UB there.

@benjamingr
Copy link
Member

@ChALkeR well written.

I think a problematic point is that GC does not work for global objects, module level objects and other objects that are still referenced in places that can't be GC'd very quickly or at all. A single false negative could leave the server in inconsistent state and swallow an error. That's why I suggested "exit on GC, warn on TICK".

Note that in practice, I've done "exit on tick" and no library ever broke because of it, I've found it to be the case that no libraries "in the wild" really use this capability very much, and the few who do opt out explicitly (bluebird has a suppressUnhnaldedRejection and with native promises it's typically .catch(() => {}).

That said, I think the real path forward is spec change, and introducing synchronization contexts on top of Zones, this is similar to what C# and other languages with tasks and async/await do. If you're unfamiliar with Zones think "domains without the error handling bits".

  • When a promise is created in a Zone, it gets assigned to the Zone and added to a WeakMap (or similarly, to a map of pointers on the C++ side).
  • When a zone "ends" (nothing else is set to run on its context), any unhandled rejections attached to it cause the process to abort.
  • When the process exits, any promises attached to it that are unhandled are logged to stderr (or otherwise).
  • unhandledRejection is still made available to override the above functionality, but no longer logs on tick. Unhandled rejections in the global zone do log on TICK. GC still crashes the Zone.
  • It is impossible for an unhandled rejection to recover from the zone - even if a catch handler is added to the promise. This is far easier than exposing them via zone.exceptions or something like that.

Alternatively, allowing the user to set the promise scheduler solves all this too, but gives users a lot of power I'm not sure we should give them (this can be done at the platform level, without spec changes).

Splitting the app into zones this way allows for clear "context"s the library/framework defines for when something won't be handled.

Some examples:

  • An express/koa/hapi/restify app would likely set the zone on every request/response pair. So when the response ends any unhandled rejections throw.
  • A careful library would set its bounds in its own code.
  • A user would set zones wherever they make sense.

This is a little bit like domains, but without the domain problems with error recovery and handling - since we're not recovering from errors - just detecting errors people forgot to listen to.

@jasnell
Copy link
Member

jasnell commented May 9, 2017

@benjamingr ... while I have many concerns around zones, at an abstract level what you describe is certainly ideal... that is, the idea that some additional spec changes that provide clear hook points for all of this would be absolutely fantastic... and that is something that I can bring back to TC-39 to discuss. It would be excellent.

At this point, I think there are still way too many open questions and potential issues for us to reasonably come up with a reasonable default behavior. I think we need to take a more iterative approach. At the collaborator summit we talked about a flag driven option that would allow users to opt in. It would also allow us to experiment with various options to see what is the best strategy. So here's what I would recommend:

We add a --on-unhandled-rejection={action} command line argument where {action} can be one of: ignore, warn-on-tick, warn-on-gc, throw-on-tick, or throw-on-gc. With the default option being the current behavior of warning on gc.

I know that this is not an ideal solution because it kicks the ball down the road a bit, and adding multi-valued command line arguments is always irritating and causes it's own set of problems, but going with this approach would allow us collectively to experiment over a longer period of time with a range of possible strategies. It also gives us time to go back to TC-39 or the VM implementors to work through possible spec modifications that can provide a significantly better solution.

@addaleax
Copy link
Member

addaleax commented May 9, 2017

With the default option being the current behavior of warning on gc.

Huh – just to be clear, the current behavior is to warn on nextTick. And if we want to leave the option of either warning or crashing on nextTick open, then we should keep that, because it’s much easier to move the warning/crash to a later point in time than to an earlier one.

@jasnell
Copy link
Member

jasnell commented May 9, 2017

oh.. right sorry, I get those mixed up. Yes that, keep the current default but allow a flag to enable the other behaviors

@mcollina
Copy link
Member

mcollina commented May 9, 2017

@jasnell just do add one note: we also discussed about printing the stack trace of the offending exception as well. I would prefer to do so for every 'unhandledRejection'.

@benjamingr
Copy link
Member

@mcollina

@jasnell just do add one note: we also discussed about printing the stack trace of the offending exception as well. I would prefer to do so for every 'unhandledRejection'.

The stack trace of the offending exception is already printed if --trace-warnings is on (and it should probably be on), although if we want to change it to trace by default I'd be in favor.

@benjamingr
Copy link
Member

@jasnell I'm not sure what a command line switch would get us - users can define an "unhandledRejection" handler themselves.

The default behavior gave users a bad experience before warnings were added - I think they really helped. I also think that async/await seeing wide-scale adoption will change how users feel about this issue in the coming months.

I think adding a flag won't really solve much - I think Node should lead the push for solving scheduling and unhandled promise rejections with technology - I'll hopefully set up a prototype next month (super busy :( ) that:

  • Uses the Zone polyfill (https://github.com/angular/zone.js)
  • Whenever a promise is created (by subclassing Promise and then maybe overriding the global) - add the promise to the zone in an array.
  • Collect unhandledRejections in a WeakMap.
  • When a zone is "done" - if there are any promises in the zone that are in the map - throw and terminate the Node process.

@jasnell
Copy link
Member

jasnell commented May 9, 2017

What the command line switch does is provide a more opinionated view of what the options are. Yes users can handle unhandledRejection already, but the command line options would provide a more narrow focus on what we think the right things to do may be, rather than leaving it completely wide open. Having the multiple values allows us to experiment between very specific choices.

@benjamingr
Copy link
Member

the command line options would provide a more narrow focus on what we think the right things to do may be

I think providing multiple options is a recipe for confusion - saying "we're not sure what the right thing to do is, here are some options" in core is something I'd rather avoid. We're aware of better alternatives for the culture, and what we currently do works pretty well and solves the hardest problem (swallowing rejections) already for us.

I'm definitely in favor of throwing on GC and keeping the warnings on tick in place. I think it's good middle ground until we build synchronization contexts or schedulers to address this (what other platforms do).

@mcollina
Copy link
Member

@benjamingr

The stack trace of the offending exception is already printed if --trace-warnings is on (and it should probably be on), although if we want to change it to trace by default I'd be in favor.

That would print the stacktrace of the warning, not the original stacktrace that caused the warning. I propose to print the stacktrace of every error that reaches the unhandledRejection.

More or less, make this the default behavior, while maintaining the warning:

process.on('unhandledRejection', function (err) {
  console.log(err.stack);
})

Promise.reject(new Error('kaboom'))

@@ -673,6 +673,8 @@ asyncTest('Throwing an error inside a rejectionHandled handler goes to' +
tearDownException();
done();
});
// Prevent fatal unhandled error.
process.on('unhandledRejection', common.noop);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: common.mustCall() instead of common.noop?

@benjamingr
Copy link
Member

@mcollina yes, that makes sense

@mhdawson
Copy link
Member

I think discussion in the last CTC meeting was that we should pull together the group of interested people to start working on post-mortem/promises. I think @jasnell took the action to push that forward. I'm available to help with that at well if necessary.

@benjamingr
Copy link
Member

I would definitely be interested in being a member of such a group @jasnell

@medikoo
Copy link

medikoo commented Sep 12, 2017

Does anyone know the current status of this (and #12010)?

Is this going to be taken in at some point?

Situation is that by not having that, ecosystem is really broken. Just recently I approached a popular project (19k+ stars, 1.9k+ forks, 100+ contributors) which had failing tests in master and nobody knew about it. Purely because unhandled rejections are not exposed as exceptions.
More on that here: serverless/serverless#4139

@benjamingr
Copy link
Member

There is a newer one by @BridgeAR that's fairly recent #15126 (comment)

Contribution and involvement are welcome

@BridgeAR
Copy link
Member

@Fishrock123 I am closing this for now. I am going to proceed with my PR soon. If you would like to continue working on this, I would very much appreciate that as well (especially as C++ is hard for me - I learn it while working on the code).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
c++ Issues and PRs that require attention from people who are familiar with C++. lib / src Issues and PRs related to general changes in the lib or src directory. notable-change PRs with changes that should be highlighted in changelogs. process Issues and PRs related to the process subsystem. promises Issues and PRs related to ECMAScript promises. semver-major PRs that contain breaking changes and should be released in the next major version.
Projects
None yet
Development

Successfully merging this pull request may close these issues.