Skip to content
This repository has been archived by the owner on Oct 23, 2023. It is now read-only.

Breadcrumbs & context wrapping broken with native promises #265

Closed
adambiggs opened this issue Feb 3, 2017 · 18 comments
Closed

Breadcrumbs & context wrapping broken with native promises #265

adambiggs opened this issue Feb 3, 2017 · 18 comments

Comments

@adambiggs
Copy link

adambiggs commented Feb 3, 2017

*** Maintainer edit/resolution ***

This issue is resolved on Node.js 8.0.0, thanks to the following line item of the 8.0.0 changelog:

Native Promise instances are now Domain aware

Partial workarounds have been identified for version of Node prior to 8. Recommendation:

  • If you can, use Node.js 8 and this problem will disappear
  • If you can't use Node.js 8, use Bluebird promises instead of native promises
  • If you can't do either of those, you'll have to avoid relying on context() or wrap(); you can still use the global-scope context, which may be sufficient for certain use cases like Lambda

Original issue below:


I've been pulling my hair out trying to get breadcrumbs working in my Serverless Lambda services, but I can never seem to get it working. Like not even a little bit...

I've gone over the docs multiple times, and it seems like I'm doing everything as recommended. At this point I don't know what else to try.

Background

  • Code written in ES6
  • Built with Webpack v1 via serverless-webpack plugin
  • Compiled using babel-loader
  • Auto-breadcrumbs are disabled
  • Using latest v1.1.1 of raven-node

Example Code

Here's a simplified example of one of my functions.

// Lambda function handler
import Raven from 'raven'
Raven.config('my-dsn', { /* my config */ }).install()

import MyClass from './my-class'

export const handler = Raven.wrap((event, context, callback) => {
  new MyClass(event, callback).startDoingStuff()
})
// my-class.js
import Raven from 'raven'
import AWS from 'aws-sdk'
const ddbClient = new AWS.DynamoDB.DocumentClient()

export default class MyClass {
  constructor (event, callback) {
    this.event = event
    this.callback = callback
    Raven.setContext({ extra: this.event })
  }

  startDoingStuff () {
    Raven.captureBreadcrumb({ message: 'started doing stuff' })
    return this.doTheStuff()
      .then(this.itWorked.bind(this))
      .catch(this.itBroke.bind(this))
  }

  doTheStuff () {
    return new Promise((resolve, reject) => {
      ddbClient.query({ /* query params */ }, (error, data) => {
        if (error) {
          Raven.captureBreadcrumb({ message: 'lol nope' })
          return reject(error)
        }
        resolve(data)
      })
    })
  }

  itWorked (data) {
    this.callback(null, data)
  }

  itBroke (error) {
    Raven.captureException(error, (sentryError, eventId) => {
      this.callback(error)
    })
  }
}

Results

If something goes wrong in doTheStuff(), the events that get logged to Sentry don't have any of the extra context I tried setting in the constructor, and don't have any breadcrumbs other than the final exception...

Also, my Lambda logs on AWS are full of entries like [email protected] alert: getContext called without context; this may indicate incorrect setup - refer to docs on contexts.

So it seems like a problem with the Node.js domain API. Maybe because raven-node seems to rely on implicit binding?

I'm not really familiar with the domain API, but if a different library also tried to use domains (say maybe the AWS SDK), would changing the active domain break raven-node since it relies on domain.active.sentryContext? IDK if that's even how it works...

Anyway, is there a way to disable node domains all together and just run everything in a global context? Domains aren't really useful in an AWS Lambda environment IMO (also, they're deprecated).

@LewisJEllis
Copy link
Contributor

LewisJEllis commented Feb 3, 2017

Hey @adambiggs, thanks for opening this. I also created a proof-of-concept example here demonstrating our failure to carry context across promise resolution boundaries with some comments on exactly what's going on.

In short, our contexts don't carry across promises like we'd like them to, and that's mostly the source of the problem you're running into. With the additional background you've provided here, I think I think I can propose the following:

  • We accept/merge the change in capture breadcrumbs from global context if not in active context #252 to make globally-scoped breadcrumbs be captured
  • We make any other changes necessary to stop treating the global context as a thing that should be avoided
  • You can just throw everything in the global context and avoid using wrap/context methods (which lets you dodge the issue of contexts not propagating across promise resolution boundaries)
  • Your breadcrumbs should start working due to bullet 1 and you should stop seeing strange warning messages due to bullet 2
  • Especially in the Lambda use case, leaning on global context isn't a big deal since we don't have to worry about juggling multiple contexts anyway: each lambda run is sort of a one-off execution

Longer term - I might be able to come up with a clever way to make our contexts propagate through Promises (initial impression: cautiously optimistic but need to experiment to see how viable, will investigate), or we might just note that our contexts don't work with Promises until async_hooks (upcoming Node core feature) lands and we migrate our context implementation to be based on async_hooks instead of based on domains.

I'll write more on our take on domains being noted as "deprecated" and our usage of them in #264. The implicit binding bit is related but a little bit of a separate topic - there's a sort of rift between promises and all other async stuff in Node because of how the V8 microtask queue works. It's an ongoing issue in Node core that we're following, but maybe in the meantime we can find a better workaround than what we have now (just being broken).

@adambiggs
Copy link
Author

Thanks for the quick reply @LewisJEllis. And thanks for the validation that I'm not going crazy!

All your points seem sensible, and it looks like merging #252 would fix my problems. As a user, global context is what I would expect as default behaviour anyway, and wrap/context would seem more of an "advanced" use case instead of a best-practice (especially given the current async gotchas).

@LewisJEllis
Copy link
Contributor

@adambiggs it just occurred to me while writing up my response to #264 that if you use Bluebird instead of native promises (just throw var Promise = require("bluebird"); at the top of your script), everything should work for you. It's obviously not ideal to introduce an additional dependency, but bluebird is quite reliable and it's probably the easiest solution. It's also a viable solution for people who aren't in your fortunate position of being okay with using only the global context.

Note: I'm still going to make the changes I described here/in #252.

@LewisJEllis LewisJEllis changed the title Breadrumbs & context wrapping completely broken? Breadcrumbs & context wrapping broke with native promises Feb 4, 2017
@LewisJEllis LewisJEllis changed the title Breadcrumbs & context wrapping broke with native promises Breadcrumbs & context wrapping broken with native promises Feb 4, 2017
@LegNeato
Copy link

LegNeato commented Feb 6, 2017

FWIW I believe I need this outside of the Lambda use-case. Because I use async/await in scripts w/ a top-level native promise, if I want to attach breadcrumbs in those scripts I need something like a global context or native promise support fixed. As it is, I am seeing the same (non-attach) behavior as this issue, even though I am using the express middleware.

I can also confirm using bluebird in the top-level script causes the breadcrumbs to be attached. This stopgap is unworkable though, as it doesn't appear to work in the express middleware case.

@adambiggs
Copy link
Author

@LewisJEllis thanks for the tip, maybe bluebird it can hold us over until #252 is ready. But bluebird isn't ideal as a long-term solution since we're trying to keep our Lambda code as light-weight and dependency-free as possible.

@LewisJEllis
Copy link
Contributor

LewisJEllis commented Feb 9, 2017

I've released v1.1.2 with the changes I described a few comments above; should help @adambiggs's case.

I've also realized that trying to automagically work with native promises used by any other dependencies you might have isn't going to be feasible, and it's the same deal with async/await. It'd be one thing if Promises were a node core library and we could piggyback on module caching, but since they're a global-scope native, the scope of what we can pull off via monkeypatching is much more limited. I don't think there's a good solution for native promises or async/await until async_hooks lands in node core and we can migrate as described in #264, so unfortunately I think we'll just have to point people to the bluebird workaround and hope for the best :/

@LegNeato I'm not sure exactly what the "bluebird with express middleware" case you're describing is, but it sounds like something we might be able to get working. Can you post an example snippet?

@LegNeato
Copy link

LegNeato commented Feb 9, 2017

@LewisJEllis Even though I am using the express middleware (and thus have a sentry context), because I use async/await with native promises (?) breadcrumbs don't seem to be working. Even when I use bluebird in the module that creates the express app, I still don't get breadcrumbs. Interestingly, when I set the user context it does get sent up with the exception.

Breadcrumbs only appear to work when I use bluebird in every module used in the request...which is painful as I have hundred of modules. For production I can probably have webpack inject bluebird promises everywhere, but in development we use babel-node directly.

I'll try to distill my code into an example snippet.

@LewisJEllis
Copy link
Contributor

LewisJEllis commented Feb 9, 2017

If you're using async/await, it's not gonna be able to work - async/await runs on top of native promises, so even if you include bluebird I wouldn't expect it to help. I don't think there's an easy way to make async/await run on top of bluebird instead; maybe through some goofy babel/build setup to have it shim promises and then compile async/await in terms of that, but I'm not really sure.

@LegNeato
Copy link

LegNeato commented Feb 9, 2017

Ah, yeah. That makes sense why it worked for some scripts using async/await...I use a top-level promise and that is where I was testing the bluebird workaround.

Looks like I may be able to use https://github.com/chpio/babel-plugin-transform-promise-to-bluebird. Pretty big bummer to not be able to use native promises though.

@wearhere
Copy link

Does this mean that the Express middleware does not work with async/await/native promises? i.e. will this work?

var app = require('express')();
var Raven = require('raven');

Raven.config('__DSN__').install();
app.use(Raven.requestHandler());

// async function now! So it catches the error and returns/rejects a promise.
app.get('/', async function mainHandler(req, res) {
    throw new Error('Broke!');
});

// Will this + `requestHandler` handle `mainHandler`'s return value's rejection, or no?
app.use(Raven.errorHandler());

app.listen(3000);

If it doesn't work, is Sentry aware of some other technique for adding a "global" rejection handler to log the error and return a 500?

@wearhere
Copy link

wearhere commented Apr 8, 2017

It looks like mainHandler's rejection will not be handled by default. What do the Sentry folks think of this technique for patching express' Layer object to catch route handler rejections? From here.

@LewisJEllis
Copy link
Contributor

LewisJEllis commented Apr 9, 2017

Oops, didn't see your earlier comment. No, I wouldn't expect that to work unless express automatically routes that rejected promise's rejection "reason" to the errorHandler middleware, which doesn't seem to be the case.

That patching technique seems sound and viable at first glance/understanding, but I would have to experiment with it and better understand exactly how promises/async functions interact with express and its layer/middleware stuff. We could potentially have a raven config option to enable that sort of patching, but we'd have to think about the right way to do it so as to affect behavior as little as possible and keep it easy for users to understand what to expect when they turn it on.

Unfortunately that doesn't help much with the native promises situation, but I did recently see hapijs/lab#695 with an idea for binding the active domain by just patching Promise.prototype.then - I don't think it'll work 100%, but it would probably make a lot of common cases work like we'd want. Something definitely worth revisiting once I finish some higher-priority/lower-hanging things.

@wearhere
Copy link

Hey @LewisJEllis, thanks for your response.

Unfortunately I do not think that hapijs/lab#695 is of much use because it only handles uncaught exceptions, not rejections:

const Raven = require('raven');

Raven.config('__DSN__').install();

const app = require('express')();

app.use(Raven.requestHandler());

/* Insert the patch from https://github.com/hapijs/lab/issues/695 here */

app.get('/reject', async () => {
  // This request will hang (unless you have this patch: https://gist.github.com/q42jaap/f2fb93d96fda6384d3e3fc51977dec90)
  throw new Error('boo');
});

app.get('/reject-in-promise', () => {
  // This request will hang.
  new Promise(() => {
    throw new Error('boo');
  });
});

app.get('/reject-in-then', () => {
  // This request will hang.
  Promise.resolve().then(() => {
    throw new Error('boo');
  });
});

app.get('/reject-in-then-async', () => {
  // Only this response will terminate.
  Promise.resolve().then(() => {
    setTimeout(() => {
      throw new Error('boo');
    });
  });
});

app.use(Raven.errorHandler());

app.use((err, req, res, next) => {
  res.status(500).send(err.message);
});

requestHandler's domain doesn't emit errors in any case except requests to /reject-in-then-async, presumably because all the other errors are caught—by Promise, internally, only then to be reported as unhandled rejections.

I say that hapijs/lab#695 isn't of much use because it's not adding support for promises per se, it's adding support for mixing callbacks with promises—but I'm looking to move my code over to use promises pretty much entirely; and I don't really plan to chain promises using then because I'll be awaiting them.

So for me, I think the Express patch is pretty much what I need because I expect all uncaught exceptions to bubble up to the top layer via await. And then I'll probably shut down the process on 'unhandledRejection', like I currently do on 'uncaughtException', to make sure I'm not missing anything.

I understand and appreciate your concern to do what's right! At the same time, I hope you do investigate this situation more in the near term as I would imagine that more and more of the ecosystem will begin migrating towards the use of promises since Node 7.6.0 shipped with async/await support.

@ORESoftware
Copy link

ORESoftware commented Apr 18, 2017

@wearhere yeah I wrote that Promise patch to support domains. I don't follow the previous conversation 100% and my use case might be different then yours. But given what you said, I think it's better to rely on the Promise API for error-handling. Using domains (for servers) is IMO only for a last ditch effort to pin an error to a particular request and send an error response.

If you don't have domains, then you pretty much have to shut down a server, upon an uncaughtException, otherwise leaks everywhere, because you will likely not be able to send a response for the request that caused the error.

Here is how I use domains in production servers:

const domain = require('domain');

app.use(function(req, res, next){

   const d = domain.create();   // create a new domain for this particular request
   d.once('error', makeHandleError(res));
   d.run(next);

});

Your handleError fn would look like

function makeHandleError(res){

  return function(err){

     // log the error here, and see your stack trace

     if(!res.headersSent){
        /// Send the error response how you like
      }

   }

}

IMO, domains are only a latch ditch effort to catch errors. I never saw them as reliable enough to catch all errors. Use the Promise API to do 99.99% percent of that.

The Promise.prototype.then patch will help you catch the weirdest of errors...but again domains should only be used as a last line of defense.

On the other hand, for applications that are not servers, for example test frameworks, domains are more useful, TapJS, Suman, Lab, all use them, for good things.

@idris
Copy link

idris commented Jun 3, 2017

Good news! The Node.js 8.0.0 release notes say:

Native Promise instances are now Domain aware [84dabe8373] #12489.

I think that means you don't need to patch Promise.prototype or use Bluebird anymore

@LewisJEllis
Copy link
Contributor

LewisJEllis commented Jun 7, 2017

Yes, it does mean that; I was quite happy to see that note. The example I made in my first comment behaves differently on Node.js 8 - the key equality comparison now prints false instead of true! 8.x also brings lots of other exciting possibilities that will be coming up in raven-node 3.0 :)

I'm going to close this with a suggested resolution of:

  • If you can, use Node.js 8 and this problem will disappear
  • If you can't use Node.js 8, use Bluebird promises instead of native promises
  • If you can't do either of those, you'll have to avoid relying on context() or wrap(); you can still use the global-scope context, which may be sufficient for certain use cases like Lambda

I also added this suggestion to the top of the original comment.

@WanLinLin
Copy link
Contributor

I think this Node.js 8 is required when using Promise prerequisite should be added to README

@kamilogorek
Copy link
Contributor

@WanLinLin feel free to open a PR if you mind adding it :)

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

8 participants