Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Nodejs10.x Error: Missing AWS Lambda trace data for X-Ray. #143

Closed
EGrun opened this issue Jun 18, 2019 · 35 comments
Closed

Nodejs10.x Error: Missing AWS Lambda trace data for X-Ray. #143

EGrun opened this issue Jun 18, 2019 · 35 comments
Labels

Comments

@EGrun
Copy link

EGrun commented Jun 18, 2019

After updating to the 'nodejs10.x' runtime environment, I am now seeing this error message at the start of the cloudwatch logs:

2019-06-14T21:24:54.713Z	undefined	ERROR	Error: Missing AWS Lambda trace data for X-Ray. Expected _X_AMZN_TRACE_ID to be set.
at Object.contextMissingLogError [as contextMissing] (/var/task/handler.js:37802:19)
at Segment.resolveLambdaTraceData (/var/task/handler.js:167833:43)
at Object.getSegment (/var/task/handler.js:37870:17)
at Object.resolveSegment (/var/task/handler.js:37849:19)
at captureOutgoingHTTPs (/var/task/handler.js:94553:31)
at Object.captureHTTPsRequest [as request] (/var/task/handler.js:94638:12)
at Promise (/var/runtime/RAPIDClient.js:91:31)
at new WrappedPromise (/var/task/handler.js:161080:18)
at RAPIDClient.nextInvocation (/var/runtime/RAPIDClient.js:90:12)
at Runtime.handleOnce (/var/runtime/Runtime.js:52:51)

We are using, AWSXRay.captureHTTPsGlobal(require('http')); to capture all http requests and this is the source of the error. It is also capturing all calls to 127.0.0.1 as part of the runtime event loop.

A fix was to move the captureHTTPsGlobal command inside of the lambda function function handler. Before, it was running from the global module level. We are still getting x-ray results for calls to localhost for the lambda runtime environment. Is there a way to ignore requests to 127.0.0.1?

@awssandra
Copy link
Contributor

Hi EGrun,

Lambda does not have the X-Ray header until the function handler is run. You should be able to put AWSXRay.captureHTTPsGlobal(require('http')) outside your handler, but using the http client will have to be in the function handler code.

https://docs.aws.amazon.com/lambda/latest/dg/downstream-tracing.html

With captureHTTPsGlobal, the http library itself is patched to capture on every http client initialized. There is not a concept to opt out matching request URLs today. However, you can use captureHTTPs, to enable tracing on a single http client instance, where ever you need the call captured.

Hope this helps,
Sandra

@Imran99
Copy link

Imran99 commented Jul 26, 2019

Hi Guys,

I'm having the same issue but I noticed we have closed this one and #701 is only a partial fix.

You should be able to put AWSXRay.captureHTTPsGlobal(require('http')) outside your handler, but using the http client will have to be in the function handler code.

This works in node 8.10 but not node 10.x. The exact same code for node 10.x is giving me Error: Missing AWS Lambda trace data for X-Ray. Expected _X_AMZN_TRACE_ID to be set.

Im using the captureHTTPglobal code from the docs and im only using the http client inside the handler.

Can we reopen this as its preventing us from upgrading to node10.

@hotgazpacho
Copy link

By any chance, are you enabling http keep-alives for the AWS SDK, per https://theburningmonk.com/2019/02/lambda-optimization-tip-enable-http-keep-alive/

We're doing this in a single file that we include for every handler, and we see this issue on cold starts.

@Imran99
Copy link

Imran99 commented Aug 27, 2019

@hotgazpacho yup we are, so I guess thats partly the issue but for some reason it did not cause a problem on node 8.

@hotgazpacho
Copy link

@Imran99 trying to maybe help diagnose here...
In what order do you set up:

  • require('https')
  • https.agent setup
  • require('aws-xray-sdk-core')
  • AWSXRay.captureHTTPsGlobal

My thinking here is that the order that these things happen is important.

@awssandra awssandra reopened this Aug 27, 2019
@Imran99
Copy link

Imran99 commented Aug 28, 2019

Thanks @hotgazpacho we have two internal npm packages that do it in this order:

normal aws sdk tracing package:

  • require('aws-xray-sdk-core')
  • require('aws-sdk')
  • require('https')
  • https.agent setup

and another http client with tracing package that does:

  • require('aws-xray-sdk-core')
  • require('http')
  • AWSXRay.captureHTTPsGlobal

@hotgazpacho
Copy link

hotgazpacho commented Sep 6, 2019

So, I initialize thusly:

const AWSXRay = require('aws-xray-sdk-core');
const AWS = AWSXRay.captureAWS(require('aws-sdk'));
AWSXRay.captureHTTPsGlobal(require('https'));

function getKeepAliveAgent() {
  const options = {
    keepAlive: true,
    maxSockets: 50, // from aws-sdk
    rejectUnauthorized: true, // from aws-sdk
  };
  const agent = new https.Agent(options);
  agent.setMaxListeners(0); // from aws-sdk
  return agent;
}

AWS.config.update({
    httpOptions: {
      agent: getKeepAliveAgent(),
    },
  });

Where I'm seeing this error is during cold starts of my Lambda function, wherein I use SecretsManager outside of the handler in the global scope to get the secrets config once and keep it around for the lifetime of the function instance. This global initialization is a pretty common scenario in Lambda.

@Shahab96
Copy link

Shahab96 commented Oct 9, 2019

Are there any updates on this issue? We are experiencing this as well and with Node 8 being EoL at year end this would become a problem

@willarmiros
Copy link
Contributor

willarmiros commented Oct 9, 2019

We are actively looking into this issue and discussing it with the Lambda team. Unfortunately this change would be a fairly substantial refactor for them. In the meantime, we recommend using the non-global HTTP capture API.

If Lambda is unable to complete the change, we will consider adding some kind of string matching to not trace these requests in the SDK. The issue with this is 1) It is an issue with Lambda so that feature would need to be replicated across all X-Ray SDKs and 2) special-case not-capturing with string matching is not exactly a best practice, so we'd like to avoid it if possible. Hope this clarifies some confusion about this issue.

@tqhoughton
Copy link

So basically we have to choose between being able to update Lambda functions or being able to use X-Ray.

I think it's obvious what most organizations are going to choose. Can you ask the Lambda team if they can extend the deadline to upgrade Lambda functions to Node 10.x until this is resolved?

@willarmiros
Copy link
Contributor

Lambda has told us a potential solution is underway. I will update this thread once it is released. If the solution does not work, our team will add the filtering logic to ignore these internal calls before Node 8 EOL is reached.

@willarmiros
Copy link
Contributor

Lambda has released what should be a fix for this problem globally! Please follow up if this issue persists, otherwise I'll close this issue.

@kwongkz
Copy link

kwongkz commented Nov 23, 2019

Now actually can directly use env - AWS_NODEJS_CONNECTION_REUSE_ENABLED to 1, no longer need these couple of lines of code New way to keepalive

@Imran99
Copy link

Imran99 commented Nov 27, 2019

I've noticed that the errors have disappeared but I'm not seeing http calls in traces at all anymore, could be doing something wrong on our end haven't had a chance to investigate yet. Anyone else seeing this?

@willarmiros
Copy link
Contributor

Hi @Imran99,
If you are still not seeing HTTP traces please follow up in a separate issue. I'm going to close this issue since no one else seems to be experiencing it anymore.

@chearmstrong
Copy link

@willarmiros I'm getting Error: Missing AWS Lambda trace data for X-Ray. Expected _X_AMZN_TRACE_ID to be set. when upgrading to version 3.0.0-alpha.1.

I'm using Node v12, and I'm using HTTP keep-alive.

I didn't get it with version 2.5.0-experimental.1 btw.

@willarmiros
Copy link
Contributor

Hi @chearmstrong,
I'm not able to reproduce this error on a Lambda running node 12 and the X-Ray SDK 3.0.0-alpha.1. I initialized my HTTPS client in the same way that @hotgazpacho did and made a request with it and everything ran & was instrumented as expected.

Can you please post a snippet that reproduces this issue?

@willarmiros willarmiros reopened this Mar 2, 2020
@chearmstrong
Copy link

chearmstrong commented Mar 5, 2020

Hi @willarmiros - my code is something like this.

const __AWS__ = require('aws-sdk')
const AWSXRay = require('aws-xray-sdk')
// ...require some other stuff

const AWS = AWSXRay.captureAWS(__AWS__)

// ...setup xray
AWSXRay.captureHTTPsGlobal(require('http'))
AWSXRay.captureHTTPsGlobal(require('https'))
AWSXRay.capturePromise()

// ... other code/helper functions

// ...lambda handler
const handler = async (event, context = {}) => {
  return AWSXRay.captureAsyncFunc('handler', subSegment => {
    subSegment.addMetadata('body', event.body || {}, 'request')
    subSegment.addMetadata('headers', event.headers || {}, 'request')
    subSegment.addAnnotation('userId', getUserId(event))

    return handleEvent(event)
      .then(response => {
        subSegment.addMetadata('body', response, 'response')
        subSegment.close()

        return getSuccessResponse(response)
      })
      .catch(error => {
        subSegment.addAnnotation('error', error.message || 'UNKNOWN')
        subSegment.close(error)

        return getErrorResponse(error)
      })
  })
}

Worth nothing - I'm only seeing it on cold starts too.

@willarmiros
Copy link
Contributor

Hi @chearmstrong,

I resolved the issue by moving AWSXRay.capturePromise() into the function handler, immediately before your return statement. Feel free to reopen if you still have issues.

@chearmstrong
Copy link

@willarmiros Sorry for the delay - I will give this a try, thanks.

@awilson28
Copy link

@willarmiros I'm experiencing the same issue - why is it necessary to move AWSXRay.capturePromise()?

@cluedtke
Copy link

cluedtke commented Apr 8, 2020

This remains an issue for us, even after moving AWSXRay.captureHTTPsGlobal(require('https')); into the handler.

Versions of aws-xray-sdk-core tried:

  • 2.5.0
  • 3.0.0-alpha.1

Before -- we see "Missing AWS Lambda trace data for X-Ray"

const { ApolloServer } = require("apollo-server-lambda");
const xray = require('aws-xray-sdk-core');
xray.captureHTTPsGlobal(require('https'), false);

// ... other stuff

const server = new ApolloServer({ /** config **/ })
exports.handler = server.createHandler();

After -- we still see "Missing AWS Lambda trace data for X-Ray"

const { ApolloServer } = require("apollo-server-lambda");
const xray = require('aws-xray-sdk-core');

// ... other stuff

const server = new ApolloServer({ /** config **/ })
const createHandler = function() {
  const handler = server.createHandler();
  xray.captureHTTPsGlobal(require('https'), false);
  return handler;
}

exports.handler = createHandler();

@chearmstrong
Copy link

I resolved the issue by moving AWSXRay.capturePromise() into the function handler, immediately before your return statement. Feel free to reopen if you still have issues.

This worked for me, but only with v3. Thanks.

@aripalo
Copy link

aripalo commented May 29, 2020

I am using node-fetch with X-Ray SDK and could not get it working initially.

Finally #143 (comment) solved it, but that seems a bit "annoying trick" 😅 Moving this kind of "initialization code" in the end of handler just feels a bit wrong to me.

@aripalo
Copy link

aripalo commented May 29, 2020

Well… since I'm using Middy I refactored this into middleware so I don't feel so dirty 🙂

In case anyone's interested:

// middlewares/instrument.ts
import { Handler } from 'aws-lambda'
import { NextFunction } from "middy";
const AWSXRay = require('aws-xray-sdk');

const instrumentationMiddleware = {
  after: (handler: Handler, next: NextFunction) => {
    AWSXRay.capturePromise();
    next();
  },
}

export default instrumentationMiddleware;
// index.ts
import { APIGatewayEvent, Context, ProxyResult } from 'aws-lambda';
const wrap = require('@dazn/lambda-powertools-pattern-basic');
import instrumentationMiddleware from "./middlewares/instrument";

export const handler = wrap(async (event: APIGatewayEvent, context: Context): Promise<ProxyResult> => {
  /* ... */
  return response;
})
.use(instrumentationMiddleware);

@padzikm
Copy link

padzikm commented Nov 23, 2020

Hi, can you explain why capturePromise() should be inside lambda handler or otherwise it throws 'Missing AWS Lambda trace data for X-Ray. Expected _X_AMZN_TRACE_ID to be set' error (only during lambda's cold start)? In aws doc (https://docs.aws.amazon.com/xray/latest/devguide/xray-sdk-nodejs-httpclients.html) there is no mention of need for capturePromise invocation, but in aws-xray for node readme (https://github.com/aws/aws-xray-sdk-node/tree/master/packages/core#capture-all-outgoing-axios-requests) there is example for capturing axios with capturePromise invoked on the top of a module, which is opposite to this issue's solving. According to this issue's solution capturePromise should be called at the end of a handler, but capturePromise patches promise's then method, so in my opinion it should be called right at the beginning of a handler to capture all requests inside handler - why you recommend it to be called at the end? It's a bit confusing, so can you please explain what is correct way of capturing http(s) requests and why?

@willarmiros
Copy link
Contributor

Hi @padzikm,

This issue has gotten fairly cluttered with tangentially related problems that we attempted to fix, but the original issue that you're experiencing turns out to still be unresolved, as pointed out in #315. We have investigated with the Lambda team and we've found the issue to be a bug in the SDK. I will be working on getting a fix out and released as soon as possible. I will post further updates in #315.

@DavidTanner
Copy link
Contributor

The problem with calling things like capturePromises is that it is adding another function to the .then. Depending on how many calls a lambda gets, it could end up adding a lot of functions to the Promise prototype slowing things down.

function patchPromise(Promise) {
  var then = Promise.prototype.then;
  Promise.prototype.then = function(onFulfilled, onRejected) {
    if (contextUtils.isAutomaticMode()
      && tryGetCurrentSegment()
    ) {
      var ns = contextUtils.getNamespace();

      onFulfilled = onFulfilled && ns.bind(onFulfilled);
      onRejected = onRejected && ns.bind(onRejected);
    }

    return then.call(this, onFulfilled, onRejected);
  };
}

https://github.com/aws/aws-xray-sdk-node/blob/aws-xray-sdk%403.1.0/packages/core/lib/patchers/promise_p.js#L12-L26

@frosas
Copy link

frosas commented Jan 12, 2021

(commenting here as I'm not sure this capturePromise() issue is related to #315)

As others already stated, it doesn't look correct to have to call capturePromise() within the handler in order to work around that function triggering the misleading "context missing" message when a Promise then() is called outside the handler.

Would it be possible to not have this error message logged under this scenario instead?

@stale
Copy link

stale bot commented Jun 11, 2021

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs in next 7 days. Thank you for your contributions.

@stale stale bot added the stale label Jun 11, 2021
@stale stale bot closed this as completed Jun 26, 2021
@shellscape
Copy link

Not stale, still relevant.

@Dgarc359
Copy link

Dgarc359 commented Nov 18, 2021

Experiencing the same issue with a middy implementation using AWS Xray. If I'm following my cloudwatch logs correctly, it seems that the missing _X_AMZN_TRACE_ID is only logged as missing during a cold start invocation of my lambda. The curious thing is, is that the error doesn't have any actual discernible impact on the logic we're using (Which is: to create an xray trace url from the _X_AMZN_TRACE_ID in the lambda's env vars). The url is correctly created, so I believe that the error is coming about as a side effect of some sort when calling .captureAWS() (this is just a guess, I'm in the process of debugging but I have no real clue where the issue is actually coming from)

If it makes any difference at all, the project which builds the dependencies for the mono repo is using lerna and aws cdk. This issue hasn't popped up in implementations using middy on lambdas bundled using serverless. Currently, I have only seen the issue crop up in lambda's created using nodejs construct from aws-lambda-nodejs. The only other notable difference I have found is that these assets are being bundled with esbuild as opposed to the native bundling used for assets in the aws-lambda module.

All this is to say, this issue is still persisting, but it's not having an impact on the business logic (that I can find). I would love to get to the bottom of /why/ it's happening, but since it's not seemingly having a noticable effect, I'm left wondering if there is a simple way to silence the log for the missing trace_id.

@willarmiros
Copy link
Contributor

so I believe that the error is coming about as a side effect of some sort when calling .captureAWS()

@Dgarc359 Do you use any AWS SDK clients to make requests in your initialization (cold start) code? If so, that is the reason you're receiving the error. captureAWS is trying to instrument those requests, but the _X_AMZN_TRACE_ID env var is not set for cold start code.

@Dgarc359
Copy link

@willarmiros Here is the full stack trace of the error

2021-11-22T18:54:21.378Z	undefined	ERROR	Error: Missing AWS Lambda trace data for X-Ray. Expected _X_AMZN_TRACE_ID to be set.
    at Object.contextMissingLogError [as contextMissing] (/opt/nodejs/node_modules/aws-xray-sdk-core/lib/context_utils.js:26:19)
    at Segment.resolveLambdaTraceData (/opt/nodejs/node_modules/aws-xray-sdk-core/lib/env/aws_lambda.js:93:43)
    at Object.getSegment (/opt/nodejs/node_modules/aws-xray-sdk-core/lib/context_utils.js:94:17)
    at Object.resolveSegment (/opt/nodejs/node_modules/aws-xray-sdk-core/lib/context_utils.js:73:19)
    at features.constructor.captureAWSRequest [as customRequestHandler] (/opt/nodejs/node_modules/aws-xray-sdk-core/lib/patchers/aws_p.js:66:29)
    at features.constructor.addAllRequestListeners (/opt/nodejs/node_modules/aws-sdk/lib/service.js:279:34)
    at features.constructor.makeRequest (/opt/nodejs/node_modules/aws-sdk/lib/service.js:203:10)
    at features.constructor.svc.<computed> [as describeTable] (/opt/nodejs/node_modules/aws-sdk/lib/service.js:685:23)
    at main (/node_modules/dynamoose/lib/aws/ddb/internal.ts:30:29)
    at func (/node_modules/dynamoose/lib/Model/index.ts:102:60)
    at getTableDetails (/node_modules/dynamoose/lib/Model/index.ts:111:10)
    at check (/node_modules/dynamoose/lib/Model/index.ts:172:33)
    at /node_modules/dynamoose/lib/Model/index.ts:190:3
    at new WrappedPromise (/opt/nodejs/node_modules/async-listener/es6-wrapped-promise.js:13:18)
    at /node_modules/dynamoose/lib/Model/index.ts:167:30
    at /node_modules/dynamoose/lib/Model/index.ts:344:41

I believe that what's happening here is that dynamoose is being used to call dynamoDB, and I believe our Middy implementation is adding the .captureAWSRequest [as customRequestHandler] part of that stack. It seems there is some interaction between our middy code, xray, dynamoose, and cold starts that hasn't been accounted for. I'm guessing your suggestion would be to define some behavior in which the _X_AMZN_TRACE_ID isn't attempted to be accessed during cold start?

@Dgarc359
Copy link

I resolved my issue. I'm not going to say I 100% understand the underlying interactions that were causing the error to be thrown up, but from my understanding, it seems like Dynamoose during cold start code would try to make a connection to a DynamoDB instance, and our middy implementation was attempting to capture that request and pull the trace id. Like you said, this would cause an issue due to the env var not being there during cold starts.

The issue was resolved when I declared our middy code as an external module in the bundling prop of the node JS lambda construct. I was originally modifying esbuild to declare our middy code as external, which, to me at least, does the same thing, but it seems like there's something happening under the hood when you declare an external module in the aws-cdk construct as opposed to the actual esbuild node module that was resolved when declaring the module in the construct itself. Can't say more without having a more holistic understanding of the underlying code though

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests