Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Traced aws client should allow requests to proceed without Segments present #54

Closed
NZKoz opened this issue Apr 23, 2018 · 6 comments
Closed

Comments

@NZKoz
Copy link

NZKoz commented Apr 23, 2018

Wherever possible we try to include XRay tracing in our common wrapper libraries so that we can gradually benefit from tracing as more and more of our services and consumers start preparing segments. This works great with the http client, however the AWS instrumentation really doesn't like being passed requests which don't have segments embedded.

Ideally the instrumentation would be sufficiently unobtrusive that it silently continued when there is no active segment.

We did investigate embedding our own segments manually elsewhere but had the same issue reported in #51

Specifically endSubsegment assumes there's a segment in flight:

func endSubsegment(r *request.Request) {
seg := GetSegment(r.HTTPRequest.Context())
seg.Close(r.Error)
r.HTTPRequest = r.HTTPRequest.WithContext(context.WithValue(r.HTTPRequest.Context(), ContextKey, seg.parent))
}

And some of the hooks don't handle it at the beginning of the requests:

var xRayBeforeValidateHandler = request.NamedHandler{
Name: "XRayBeforeValidateHandler",
Fn: func(r *request.Request) {
ctx, opseg := BeginSubsegment(r.HTTPRequest.Context(), r.ClientInfo.ServiceName)
opseg.Namespace = "aws"
marshalctx, _ := BeginSubsegment(ctx, "marshal")
r.HTTPRequest = r.HTTPRequest.WithContext(marshalctx)
r.HTTPRequest.Header.Set("x-amzn-trace-id", opseg.DownstreamHeader().String())
},
}

By contrast, other parts of the codebase assume that missing segments can happen, and check for nil when working with segments.

aws-xray-sdk-go/xray/aws.go

Lines 178 to 180 in ddbaf76

if curseg == nil {
return
}

The specific crash we get is:

 panic: runtime error: invalid memory address or nil pointer dereference
 [signal SIGSEGV: segmentation violation code=0x1 addr=0xd8 pc=0x8eb5d5]
 
 goroutine 1 [running]:
 github.com/vend/aconsumer/vendor/github.com/aws/aws-xray-sdk-go/xray.glob..func1(0xc4203b4800)
 	/home/travis/gopath/src/github.com/vend/aconsumer/vendor/github.com/aws/aws-xray-sdk-go/xray/aws.go:53 +0xb5
 github.com/vend/aconsumer/vendor/github.com/aws/aws-sdk-go/aws/request.(*HandlerList).Run(0xc4203b4938, 0xc4203b4800)
 	/home/travis/gopath/src/github.com/vend/aconsumer/vendor/github.com/aws/aws-sdk-go/aws/request/handlers.go:195 +0x9d
 github.com/vend/aconsumer/vendor/github.com/aws/aws-sdk-go/aws/request.(*Request).Build(0xc4203b4800, 0x4298c9, 0x8)
 	/home/travis/gopath/src/github.com/vend/aconsumer/vendor/github.com/aws/aws-sdk-go/aws/request/request.go:355 +0x68
 github.com/vend/aconsumer/vendor/github.com/aws/aws-sdk-go/aws/request.(*Request).Sign(0xc4203b4800, 0xa89bb8, 0xc4203b4800)
 	/home/travis/gopath/src/github.com/vend/aconsumer/vendor/github.com/aws/aws-sdk-go/aws/request/request.go:376 +0x2f
 github.com/vend/aconsumer/vendor/github.com/aws/aws-sdk-go/aws/request.(*Request).Send(0xc4203b4800, 0x0, 0x0)
 	/home/travis/gopath/src/github.com/vend/aconsumer/vendor/github.com/aws/aws-sdk-go/aws/request/request.go:483 +0x132
 github.com/vend/aconsumer/vendor/github.com/aws/aws-sdk-go/service/s3/s3manager.(*uploader).singlePart(0xc42065e120, 0xb06700, 0xc42038d740, 0x301, 0x0, 0x0)
 	/home/travis/gopath/src/github.com/vend/aconsumer/vendor/github.com/aws/aws-sdk-go/service/s3/s3manager/upload.go:572 +0x11c
 github.com/vend/aconsumer/vendor/github.com/aws/aws-sdk-go/service/s3/s3manager.(*uploader).upload(0xc42065e120, 0x0, 0x0, 0x0)
 	/home/travis/gopath/src/github.com/vend/aconsumer/vendor/github.com/aws/aws-sdk-go/service/s3/s3manager/upload.go:463 +0x5b0
 github.com/vend/aconsumer/vendor/github.com/aws/aws-sdk-go/service/s3/s3manager.Uploader.UploadWithContext(0x500000, 0x5, 0x0, 0x2710, 0xb10aa0, 0xc4201f80f8, 0x0, 0x0, 0x0, 0x7f7e81acc180, ...)
 	/home/travis/gopath/src/github.com/vend/aconsumer/vendor/github.com/aws/aws-sdk-go/service/s3/s3manager/upload.go:379 +0x192
@luluzhao
Copy link
Contributor

Hello @NZKoz , thank you for contacting us regarding your issue. Yes, we should check whether the segment is nil or not in xRayBeforeValidateHandler and this PR 52 should be able to fix the issue you had. Please let me know if it works or not. If the PR still doesn't work for you, feel free to attach any related code snippet from your example so that we could better help you regarding your issue.

@NZKoz
Copy link
Author

NZKoz commented Apr 26, 2018

@luluzhao that fixed one of the cases, but I'm guessing you need a similar fast-return from endSubsegment.

[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x8e8b86]

goroutine 1 [running]:
github.com/vend/slog/vendor/github.com/aws/aws-xray-sdk-go/xray.(*Segment).Close(0x0, 0x0, 0x0)
	/home/travis/gopath/src/github.com/vend/slog/vendor/github.com/aws/aws-xray-sdk-go/xray/segment.go:131 +0x26
github.com/vend/slog/vendor/github.com/aws/aws-xray-sdk-go/xray.endSubsegment(0xc420043400)
	/home/travis/gopath/src/github.com/vend/slog/vendor/github.com/aws/aws-xray-sdk-go/xray/aws.go:45 +0x8e
github.com/vend/slog/vendor/github.com/aws/aws-xray-sdk-go/xray.glob..func2(0xc420043400)
	/home/travis/gopath/src/github.com/vend/slog/vendor/github.com/aws/aws-xray-sdk-go/xray/aws.go:67 +0x2b

Is the new crash, the fix for that is:

diff --git a/vendor/github.com/aws/aws-xray-sdk-go/xray/aws.go b/vendor/github.com/aws/aws-xray-sdk-go/xray/aws.go
index b290a4a..3018b0c 100644
--- a/vendor/github.com/aws/aws-xray-sdk-go/xray/aws.go
+++ b/vendor/github.com/aws/aws-xray-sdk-go/xray/aws.go
@@ -42,6 +42,9 @@ func beginSubsegment(r *request.Request, name string) {
 
 func endSubsegment(r *request.Request) {
        seg := GetSegment(r.HTTPRequest.Context())
+       if seg == nil {
+               return
+       }
        seg.Close(r.Error)
        r.HTTPRequest = r.HTTPRequest.WithContext(context.WithValue(r.HTTPRequest.Context(), ContextKey, seg.parent))
 }

If I add another nil check to endSubsegment it falls down again in NewClientTrace

diff --git a/vendor/github.com/aws/aws-xray-sdk-go/xray/aws.go b/vendor/github.com/aws/aws-xray-sdk-go/xray/aws.go
index ec98208..053c702 100644
--- a/vendor/github.com/aws/aws-xray-sdk-go/xray/aws.go
+++ b/vendor/github.com/aws/aws-xray-sdk-go/xray/aws.go
@@ -77,7 +77,11 @@ var xRayAfterBuildHandler = request.NamedHandler{
 var xRayBeforeSignHandler = request.NamedHandler{
        Name: "XRayBeforeSignHandler",
        Fn: func(r *request.Request) {
-               ctx, _ := BeginSubsegment(r.HTTPRequest.Context(), "attempt")
+               ctx, seg := BeginSubsegment(r.HTTPRequest.Context(), "attempt")
+
+               if seg == nil {
+                       return
+               }
 
                ct, _ := NewClientTrace(ctx)
                r.HTTPRequest = r.HTTPRequest.WithContext(httptrace.WithClientTrace(ctx, ct.httpTrace))

With those two fixes applies it runs fine, feel free to use those diffs if they're helpful.

@luluzhao
Copy link
Contributor

Hi @NZKoz , thank you for your effort in deep diving this issue and providing the related fix. We'll get this fix out as soon as possible.

@NZKoz
Copy link
Author

NZKoz commented May 14, 2018

Any update @luluzhao, we're running with a manually managed cherry-pick of aws.go and it's slightly error prone 😄

@luluzhao
Copy link
Contributor

@NZKoz, sorry for the late reply. We just released version v1.0.0-rc.5 which contains the fix for this issue and here is our CHANGELOG link https://github.com/aws/aws-xray-sdk-go/blob/master/CHANGELOG.md. Feel free to contact us if you encounter any issues regarding this new version. Thanks.

@leighmcculloch
Copy link

leighmcculloch commented Nov 12, 2018

Similar to the AWS Client, I think it should be possible for BeginSubsegment to be called without a Segment being present. In many cases lower level code may want to specify subsegments within it, but doesn't want to require all callers to be using XRay. At the moment calling BeginSubsegment with a context that doesn't contain a segment results in a panic.

panic: failed to begin subsegment named '<subsegment-name>': segment cannot be found.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants