Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

debug: improving stack traces #3558

Merged
merged 7 commits into from
Jun 8, 2018
Merged

Conversation

alyssawilk
Copy link
Contributor

Dumping function names in stack traces where absl::Sybolize is supported.

before: https://pastebin.com/raw/kZihnpSK
after: https://pastebin.com/raw/XcDwcx15

Risk Level: Low (worst case someone gets a borked stack trace and we roll back)
Testing: manual: see pastebin links.
Docs Changes: updated current docs on stack traces
Release Notes: inline.

Signed-off-by: Alyssa Wilk <[email protected]>
@alyssawilk
Copy link
Contributor Author

@zuercher want me to do the same thing on the APPLE path? I'm wary of making changes there since I can't test them.

bazel/README.md Outdated
actions like segfaults. Where supported, stack traces will contain resolved
symbols. On systems where absl::Symbolization is not supported, the stack
traces written in the log or to stderr contain addresses rather than resolved symbols.
The `tools/stack_decode.py` script exists to process the output and do symbol resolution
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we still support stack_decode.py where absl::Symbolization is supported? I think the former gives you line numbers..

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The former didn't work at all for me (despite copy-pasting all the various tags from the instructions) which is why I finally got around to doing it inline.

Are you requesting a mode where we disable it? I could add a flag.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm more curious if you can use both together.. Is the absl:: approach able to extract line numbers when we compile -c dbg? That would totally obviate stack_decode.py.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately the traces I posted were with -c dbg so I'm going to go with "no"

I've already tracked down the guy who made these changes in absl, so I can go ask them if they're planning on enhancements. Meanwhile I've made it so stack decode works as well as it has, which is to say not at all for me, but I'm fairly confident it will no longer regress for those who get useful output from it. Before my PR I'd get something of the form
https://pastebin.com/x1KWJjFg
and with my PR and without the latest push it was like
https://pastebin.com/23KrqEmT
witht he latest push it's back to the former #1 ?? ??:0 output which is as much as I can verify :-/

Copy link
Contributor

@PiotrSikora PiotrSikora Jun 6, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, ./tools/stack_decode.py seems to be broken, and I can't get anything out of it.

@alyssawilk Your "good" output is actually output from absl::Symbolize() that wasn't matched against any regex and left as-is. The ./tools/stack_decode.py output starts with Backtrace (most recent call first) from thread 0:.

@zuercher
Copy link
Member

zuercher commented Jun 6, 2018

@alyssawilk I'm out most of today or I'd pull it and give it a try. If you're ready to merge before I get a chance, go ahead and merge without __APPLE__ and I'll make a separate PR if it works.

ENVOY_LOG(critical, "thr<{}> #{} {} {}", thread_id, stack_trace_[i].idx,
stack_trace_[i].addr, out);
} else {
ENVOY_LOG(critical, "thr<{}> #{} (unknown)", thread_id, stack_trace_[i].idx,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be "thr<{}> #{} {} (unknown)" (missing one {}).

@@ -104,7 +109,14 @@ class BackwardsTrace : Logger::Loggable<Logger::Id::backtrace> {
ENVOY_LOG(critical, "thr<{}> #{} {}: {}", thread_id, stack_trace_[i].idx,
stack_trace_[i].addr, trace.object_function);
#else
ENVOY_LOG(critical, "thr<{}> #{} {}", thread_id, stack_trace_[i].idx, stack_trace_[i].addr);

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: extra empty line.

@@ -104,7 +109,14 @@ class BackwardsTrace : Logger::Loggable<Logger::Id::backtrace> {
ENVOY_LOG(critical, "thr<{}> #{} {}: {}", thread_id, stack_trace_[i].idx,
stack_trace_[i].addr, trace.object_function);
#else
ENVOY_LOG(critical, "thr<{}> #{} {}", thread_id, stack_trace_[i].idx, stack_trace_[i].addr);

if (absl::Symbolize(stack_trace_[i].addr, out, 200)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: use sizeof(out) instead of 200.

ENVOY_LOG(critical, "Backtrace obj<{}> thr<{}> (use tools/stack_decode.py):", obj_name,
char out[200];
ENVOY_LOG(critical,
"Backtrace obj<{}> thr<{}> (If unsymbolized, use tools/stack_decode.py):", obj_name,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While you're here, should we perhaps flip the order from "Backtrace obj<{}> thr<{}> ..." to "Backtrace thr<{}> obj<{}> ..." to match rest of the output?

@PiotrSikora
Copy link
Contributor

@alyssawilk @zuercher I'm going to test if this works on macOS (but it's going to take a while, since I'm on dual-core machine).

@alyssawilk
Copy link
Contributor Author

Looks like macos is already has better stack trace support than I'm adding so I think we're set here.

Signed-off-by: Alyssa Wilk <[email protected]>
@PiotrSikora
Copy link
Contributor

It doesn't work on macOS, it only emits lines with (unknown).

@@ -84,14 +86,17 @@ class BackwardsTrace : Logger::Loggable<Logger::Id::backtrace> {
#ifdef __APPLE__
// The stack_decode.py script uses addr2line which isn't readily available and doesn't seem to
// work when installed.
ENVOY_LOG(critical, "Backtrace obj<{}> thr<{}>:", obj_name, thread_id);
ENVOY_LOG(critical, "Backtrace thr<{}> obj<{}>:", thread_id, obj_name);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You need to update tools/stack_decode.py for that, sorry!

diff --git a/tools/stack_decode.py b/tools/stack_decode.py
index f141d1c1..2edbc63e 100755
--- a/tools/stack_decode.py
+++ b/tools/stack_decode.py
@@ -29,7 +29,7 @@ def decode_stacktrace_log(input_source):
   # bazel-out/local-dbg/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:84]
   backtrace_marker = "\[backtrace\] [^\s]+"
   trace_begin_re = re.compile(
-      "^(.+)%s Backtrace obj<(.+)> thr<(\d+)" % backtrace_marker)
+      "^(.+)%s Backtrace thr<(\d+)> obj<(.+)>" % backtrace_marker)
   stackaddr_re = re.compile("%s thr<(\d+)> #\d+ (0x[0-9a-fA-F]+) " % backtrace_marker)
   new_object_re = re.compile("%s thr<(\d+)> obj<(.+)>$" % backtrace_marker)
   trace_end_re = re.compile("%s end backtrace thread (\d+)" % backtrace_marker)
@@ -42,7 +42,7 @@ def decode_stacktrace_log(input_source):
         return  # EOF
       begin_trace_match = trace_begin_re.search(line)
       if begin_trace_match:
-        log_prefix, objfile, thread_id = begin_trace_match.groups()
+        log_prefix, thread_id, objfile = begin_trace_match.groups()
         traces[thread_id] = Backtrace(log_prefix=log_prefix, obj_list=[])
         traces[thread_id].obj_list.append(
             AddressList(obj_file=objfile, addresses=[]))

@@ -104,7 +109,13 @@ class BackwardsTrace : Logger::Loggable<Logger::Id::backtrace> {
ENVOY_LOG(critical, "thr<{}> #{} {}: {}", thread_id, stack_trace_[i].idx,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: we're emitting "thr<{}> #{} {}: {}" on macOS and ""thr<{}> #{} {} {}" (without :) on Linux.

Consider removing : from macOS output or adding it to both Linux outputs.

@@ -30,7 +30,7 @@ def decode_stacktrace_log(input_source):
backtrace_marker = "\[backtrace\] [^\s]+"
trace_begin_re = re.compile(
"^(.+)%s Backtrace obj<(.+)> thr<(\d+)" % backtrace_marker)
stackaddr_re = re.compile("%s thr<(\d+)> #\d+ (0x[0-9a-fA-F]+)$" % backtrace_marker)
stackaddr_re = re.compile("%s thr<(\d+)> #\d+ (0x[0-9a-fA-F]+) " % backtrace_marker)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: the space at the end is unnecessary here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I never trust regex to be greedy so I'm more comfortable leaving it in

@@ -29,8 +29,8 @@ def decode_stacktrace_log(input_source):
# bazel-out/local-dbg/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:84]
backtrace_marker = "\[backtrace\] [^\s]+"
trace_begin_re = re.compile(
"^(.+)%s Backtrace obj<(.+)> thr<(\d+)" % backtrace_marker)
stackaddr_re = re.compile("%s thr<(\d+)> #\d+ (0x[0-9a-fA-F]+)$" % backtrace_marker)
"^(.+)%s Backtrace thr<(\d+).*obj<(.+)>" % backtrace_marker)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any particular reason for using "thr<(\d+).*obj<(.+)>" instead of "thr<(\d+)> obj<(.+)>"? We know exactly the format that's emitted.

PiotrSikora
PiotrSikora previously approved these changes Jun 7, 2018
Copy link
Contributor

@PiotrSikora PiotrSikora left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One small nit, but LGTM otherwise.

Signed-off-by: Alyssa Wilk <[email protected]>
Copy link
Contributor

@PiotrSikora PiotrSikora left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@mattklein123 mattklein123 merged commit cd05980 into envoyproxy:master Jun 8, 2018
ggreenway added a commit to ggreenway/envoy that referenced this pull request Jun 11, 2018
This fixes an ASAN failure when run in a certain environment, as described by
abseil/abseil-cpp#118. The failure was added by
envoyproxy#3558, which pulled in the same
static initializer as described in the original absl issue linked here.

Signed-off-by: Greg Greenway <[email protected]>
@ggreenway ggreenway mentioned this pull request Jun 11, 2018
ggreenway added a commit that referenced this pull request Jun 12, 2018
This fixes an ASAN failure when run in a certain environment, as described by
abseil/abseil-cpp#118. The failure was added by
#3558, which pulled in the same
static initializer as described in the original absl issue linked here.

Signed-off-by: Greg Greenway <[email protected]>
@alyssawilk alyssawilk deleted the symbolize branch October 10, 2018 19:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants