DWARF: linkage_name does not include hash, does not match any symbol #46453
Comments
I agree that this should be done. It needs more than just including the hash though, since the DWARF linkage name seems to be missing other stuff (e.g. sanitize).
Is this necessary if rust doesn't have overloading? The DWARF contains the parameters in other DIEs, so it's still available to the debugger.
Sure, I didn't even investigate more complicated examples; as I hinted at the end, I've definitely seen pretty big divergence between symbol name and debug name. A proper solution unites them fully.
Yeah, you're right, this probably isn't as important for rust, or needed at all. I just wanted to illustrate that it should be the unique symbol name, whatever that means for rust; whether it's hash + other stuff, I'm not quite sure. For c++ it'll be the parameters and other stuff, etc., but the point, as above, is that it's the same value in the dwarf linkage name.
See also bug #32925, which is about the same problem.
I agree with this plan; I think that's the intended purpose of the linkage name.
Someone mentioned it somewhere, maybe in irc, but I'll post it here, the dwarf name mangler also doesn't sanitize, and is basically out-of-sync with the actual name mangler. Really it seems like there are two implementations of the name mangler at this point, and to illustrate, here is the linkage_name for https://github.com//m4b/rust/blob/590d1fa7b28776f4f7bba9b759a76f1ab5240f63/src/libcore/fmt/mod.rs#L305 when specialized to
or more pretty printed using ddbug:
I.e., it's not sanitized (the linkage_name contains So incidentally, and I've mentioned this before in But now I'm realizing, once this issue is fixed though I'm thinking in some cases it actually will disappear, and it was just an artifact of the separate dwarf name mangler. I'm realizing now it pops up as an auto-complete item because it's in the dwarf information, but I don't believe can
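For readers unfamiliar with the "sanitize" step mentioned above, here is a rough sketch of what it does to a path component before it becomes part of a symbol. It is an approximation modeled on the escapes that commonly show up in legacy-mangled Rust symbols (`$LT$`, `$GT$`, `$u20$`, `..` for `::`); the function name `sanitize_component` is hypothetical and this is not the compiler's actual code.

```rust
/// Hypothetical helper: escapes one path component roughly the way the
/// legacy rustc symbol mangler did. The escape table is an assumption
/// based on commonly observed symbols, not a copy of the rustc source.
pub fn sanitize_component(s: &str) -> String {
    let mut out = String::new();
    for c in s.chars() {
        match c {
            '<' => out.push_str("$LT$"),
            '>' => out.push_str("$GT$"),
            '(' => out.push_str("$LP$"),
            ')' => out.push_str("$RP$"),
            ',' => out.push_str("$C$"),
            '&' => out.push_str("$RF$"),
            '*' => out.push_str("$BP$"),
            '@' => out.push_str("$SP$"),
            // Path separators and dashes collapse to dots.
            '-' | ':' => out.push('.'),
            // Characters that are already symbol-safe pass through.
            'a'..='z' | 'A'..='Z' | '0'..='9' | '_' | '.' | '$' => out.push(c),
            // Everything else becomes a hex escape, e.g. ' ' -> "$u20$".
            _ => out.push_str(&format!("$u{:x}$", c as u32)),
        }
    }
    // Prefix with '_' when the result does not start like an identifier.
    let starts_like_ident = out
        .chars()
        .next()
        .map_or(false, |c| c == '_' || c.is_ascii_alphabetic());
    if !out.is_empty() && !starts_like_ident {
        out.insert(0, '_');
    }
    out
}
```

A linkage name that skips this step keeps raw characters such as `<`, `>`, and spaces, so it can never match the sanitized name in the symbol table.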
To give some historic context: Back in 2013, when I implemented the initial version of debuginfo support,
Me too. Actually I'd go farther and say I think rust's mangling scheme should change completely, and not be C++-like at all. FWIW gdb's DWARF reader just ignores the mangled name in these cases now:
Is there an issue about removing
Great bug, thanks for all the context. +1 for the proposed @tromey says:
Interesting -- what were you imagining instead? Ignoring formatting nested function pointers and arrays in a C/C++ style (which obviously isn't an issue here), my experience is that all of the complexity with C++ symbols is the substitutions table, which prevents symbols from taking up an inordinate amount of space in artifacts. How would you approach this problem differently? I realize this is slightly off topic, so feel free to reach out to me off-thread.
@sfackler agreed. I think this will actually be "automagically" fixed once the issue is resolved, since for example:
Is mangled into the actual symbol table as:
which is probably what you wanted to see (modulo the symbol hash), yes?
I am interested in different name mangling schemes too and am interested in what you had in mind as well, @tromey @fitzgen please /cc me if you open another thread/whatever :)
Yep!
I don't have a concrete plan; but I think (1) starting Rust symbols with
…woerister Set the dwarf linkage_name to the mangled name

ref #46453

@michaelwoerister or anyone else who knows, i'm not sure if this is the correct instance to pass here (or how to get the correct one precisely): https://github.com//m4b/rust/blob/5a94a48678ec0a20ea6a63a783e63546bf9459b1/src/librustc_trans/debuginfo/namespace.rs#L36

So don't merge this yet, I'd like to learn about correct instance first; however, I think this already fixes a bunch of weirdness i'm seeing debugging from time to time, not to mention backtraces in gdb via `bt` are now ~readable~ meaningful 🎉

E.g.:

new:
```
(gdb) bt
#0 <inline::Foo as core::convert::From<()>>::from () at /home/m4b/tmp/bad_debug/inline.rs:11
#1 0x000055555555a35d in inline::deadbeef () at /home/m4b/tmp/bad_debug/inline.rs:16
#2 0x000055555555a380 in inline::main () at /home/m4b/tmp/bad_debug/inline.rs:20
```
old:
```
(gdb) bt
#0 inline::{{impl}}::from () at /home/m4b/tmp/bad_debug/inline.rs:11
#1 0x000055555555b0ed in inline::deadbeef () at /home/m4b/tmp/bad_debug/inline.rs:16
#2 0x000055555555b120 in inline::main () at /home/m4b/tmp/bad_debug/inline.rs:20
```
Triage: https://github.com/rust-lang/rfcs/blob/master/text/2603-symbol-name-mangling-v2.md was implemented, changing the mangling scheme completely. I'm not sure about the other stuff here though; can anyone give a summary of what would be needed to close this issue?
So this is going to be a long issue, but the gist, to put it semi-dramatically, is that I think all of rust's debugging info might be slightly broken, but workable enough for, say, gdb that it has gone unnoticed. At the very least, I think:
linkage_name
/cc @philipc @fitzgen @tromey @rkruppe @michaelwoerister
Discussion
I've created some test files in rust and c++, and also grepped the binaries for dwarf dies and also symbol table values, which I explain below.
Repro / Test files
and an approximating C++ file:
I've compiled the rust and c++ versions as follows:
I will use the clang output for the c++ examples below, since it shares the same backend infrastructure, though g++ produces the same output.
Analysis
First I will show the dwarf values for `TEST`, `TEST2`, and the function `deadbeef` for the `test` binary, then the `test_cpp_clang` binary.
I would like to direct the reader's attention to the `DW_AT_linkage_name` field, and the corresponding linkage name it shows for each binary.
And now for the cpp version:
The first thing to note is that for the rust `no_mangle` static, `TEST`, it is given a linkage name:
In contrast to the cpp version, which (correctly) has none. I believe the rust DIE that is emitted is outright incorrect, and is the cause of the issue in #33172
Note, although this issue was closed in favor of #32574 that issue does not `no_mangle` the static.
Unfortunately, that issue also noted (but did not seem to pursue further):
which I believe may be the crux of the major problem at large here: the `linkage_name` on all non-mangled Rust DWARF DIEs references a non-existent symbol - I think this is at best highly unusual, and at worst problematic.
Missing Symbols
Considering only ELF at the moment, we can verify that for `TEST`, `TEST2`, and `deadbeef`, there is no symbol referenced by the `linkage_name` on the DIE:
You will note that the symbols include the symbol hash.
In contrast with the cpp version, the symbol name (including parameter types, see the `v` (for void) in `_ZN4test8deadbeefEv`) is identical to the linkage_name, as I think, expected:
Debuggers
I would now like to present a debugging session (primarily in `gdb`) for the two binaries to attempt to illustrate some of the oddities that occur, and motivate why I think there is something wrong here, at the very least.
There are general ergonomic issues and other oddities that I think are surfacing because of the current debug info situation. I have used `gdb` to illustrate, as `lldb` is essentially non-functioning for me, and when it doesn't segfault, I cannot break on un-mangled names for the rust binaries.
GDB
First the rust binary:
The last two are particularly troubling. This certainly looks like a direct consequence of mapping the unmangled, symbol name + hash, in the ELF symbol table to the mangled-without-hash linkage name in the DWARF DIE.
If we compare this to the exact same c++ debugging sequence, it is exactly what one would expect:
LLDB
As of lldb with llvm 5.0.0 I cannot even auto-complete or break on non-mangled names, which further increases my suspicion that something is wrong. Equally suspicious is the fact that it occasionally segfaults when I auto-complete using `test::` as a seed, with a stack trace indicating it's in the dwarf DIE parsing logic:
Solutions
I believe 1. is the first that should be attempted, but I am not an expert in the compiler internals.
Eventually, I think that the final, correct solution imho is to exactly mirror the dwarf output of both the gcc and clang backends, which means:
Final Considerations
I believe I've also found some weirdness w.r.t. other symbol names, specifically:
examples of which are:
at lines:
0.0
appended to a mangled name is not a valid mangled name afaik), respectively.
But I think this is another story for another time ;)