Symbol name demangling
DubRE comes with a partial MSVC name undecorator necessary for data classification. It supports C/C++ PDBs.
Additional tools will be needed to support other compilers' mangling schemes, e.g. rustc's signatures (_ZN*, example below):
_ZN9hashbrown3raw15Bucket$LT$T$GT$6as_ref17h99f7f5cc4c79175cE
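To illustrate why a separate tool is needed, here is a minimal sketch that splits the example above into its length-prefixed components. It assumes the symbol follows the legacy _ZN...E layout (a run of decimal lengths, each followed by that many characters) and is not part of DubRE; the function name is hypothetical. Escapes such as $LT$ and $GT$ (standing for < and >) are left untouched, and the trailing h-prefixed component is the hash.

```python
def split_length_prefixed(symbol: str) -> list[str]:
    """Split a _ZN...E symbol into its length-prefixed path components."""
    if not (symbol.startswith("_ZN") and symbol.endswith("E")):
        raise ValueError(f"not a _ZN...E symbol: {symbol!r}")
    body = symbol[3:-1]  # strip the _ZN prefix and the closing E
    parts = []
    pos = 0
    while pos < len(body):
        # Read the decimal length that precedes every component.
        digits_end = pos
        while digits_end < len(body) and body[digits_end] in "0123456789":
            digits_end += 1
        if digits_end == pos:
            raise ValueError(f"expected a length at offset {pos}")
        length = int(body[pos:digits_end])
        end = digits_end + length
        if end > len(body):
            raise ValueError("component runs past the end of the symbol")
        parts.append(body[digits_end:end])
        pos = end
    return parts


print(split_length_prefixed(
    "_ZN9hashbrown3raw15Bucket$LT$T$GT$6as_ref17h99f7f5cc4c79175cE"
))
```

For the sample above, the second-to-last component (as_ref) is the function name and the preceding ones form the path.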
Below is a loose description of the approach for sanitizing function name samples extracted from PDB.
Glossary:
- drop means 'truncate a part of the string'
- ignore means 'exclude from sample set'
Reference:
https://en.wikiversity.org/wiki/Visual_C%2B%2B_name_mangling
Use Microsoft's undname.exe:
undname <mangled name>
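For batch processing, the call above can be wrapped in a small helper. This is a sketch only: it assumes undname is on PATH and that it prints the result on a line of the form is :- "<undecorated name>"; the helper names are hypothetical and the output format should be checked against the installed version.

```python
import re
import subprocess
from typing import Optional

# Assumed shape of the result line printed by undname: is :- "<undecorated name>"
_RESULT_LINE = re.compile(r'^is :- "(.*)"\s*$', re.MULTILINE)


def parse_undname_output(output: str) -> Optional[str]:
    """Return the undecorated name from undname's output, or None if absent."""
    match = _RESULT_LINE.search(output)
    return match.group(1) if match else None


def undecorate(mangled: str, exe: str = "undname") -> Optional[str]:
    """Run `undname <mangled name>` and return the undecorated name."""
    # An argument list (no shell) keeps ?, @ and $ from being interpreted.
    result = subprocess.run(
        [exe, mangled], capture_output=True, text=True, check=True
    )
    return parse_undname_output(result.stdout)
```

Keeping the parsing separate from the process call makes it possible to test the parser on captured output without a Windows toolchain.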
Only extract the actual name (the identifier describing the action) and easy-to-demangle namespaces (no templating, nesting or exotic operators); apply dedicated name structuring for constructors, destructors and operators.
The PDB function set is deduplicated on unique names to prevent data-related biases and to reduce the dataset before labelling.
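The deduplication step could look like the following sketch. It assumes the names are already sanitized strings and that first-seen order should be preserved; the function name is hypothetical.

```python
def deduplicate(names):
    """Return the names with duplicates removed, keeping first-seen order."""
    # dict preserves insertion order, so the first occurrence of each name wins.
    return list(dict.fromkeys(names))


print(deduplicate(["as_ref", "push_back", "as_ref"]))
```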
<name>
<name>XX@@<...>
Example: __sse2_cosf4@@16
Remarks:
- right-find last :: and take the string after (check if not a const-/destructor or operator)
- drop generics template part(s)
- ignore if contains <lambda_
- ignore if contains ` or ' (dynamic initializer for, dynamic atexit destructor, dtor$x)
- ignore if starts with ??$?
- try reading namespaces from qualification
- drop the remaining part at first @@ or any other special character sequence matching X@
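The remarks above can be sketched as a single function. This is an illustration under stated assumptions, not DubRE's implementation: the order of the steps is a guess, only the @@ cut is covered (not the X@ variant), and the constructor/destructor/operator check is left to the caller. All names are hypothetical.

```python
from typing import List, Optional, Tuple

IGNORE_IF_CONTAINS = ("<lambda_", "`", "'")
IGNORE_IF_STARTS_WITH = ("??$?",)


def drop_template_parts(name: str) -> str:
    """Remove every balanced <...> group, including nested ones."""
    kept = []
    depth = 0
    for ch in name:
        if ch == "<":
            depth += 1
        elif ch == ">" and depth > 0:
            depth -= 1
        elif depth == 0:
            kept.append(ch)
    return "".join(kept)


def sanitize(sample: str) -> Optional[Tuple[List[str], str]]:
    """Return (namespaces, name) for a sample, or None if it is ignored."""
    if sample.startswith(IGNORE_IF_STARTS_WITH):
        return None
    if any(marker in sample for marker in IGNORE_IF_CONTAINS):
        return None
    # Drop everything from the first @@ onwards.
    sample = sample.split("@@", 1)[0]
    # Templates go first: their arguments may contain :: themselves.
    sample = drop_template_parts(sample)
    # Splitting on :: and taking the last part equals a right-find of ::.
    *namespaces, name = sample.split("::")
    if not name:
        return None
    return namespaces, name


print(sanitize("std::vector<ns::Item>::push_back"))
print(sanitize("__sse2_cosf4@@16"))
```

Operator names containing < or > (for example operator<) would be mangled by drop_template_parts, which is one reason the operator check has to happen before this step.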
?<name>@<...>
Used for:
- constructors
- destructors
- operators
??<code><class>@<...>
??<code>?$<class>@<...>
Complete code reference:
https://en.wikiversity.org/wiki/Visual_C%2B%2B_name_mangling#Special_Name
Applicable code ranges:
- 0 - Z (except for B, other exceptions could happen)
- _0 - _6
- _U and _V
Remarks:
- do not confuse with nested names
- ignore if starts with ??<code>@
- ignore if contains <lambda_
- ignore if contains <unnamed-type-
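A sketch of how the two special-name patterns, the applicable code ranges and the remarks above might be combined into one regular expression. It assumes that "0 - Z" means the digits 0-9 plus the letters A-Z (without B); the function and constant names are hypothetical, and the mangled names in the usage lines are made up for illustration.

```python
import re
from typing import Optional, Tuple

# Codes: 0-9 and A-Z without B (assumed reading of "0 - Z"), _0-_6, _U, _V.
SPECIAL_NAME = re.compile(
    r"^\?\?"                         # special names start with ??
    r"(?P<code>_[0-6UV]|[0-9AC-Z])"  # applicable code ranges
    r"(?:\?\$)?"                     # optional ?$ marking a templated class
    r"(?P<cls>[^@]+)@"               # class name; empty (??<code>@) is rejected
)
IGNORE_IF_CONTAINS = ("<lambda_", "<unnamed-type-")


def parse_special_name(mangled: str) -> Optional[Tuple[str, str]]:
    """Return (code, class) for a special name, or None if it is ignored."""
    if any(marker in mangled for marker in IGNORE_IF_CONTAINS):
        return None
    match = SPECIAL_NAME.match(mangled)
    if match is None:
        return None
    return match.group("code"), match.group("cls")


print(parse_special_name("??0Foo@@QEAA@XZ"))
print(parse_special_name("??0?$vector@H@std@@QEAA@XZ"))
```

Anchoring on the leading ?? keeps nested names (which start with a single ?) and ??$ template functions out of this branch.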
?<nested name>@??<name>@<...>
Remarks:
- use nested name (the last one)
- drop the previous name(s)
??$<name>@<...>
Remarks:
- ??$EPA@: ignore (repetitive lambda thing)
Examples of excluded special name patterns found in PDB:
- ??$DispatchCheckVerify@XV<lambda_7e1e494c5b76f8d74a63bab89734c283>@@$$V@@YAX$$QEAV<lambda_7e1e494c5b76f8d74a63bab89734c283>@@@Z
- ??0<unnamed-type-PipelineState>@FD3D12StateCacheBase@@QEAA@XZ
- dynamic initializer for 'FCompositeBuffer::Null'' (` omitted)
- oo2net::OodleNetwork1UDP_MeasureTotalCompressedSize'::1'::dtor$0 (` omitted)
- __scrt_is_nonwritable_in_current_image$filt$0