Symbol name demangling

michal-kapala edited this page Jul 7, 2023 · 4 revisions

Demangler

DubRE comes with a partial MSVC name undecorator necessary for data classification. It supports C/C++ PDBs.

Other compilers' mangling schemes require additional tools, e.g. rustc's legacy symbol names (prefixed with _ZN, example below):

_ZN9hashbrown3raw15Bucket$LT$T$GT$6as_ref17h99f7f5cc4c79175cE
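To illustrate why a separate tool is needed, here is a minimal sketch that splits such a symbol into its components. It is not part of DubRE; the function name `split_legacy_rust_symbol` is hypothetical. It assumes the symbol is a plain `_ZN...E` wrapper around length-prefixed identifiers and does not decode escapes such as `$LT$` or strip the trailing hash component.

```python
import re
from typing import List


def split_legacy_rust_symbol(symbol: str) -> List[str]:
    """Split a `_ZN...E` symbol into its length-prefixed components."""
    if not (symbol.startswith("_ZN") and symbol.endswith("E")):
        raise ValueError("not a _ZN...E symbol")
    body = symbol[3:-1]  # strip the "_ZN" prefix and the closing "E"
    parts = []
    pos = 0
    while pos < len(body):
        # Each component is a decimal length followed by that many characters.
        match = re.match(r"\d+", body[pos:])
        if match is None:
            raise ValueError(f"expected a length prefix at offset {pos}")
        start = pos + match.end()
        end = start + int(match.group())
        if end > len(body):
            raise ValueError("length prefix runs past the end of the symbol")
        parts.append(body[start:end])
        pos = end
    return parts


if __name__ == "__main__":
    sample = "_ZN9hashbrown3raw15Bucket$LT$T$GT$6as_ref17h99f7f5cc4c79175cE"
    print("::".join(split_legacy_rust_symbol(sample)))
```

The layout is unrelated to the MSVC `?name@scope@@` form described below, so none of the rules on this page carry over.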

Below is a loose description of the approach for sanitizing function name samples extracted from PDB.

Glossary:

  • drop means 'truncate a part of the string'
  • ignore means 'exclude from sample set'

Reference:

https://en.wikiversity.org/wiki/Visual_C%2B%2B_name_mangling

Manual

Use Microsoft's undname.exe:

undname <mangled name>

Automated

Only the actual name (the identifier describing the action) and easy-to-demangle namespaces are extracted; templated, nested and exotic-operator qualifications are skipped. Constructors, destructors and operators get structured names.

The PDB function set is deduplicated by name to prevent data-related biases and to reduce the dataset size before labelling.
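A minimal sketch of the deduplication step, assuming the extracted names are available as a plain list of strings (the function name `deduplicate_names` is hypothetical and not taken from DubRE):

```python
from typing import Iterable, List


def deduplicate_names(names: Iterable[str]) -> List[str]:
    """Keep the first occurrence of each name, preserving input order."""
    # dict keys are unique and keep insertion order.
    return list(dict.fromkeys(names))


if __name__ == "__main__":
    print(deduplicate_names(["Tick", "Init", "Tick", "Shutdown", "Init"]))
```

Preserving the input order is a design choice for reproducible datasets; a plain `set` would also remove duplicates but would not keep the order.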

Unmangled

<name>
<name>XX@@<...>
Example: __sse2_cosf4@@16

Remarks:

  • find the last :: (searching from the right) and take the string after it, after checking that it is not a constructor, destructor or operator

  • drop template argument part(s)

  • ignore if contains <lambda_

  • ignore if contains ` or '

    • dynamic initializer for
    • dynamic atexit destructor
    • dtor$x
  • ignore if starts with ??$?

  • try reading namespaces from qualification

  • drop everything from the first @@, or from any other special character sequence matching X@
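The remarks above can be sketched roughly as follows. This is an illustration under assumptions, not DubRE's actual code: `strip_templates` and `sanitize_unmangled` are hypothetical names, operators are simply skipped, only the `@@` truncation is implemented (other X@ sequences are not enumerated here), and the returned `kind` label is an invented convenience.

```python
import re
from typing import List, Optional, Tuple


def strip_templates(text: str) -> str:
    """Remove every <...> group, including nested ones."""
    out = []
    depth = 0
    for ch in text:
        if ch == "<":
            depth += 1
        elif ch == ">":
            depth = max(depth - 1, 0)
        elif depth == 0:
            out.append(ch)
    return "".join(out)


def sanitize_unmangled(sample: str) -> Optional[Tuple[List[str], str, str]]:
    """Return (namespaces, name, kind), or None if the sample is ignored."""
    # Ignore rules: lambdas, backtick/quote constructs, the ??$? prefix.
    if "<lambda_" in sample or "`" in sample or "'" in sample:
        return None
    if sample.startswith("??$?"):
        return None
    # Drop everything from the first "@@" onwards.
    sample = sample.split("@@", 1)[0]
    # Assumption: operators are left to separate handling.
    if re.search(r"(?:^|::)operator\b", sample):
        return None
    parts = [p for p in strip_templates(sample).split("::") if p]
    if not parts:
        return None
    *namespaces, name = parts
    if name.startswith("~"):
        kind = "destructor"
    elif namespaces and name == namespaces[-1]:
        kind = "constructor"
    else:
        kind = "function"
    return namespaces, name, kind


if __name__ == "__main__":
    for s in ("__sse2_cosf4@@16", "ns::Vec<int>::push_back", "ns::Foo::~Foo"):
        print(s, "->", sanitize_unmangled(s))
```

Templates are stripped before splitting on `::` so that qualified template arguments do not end up in the namespace list.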

Mangled

?<name>@<...>
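A minimal sketch for this form, assuming the qualification ends at the first @@ and that scopes are listed innermost first (the function name `parse_plain_mangled` is hypothetical). Templated scopes, nested names and back-references are not handled and yield `None`.

```python
from typing import List, Optional, Tuple


def parse_plain_mangled(symbol: str) -> Optional[Tuple[List[str], str]]:
    """Return (namespaces, name) for a plain ?<name>@<...> symbol."""
    # A double "??" prefix marks special or templated names, handled elsewhere.
    if not symbol.startswith("?") or symbol.startswith("??"):
        return None
    qualified, sep, _signature = symbol[1:].partition("@@")
    if not sep:
        return None
    name, *scopes = qualified.split("@")
    # Anything containing "?" or "$" is templated or nested: skip it here.
    for part in [name, *scopes]:
        if not part or "?" in part or "$" in part:
            return None
    # Scopes are stored innermost first, so reverse them for readability.
    return list(reversed(scopes)), name


if __name__ == "__main__":
    print(parse_plain_mangled("?foo@Bar@ns@@QEAAXXZ"))
```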

Mangled with special names

Used for:

  • constructors
  • destructors
  • operators

??<code><class>@<...>
??<code>?$<class>@<...>

Complete code reference:

https://en.wikiversity.org/wiki/Visual_C%2B%2B_name_mangling#Special_Name

Applicable code ranges:

  • 0 - Z (except B; other exceptions may exist)
  • _0 - _6
  • _U and _V

Remarks:

  • do not confuse with nested names
  • ignore if starts with ??<code>@
  • ignore if contains <lambda_
  • ignore if contains <unnamed-type-
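The code ranges and remarks above could be checked with a regular expression along these lines. This is a sketch, not DubRE's implementation: `parse_special_name` is a hypothetical name, only codes 0 and 1 are labelled as constructor and destructor, and every other accepted code is reported as a generic "operator".

```python
import re
from typing import Optional, Tuple

# ?? + code (_0-_6, _U, _V, or 0-9/A-Z without B) + optional ?$ + class + @
_SPECIAL = re.compile(r"^\?\?(_[0-6UV]|[0-9AC-Z])(?:\?\$)?([^@]*)@")
_KINDS = {"0": "constructor", "1": "destructor"}


def parse_special_name(symbol: str) -> Optional[Tuple[str, str, str]]:
    """Return (code, kind, class name), or None if the symbol is ignored."""
    if "<lambda_" in symbol or "<unnamed-type-" in symbol:
        return None
    match = _SPECIAL.match(symbol)
    if match is None:
        return None
    code, class_name = match.groups()
    if not class_name:
        return None  # the ??<code>@ form has no class to report
    return code, _KINDS.get(code, "operator"), class_name


if __name__ == "__main__":
    print(parse_special_name("??0Foo@@QEAA@XZ"))
    print(parse_special_name("??1?$vector@H@std@@QEAA@XZ"))
```

Because the code must directly follow `??`, symbols starting with `??$` (template args) or a single `?` (nested names) do not match and fall through to their own handlers.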

Mangled with nested names

?<nested name>@??<name>@<...>

Remarks:

  • use nested name (the last one)
  • drop the previous name(s)
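A possible sketch of the nested-name rule, where the nested name is the first component of the symbol and everything after it is dropped. The helper name `nested_name` is hypothetical, and the optional `?<digit>` scope marker between the nested name and the enclosing `??<name>` is an assumption about how such symbols typically look.

```python
import re
from typing import Optional

# ?<nested name>@ + optional ?<digit> scope marker + ??<enclosing name>...
_NESTED = re.compile(r"^\?([^?@]+)@(?:\?\d)?\?\?")


def nested_name(symbol: str) -> Optional[str]:
    """Return the nested name, dropping the enclosing name(s)."""
    match = _NESTED.match(symbol)
    return match.group(1) if match else None


if __name__ == "__main__":
    print(nested_name("?x@??f@@YAXXZ"))
```

Symbols without an enclosing `??` part return `None`, so they can be passed on to the plain mangled handling.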

Mangled with template args

??$<name>@<...>

Remarks:

  • ??$EPA@ - ignore (repetitive lambda thing)
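A short sketch for this form, assuming the name is the text between `??$` and the first `@` (the function name `parse_template_function` is hypothetical). Besides the `??$EPA@` rule it reuses two ignore rules stated elsewhere on this page: the `??$?` prefix and `<lambda_`.

```python
from typing import Optional


def parse_template_function(symbol: str) -> Optional[str]:
    """Return the function name of a ??$<name>@<...> symbol, or None."""
    if not symbol.startswith("??$") or symbol.startswith("??$?"):
        return None
    if "<lambda_" in symbol:
        return None
    name, sep, _rest = symbol[3:].partition("@")
    if not sep or not name or name == "EPA":
        return None
    return name


if __name__ == "__main__":
    print(parse_template_function("??$make_pair@HH@std@@YA?AU?$pair@HH@0@HH@Z"))
```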

Special cases

Examples of excluded special name patterns found in PDB:

  • ??$DispatchCheckVerify@XV<lambda_7e1e494c5b76f8d74a63bab89734c283>@@$$V@@YAX$$QEAV<lambda_7e1e494c5b76f8d74a63bab89734c283>@@@Z
  • ??0<unnamed-type-PipelineState>@FD3D12StateCacheBase@@QEAA@XZ
  • dynamic initializer for 'FCompositeBuffer::Null'' (backticks omitted)
  • oo2net::OodleNetwork1UDP_MeasureTotalCompressedSize'::1'::dtor$0 (backticks omitted)
  • __scrt_is_nonwritable_in_current_image$filt$0
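The excluded patterns listed above can be collected into a single predicate. A sketch under assumptions: the name `is_excluded` is hypothetical, and the `dtor$<n>` and `$filt$<n>` expressions are generalized from the single examples shown here, so real PDBs may need more patterns.

```python
import re

_EXCLUDED = (
    re.compile(r"<lambda_"),        # lambdas
    re.compile(r"<unnamed-type-"),  # unnamed types
    re.compile(r"[`']"),            # dynamic initializers and similar
    re.compile(r"dtor\$\d+"),       # dtor$x
    re.compile(r"\$filt\$\d+$"),    # assumption: generalized from $filt$0
)


def is_excluded(sample: str) -> bool:
    """Return True if the sample matches any excluded pattern."""
    return any(pattern.search(sample) for pattern in _EXCLUDED)


if __name__ == "__main__":
    print(is_excluded("__scrt_is_nonwritable_in_current_image$filt$0"))
    print(is_excluded("?foo@Bar@@QEAAXXZ"))
```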