Name Mangling
inNative represents all functions using standard C function names. While some internal functions generated by the compiler use non-alphanumeric characters that cannot be used from C, all WebAssembly functions use a simple, standard scheme that encodes the module a function belongs to:
[module]_WASM_[function]
If there is no module, the function name itself is used. All functions from all WebAssembly modules are exported in this form and can be accessed from C using the mangled name. Conversely, any C function declared in this form in an embedding environment can be imported into any WebAssembly module compiled with that environment. To demonstrate, here is an example of a C function pretending to be in the module "engine", which itself calls a WebAssembly function exported from the "plugin" module.
int32_t plugin_WASM_GenerateID();
int32_t engine_WASM_CreateEntity(float x, float y)
{
  int32_t id = plugin_WASM_GenerateID();
  _entities.push_back(new Entity{id, x, y});
  return id;
}
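The naming scheme described above can be sketched as a small helper. This is only an illustration: `mangle_simple` is a hypothetical name, not part of the inNative API, and it assumes names that need no special-character encoding.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not an inNative function): joins a module name and a
   function name as [module]_WASM_[function]. If the module name is NULL or
   empty, the function name is used by itself. Returns the length of the full
   mangled name, like snprintf, so the caller can detect truncation. */
static int mangle_simple(const char *module, const char *function,
                         char *out, size_t out_size)
{
    assert(function != NULL && out != NULL && out_size > 0);
    if (module == NULL || strlen(module) == 0)
        return snprintf(out, out_size, "%s", function);
    return snprintf(out, out_size, "%s_WASM_%s", module, function);
}
```

Calling `mangle_simple("plugin", "GenerateID", buf, sizeof buf)` fills `buf` with `plugin_WASM_GenerateID`, while an empty module name leaves the function name unchanged.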
Name mangling for simple function names is easy, but WebAssembly allows arbitrary UTF-8 strings as module and function names when importing or exporting. Such names can't be called from C, but in principle they could be encoded directly into the binary. In practice, linkers on most operating systems make assumptions about which characters appear in a function name, and they really don't like it when a null terminator is in the string, even though this is valid WebAssembly. So, inNative has special handling for control characters and certain punctuation.
- Spaces are encoded as `--`, so "My Function" belonging to "My Module" would look like this: `My--Module_WASM_My--Function`
- The punctuation characters `:` `=` `/` `"` `,` `@` are encoded as their hexadecimal ASCII values, so `My:strange=function@` in `My,Module` would look like: `My#2CModule_WASM_My#3Astrange#3Dfunction#40`
- All control characters (0-31) and high ASCII characters (127-255) are encoded as their hexadecimal ASCII values, just like punctuation, so a function whose name is just a null terminator would be encoded as `#00`

WebAssembly requires all strings to be valid UTF-8, which means every non-ASCII Unicode character manifests as a sequence of bytes in the range 128-255. As a result, Unicode characters are simply encoded as hexadecimal values, one byte at a time.
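The three encoding rules above can be sketched in C as follows. This is a minimal illustration, not inNative's actual implementation: `encode_name` is a hypothetical helper, it assumes every character not mentioned above (including `#` and `-`) passes through unchanged, and it takes an explicit length because names may contain embedded null bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not an inNative function): encodes `len` bytes of
   `name` into `out` using the rules described above. Returns the number of
   characters written, excluding the terminator, or -1 if `out` is too small. */
static int encode_name(const char *name, size_t len, char *out, size_t out_size)
{
    static const char hex[] = "0123456789ABCDEF";
    size_t pos = 0;

    assert(name != NULL && out != NULL && out_size > 0);
    for (size_t i = 0; i < len; ++i) {
        unsigned char c = (unsigned char)name[i];
        /* Control bytes, bytes 127-255, and the listed punctuation get #XX.
           strchr is only reached for bytes in 32..126, never for a null. */
        int needs_hex = c < 32 || c >= 127 || strchr(":=/\",@", c) != NULL;
        size_t need = (c == ' ') ? 2 : (needs_hex ? 3 : 1);

        if (pos + need >= out_size) /* always keep room for the terminator */
            return -1;
        if (c == ' ') {
            out[pos++] = '-';
            out[pos++] = '-';
        } else if (needs_hex) {
            out[pos++] = '#';
            out[pos++] = hex[c >> 4];
            out[pos++] = hex[c & 0x0F];
        } else {
            out[pos++] = (char)c;
        }
    }
    out[pos] = '\0';
    return (int)pos;
}
```

With this sketch, `My:strange=function@` becomes `My#3Astrange#3Dfunction#40`, a single null byte becomes `#00`, and the two UTF-8 bytes of "é" (0xC3 0xA9) become `#C3#A9`.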
WebAssembly doesn't understand calling conventions, but they are sometimes required to call into certain libraries correctly. The Windows kernel, for example, requires all functions to use the special `__stdcall` calling convention. inNative encodes the calling convention into the module name using a `!` separator. It only performs this check for external C functions; all WebAssembly functions use standard calling conventions even if you use the `!` syntax. The calling convention codes are listed below (capitalization doesn't matter):
- `!C` - `__cdecl` (default)
- `!STD` - `__stdcall`
- `!JS` - JavaScript calling convention
- `!GHC` - Calling convention used by Haskell's GHC compiler
- `!SWIFT` - Calling convention for Swift
- `!HiPE` - Calling convention for HiPE
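As a rough sketch of how the codes in this list could be looked up without regard to capitalization, consider the helper below. The enum, its member names, and `parse_convention` are all hypothetical and do not come from inNative. The function expects the code without the leading `!`, and returning `CONV_UNKNOWN` for an unrecognized code is an assumption, since the text does not say how such codes are handled.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Illustrative enum; these names are assumptions, not inNative identifiers. */
typedef enum {
    CONV_UNKNOWN = -1,
    CONV_C,     /* C     : __cdecl (default) */
    CONV_STD,   /* STD   : __stdcall */
    CONV_JS,    /* JS    : JavaScript calling convention */
    CONV_GHC,   /* GHC   : Haskell's GHC compiler */
    CONV_SWIFT, /* SWIFT : Swift */
    CONV_HIPE   /* HiPE  : HiPE */
} calling_conv;

/* Returns 1 if the two strings match when letter case is ignored. */
static int equals_ignore_case(const char *a, const char *b)
{
    while (*a != '\0' && *b != '\0') {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return 0;
        ++a;
        ++b;
    }
    return *a == *b; /* both strings must end at the same position */
}

/* Maps a code such as "STD" or "std" to its calling convention. */
static calling_conv parse_convention(const char *code)
{
    static const struct { const char *code; calling_conv conv; } table[] = {
        {"C", CONV_C},     {"STD", CONV_STD},     {"JS", CONV_JS},
        {"GHC", CONV_GHC}, {"SWIFT", CONV_SWIFT}, {"HiPE", CONV_HIPE},
    };

    assert(code != NULL);
    for (size_t i = 0; i < sizeof table / sizeof table[0]; ++i)
        if (equals_ignore_case(code, table[i].code))
            return table[i].conv;
    return CONV_UNKNOWN;
}
```

Under these assumptions, `parse_convention("std")` and `parse_convention("STD")` both yield `CONV_STD`.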
Here is an example of declaring Windows functions exposed through the `sys` module.
(module
  (import "sys!STD" "GetStdHandle" (func $getstdhandle (param i32) (result cref)))
  (import "sys!STD" "WriteConsoleA" (func $writeconsole (param cref cref i32 cref i64) (result i32)))
)
Even though the module name here is `sys!STD`, the mangled name still resolves to just `GetStdHandle`. The calling convention is removed first, leaving a module name of `sys`; that name is then recognized as the environment module name, which resolves to a blank module name. If your C functions are accessed with a blank module name, simply put the calling convention by itself. It will still resolve correctly, because the calling convention is stripped from the module name before any linking occurs.
(module
  (import "!STD" "GetStdHandle" (func $getstdhandle (param i32) (result cref)))
  (import "!STD" "WriteConsoleA" (func $writeconsole (param cref cref i32 cref i64) (result i32)))
)
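The stripping step can be illustrated with a short C sketch that separates an import's module name from its calling convention code. `split_module` is a hypothetical helper, not an inNative function. It assumes the first `!` is the separator, and it works on ordinary C strings, so it does not handle module names with embedded null bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not an inNative function): splits "sys!STD" into the
   module name "sys" and the code "STD". If there is no '!', the code is left
   empty. Returns 0 on success, or -1 if either output buffer is too small. */
static int split_module(const char *name, char *module, size_t module_size,
                        char *code, size_t code_size)
{
    assert(name != NULL && module != NULL && code != NULL);

    const char *bang = strchr(name, '!');
    size_t module_len = bang ? (size_t)(bang - name) : strlen(name);
    const char *code_start = bang ? bang + 1 : "";

    if (module_len >= module_size || strlen(code_start) >= code_size)
        return -1;
    memcpy(module, name, module_len);
    module[module_len] = '\0';
    strcpy(code, code_start); /* safe: length was checked above */
    return 0;
}
```

For `sys!STD` this yields the module `sys` and the code `STD`; for `!STD` the module comes back empty, which matches the blank module name case described above.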