Add a pure Rust WASI target? #73432

newpavlov · 2020-06-17T05:32:18Z

Currently wasm32-wasi target is tied to the libc implementation for compatibility with C/C++ code. But for pure-Rust applications C environment is unnecessary and causes a certain bloat of resulting binaries.

For example a very simple code like this:

fn main() {
    let mut buf = [0u8; 4];
    unsafe {
        let _ = wasi::random_get(buf.as_mut_ptr(), buf.len());
        let c = wasi::Ciovec { buf: buf.as_mut_ptr(), buf_len: buf.len() };
        let _ = wasi::fd_write(1, &[c]);
    }
}

Gets compiled into a 64 KB binary (after strip), I guess a significant amount of which has to do with correctly setting up and destroying C environment.

Thus it could be beneficial to add a target like wasm32-wasi-rust intended for pure-Rust WASI applications. Most of the preliminary work has been already done and dependence of wasm32-wasi on libc is minimal (allocator and environment functions, abort, exit, __wasilibc_find_relpath), so IIUC addition of such target should be relatively simple. Depending on the rate of WASI evolution, such addition could be postponed until a certain level of stabilization to reduce maintenance burden.

Previously it was proposed in #63676.

cc @alexcrichton @sunfishcode

The text was updated successfully, but these errors were encountered:

alexcrichton · 2020-06-17T16:37:39Z

I would personally see the addition of a duplicate target as a very heavyweight operation that needs to be very well motivated. I don't really know what benefit we stand to gain other than shaving off some binary size, which could also arguably just be construed as a bug to fix in upstream wasi-libc. Are there other reasons than binary size to introduce a new target?

newpavlov · 2020-06-19T06:47:34Z

I am not sure setting up C environment can be called a libc bug.

I guess in addition to smaller sizes, having a pure-Rust target could:

enable additional optimizations (although cross language LTO, should solve this to a certain extent)
make it a bit easier to develop alternative compiler backends for those targets

Also note that there is a general interest in having libc-independent targets if platform allows it (let's call it "ideological reasons"?): rust-lang/rfcs#2610

newpavlov · 2020-06-20T23:56:53Z

Also there is this issue: WebAssembly/WASI#24

I guess it could be solved by tweaking libc, but I think the proposed target probably will be a more natural solution.

jpmartin2 · 2020-06-21T05:08:58Z

wasi-libc actually already has a change to help with WebAssembly/WASI#24 (see WebAssembly/wasi-libc#74), so at this point I think it's just a matter of rust taking advantage of it. And I managed to figure out how to hack the rust compiler to get it to work, involved setting wasi-root in rust's config.toml (to point to a recent build of wasi-libc) and changing the crt object linked against for the wasi target in librustc_target to crt1-reactor.o instead of crt1.o. Then I was able to produce a wasm file using a binary target with #![no_main] and rustflags = ["-C","link-args=--no-entry"] (creating a cdylib didn't work), and got a wasm file with an _initialize export which does the libpreopen setup (this is provided by crt1-reactor.o). I was able to run the wasm file using wasmtime and call other functions that did file io successfully (I was able to get it working in wasmer as well, though wasmer doesn't call _initialize for you like wasmtime does right now, so you have to add that call yourself).

sunfishcode · 2020-06-22T13:29:43Z

The first step here is to optimize Rust to avoid pulling in as much code. Right now with a default cargo init --bin project compiled to wasm32-wasi, Rust is still pulling in the environment-variable code and preopen code, even though neither should be needed just to print "hello world". Fixing these will be a code size win, whether we link with libc or a Rust library providing similar functionality, and whether one uses the new reactor ABI or not.

newpavlov · 2020-06-22T21:37:13Z

@sunfishcode
Maybe I am missing something, but shouldn't a pure-Rust target solve it automatically? IIUC right now a variable containing environmental data gets populated unconditionally on startup as part of setting-up C environment, while for a pure-Rust target it will be done as part of std::env functions (we even could remove the caching altogether), thus if program does not use environmental data, compiler will be able to trivially remove associated code using dead-code elimination pass.

sunfishcode · 2020-06-23T00:38:37Z

WASI libc is structured such that it avoids pulling in the environment variable code if getenv, environ, etc. are not referenced, so it should work automatically for Rust too.

I've now debugged this a little more: the env is called from within std::panicking::default_hook. I briefly tried panic="abort" and using a completely empty fn main() {}, and Rust still linked in its own Rust panic support code, env code, UTF-8 decoding, backtracing, thread management, some fmt things, and other stuff.

alexcrichton · 2020-06-23T13:37:08Z

The source of bloat there is panicking. The runtime thinks it can panic in a few places during fn main(){} (and generally in the internals of println! there's at least one path that LLVM can't prove won't panic). Panicking, as discovered, brings in the fmt machinery. Switching to a pure-Rust target would not solve this (and might make it worse if the Rust code is bigger than the equivalent C!)

I don't think the preopen bits are needed by fn main() {}, though, so maybe it's a bug that's pulled in?

sunfishcode · 2020-06-24T00:25:29Z

Some preopen bits are being pulled in because:

Rustc adds code to the fn main() {} program which calls various functions in libstd
libstd is compiled into a single .o file, so the linker reads the whole thing in
Elsewhere in libstd is a call to __wasilibc_find_relpath.
libstd's call to __wasilibc_find_relpath then causes the linker to read in the .o file from libc.a which defines it
The .o file which defines __wasilibc_find_relpath also contains the constructor __wasilibc_populate_libpreopen
wasm-ld does GC, and notices that nothing actually calls __wasilibc_find_relpath and removes that function
But, it doesn't remove __wasilibc_populate_libpreopen because it doesn't ever DCE constructor functions, once they're pulled in.

alexcrichton · 2020-06-24T14:53:43Z

Ah indeed that makes sense.

That seems unlikely to get fixed (it's just the behavior of the linker), but @sunfishcode perhaps we could refactor the APIs in wasi-libc? Could the population of libpreopen have an explicit entry point which we call from Rust's std::fs functions (the one that calls __wasilibc_find_relpath), and that way we'd avoid pulling in a constructor function?

Basically Rust really has no need for constructor functions, and we'd rather lazily initialize them ourselves, and that way the linker GC should prevent everything from getting pulled in.

sunfishcode · 2020-08-01T13:33:54Z

Refactoring libc do do the libpreopen initialization lazily is tricky because once user code is running, it may have renumbered the file descriptors, which would break the way libpreopen currently discovers them.

I've now submitted https://reviews.llvm.org/D85062 which is a patch to wasm-ld which allows it to DCE the __wasilibc_populate_libpreopen constructor when it isn't needed. [edit: link to the right patch]

coolreader18 · 2020-12-13T00:20:15Z

I'm looking into adding support for reactor binaries, since it seems pretty simple -- use crt1-reactor instead of crt1 and set the entry to _initialize instead of _start, based off this code from clang. Would it make sense to add a variant LinkOutputKind::WasmReactor, since that's what currently handles which crt to use? Or maybe repurpose an existing LinkOutputKind variant to mean "reactor" on wasi?

coolreader18 · 2020-12-13T00:46:17Z

Or if we do need to change the entry symbol, maybe it should be a separate target entirely? wasm32-wasi_reactor or something? Cause it looks like llvm args are per-target, so if we need to pass -e like x86_64_fortanix_unknown_sgx.rs, it might make sense to be an entirely separate target rather than try to hack around it, and then all the *-dylib LinkOutputKinds can map to crt1-reactor like all the *-exe kinds map to crt1.o on wasi.

coolreader18 · 2020-12-13T00:53:12Z

Or does LinkOutputKind have the chance to change llvm args at some point? That seems like it should happen.

coolreader18 · 2020-12-13T01:13:13Z

Ok, nevermind, the pre_link_args are args to the linker, duh. I think I'll just repurpose cdylib to put an _initialize func in; I don't think there's much sense in making a wasm32-wasi module that can't use wasi functions, so I don't think changing that behavior is really a big deal.

…ichton Emit a reactor for cdylib target on wasi Fixes rust-lang#79199, and relevant to rust-lang#73432 Implements wasi reactors, as described in WebAssembly/WASI#13 and [`design/application-abi.md`](https://github.com/WebAssembly/WASI/blob/master/design/application-abi.md) Empty `lib.rs`, `lib.crate-type = ["cdylib"]`: ```shell $ cargo +reactor build --release --target wasm32-wasi Compiling wasm-reactor v0.1.0 (/home/coolreader18/wasm-reactor) Finished release [optimized] target(s) in 0.08s $ wasm-dis target/wasm32-wasi/release/wasm_reactor.wasm >reactor.wat ``` `reactor.wat`: ```wat (module (type $none_=>_none (func)) (type $i32_=>_none (func (param i32))) (type $i32_i32_=>_i32 (func (param i32 i32) (result i32))) (type $i32_=>_i32 (func (param i32) (result i32))) (type $i32_i32_i32_=>_i32 (func (param i32 i32 i32) (result i32))) (import "wasi_snapshot_preview1" "fd_prestat_get" (func $__wasi_fd_prestat_get (param i32 i32) (result i32))) (import "wasi_snapshot_preview1" "fd_prestat_dir_name" (func $__wasi_fd_prestat_dir_name (param i32 i32 i32) (result i32))) (import "wasi_snapshot_preview1" "proc_exit" (func $__wasi_proc_exit (param i32))) (import "wasi_snapshot_preview1" "environ_sizes_get" (func $__wasi_environ_sizes_get (param i32 i32) (result i32))) (import "wasi_snapshot_preview1" "environ_get" (func $__wasi_environ_get (param i32 i32) (result i32))) (memory $0 17) (table $0 1 1 funcref) (global $global$0 (mut i32) (i32.const 1048576)) (global $global$1 i32 (i32.const 1049096)) (global $global$2 i32 (i32.const 1049096)) (export "memory" (memory $0)) (export "_initialize" (func $_initialize)) (export "__data_end" (global $global$1)) (export "__heap_base" (global $global$2)) (func $__wasm_call_ctors (call $__wasilibc_initialize_environ_eagerly) (call $__wasilibc_populate_preopens) ) (func $_initialize (call $__wasm_call_ctors) ) (func $malloc (param $0 i32) (result i32) (call $dlmalloc (local.get $0) ) ) ;; lots of dlmalloc, memset/memcpy, & libpreopen code ) ``` I went with repurposing cdylib because I figured that it doesn't make much sense to have a wasi shared library that can't be initialized, and even if someone was using it adding an `_initialize` export is a very small change.

workingjubilee · 2023-07-21T22:47:11Z

I would personally see the addition of a duplicate target as a very heavyweight operation that needs to be very well motivated.

wasm32-unknown-emscripten
asmjs-unknown-emscripten

and now:

wasm32-wasi-preview1
wasm32-wasi-preview1-threads
wasm32-wasi-preview2

and of course:

x86_64-apple-darwin
x86_64h-apple-darwin

Or indeed any -gnu and -musl and -uclibc targets.

Strictly speaking, these are not duplicates (in one case the duplication IS total, but it is due to a bug), but the variations between them can be very subtle, and it is in many cases still useful to have distinct support for them, especially for more experimental targets. Thus, @newpavlov, if you wish to do the work, and can describe its merits, I believe it will be considered for inclusion (whether it will be accepted is of course another story).

However, the target tier policy regarding adding targets mostly does not require GitHub issues open for them. In a sense, we are not accepting suggestions anymore, we are accepting PRs. As the rest of the conversation seems to be about ancillary bugs which are only secondarily related, and those can be split into new issues, I don't see a convenient "repurpose this issue" angle. Thus I am closing this.

jonas-schievink added C-feature-request Category: A feature request, i.e: not implemented / a PR. O-wasm Target: WASM (WebAssembly), http://webassembly.org/ labels Jun 17, 2020

jpmartin2 mentioned this issue Jun 21, 2020

WASI Libraries WebAssembly/WASI#24

Open

coolreader18 mentioned this issue Dec 13, 2020

Add a -Zwasi-exec-model codegen option for emitting WASI reactors #79997

Merged

This was referenced Jul 6, 2022

Rust/WASI example tetratelabs/wazero#619

Closed

Panics caller on exit error tetratelabs/wazero#673

Merged

workingjubilee closed this as completed Jul 21, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add a pure Rust WASI target? #73432

Add a pure Rust WASI target? #73432

newpavlov commented Jun 17, 2020

alexcrichton commented Jun 17, 2020

newpavlov commented Jun 19, 2020 •

edited

Loading

newpavlov commented Jun 20, 2020

jpmartin2 commented Jun 21, 2020

sunfishcode commented Jun 22, 2020

newpavlov commented Jun 22, 2020 •

edited

Loading

sunfishcode commented Jun 23, 2020

alexcrichton commented Jun 23, 2020

sunfishcode commented Jun 24, 2020

alexcrichton commented Jun 24, 2020

sunfishcode commented Aug 1, 2020 •

edited

Loading

coolreader18 commented Dec 13, 2020

coolreader18 commented Dec 13, 2020

coolreader18 commented Dec 13, 2020

coolreader18 commented Dec 13, 2020

workingjubilee commented Jul 21, 2023

Add a pure Rust WASI target? #73432

Add a pure Rust WASI target? #73432

Comments

newpavlov commented Jun 17, 2020

alexcrichton commented Jun 17, 2020

newpavlov commented Jun 19, 2020 • edited Loading

newpavlov commented Jun 20, 2020

jpmartin2 commented Jun 21, 2020

sunfishcode commented Jun 22, 2020

newpavlov commented Jun 22, 2020 • edited Loading

sunfishcode commented Jun 23, 2020

alexcrichton commented Jun 23, 2020

sunfishcode commented Jun 24, 2020

alexcrichton commented Jun 24, 2020

sunfishcode commented Aug 1, 2020 • edited Loading

coolreader18 commented Dec 13, 2020

coolreader18 commented Dec 13, 2020

coolreader18 commented Dec 13, 2020

coolreader18 commented Dec 13, 2020

workingjubilee commented Jul 21, 2023

newpavlov commented Jun 19, 2020 •

edited

Loading

newpavlov commented Jun 22, 2020 •

edited

Loading

sunfishcode commented Aug 1, 2020 •

edited

Loading