Pondering the Stack and Globals #88
Comments
In general I think a reasonable approach is indeed to let the compiled program manage its own "user stack". Global variables in a wasm module are indeed kind of special: they cannot alias the rest of the heap. I feel like that's a nice feature. And if web workers are "threads", then those variables are basically a form of thread-local storage, and they are initialized when the module is initialized. Otherwise memory usage should be normal as on other platforms (malloc must be threadsafe, etc.). There might be better approaches, though. I've worried that a user-handled stack like that might have overhead compared to "normal" native compilation, but I've never had an idea as to how to measure that. My hope, though, is that usage of that stack should be fairly rare, as scalarrepl should eliminate stack vars in most cases.
Just to go into a bit more detail:
An alternative is to specifically incorporate the user-defined stack into semantics (e.g., by specially recognizing
OK, let's get weird. What happens when dynamic linking is a thing? Assuming the status quo, shadow globals would be only visible to a specific combination of thread and module. So... does this mean a user stack needs to be allocated for every thread for every module? How does that happen? NxM madness. Or is there a user-level ABI convention where the user stack pointer gets passed across module boundaries? Wrapper functions to hide this? Or is there a way to share shadow globals between modules? (This would require a somewhat de-optimized JS implementation?) If STACKTOP is not a shadow global but lives in user space... how does a shared library find it? (We're back to per-thread initialization for each shared library?) What happens when a shared library wants to create a thread? This means that any user-level thread APIs would also need to be in a shared library? How do shadow globals get initialized on thread creation? There seem to be 3 workable solutions:
I will say that shadow globals are weird. If they are thread and module local, using them seems to cause complications for num_thread > 1 && num_module > 1? A random thought: how much size could be shaved off by eliminating user stack setup and teardown operations? Building things in can reduce size, in general, in theory. With the obvious downsides.
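To make one of the options raised above concrete, here is a rough sketch of the "user-level ABI convention" idea, where the caller passes its user stack pointer across the module boundary so the library never needs a stack pointer of its own. Everything here is hypothetical: the two "modules" are just groups of C functions, `heap` stands in for shared linear memory, and the names `lib_sum_squares`, `app_main`, and `a_stacktop` are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for linear memory shared by both modules. */
enum { HEAP_SIZE = 256 };
static uint8_t heap[HEAP_SIZE];

/* "Module B" (the library): it has no stack pointer of its own.
 * By convention the caller passes its current user stack pointer
 * as the first argument, and B builds its frame starting there. */
static int32_t lib_sum_squares(uint32_t sp, int32_t a, int32_t b) {
    assert(sp <= HEAP_SIZE - 8);          /* room for two int32 slots */
    int32_t aa = a * a, bb = b * b;
    memcpy(&heap[sp], &aa, sizeof aa);    /* slot 0 of B's frame */
    memcpy(&heap[sp + 4], &bb, sizeof bb);/* slot 1 of B's frame */
    int32_t x, y;
    memcpy(&x, &heap[sp], sizeof x);
    memcpy(&y, &heap[sp + 4], sizeof y);
    return x + y;
}

/* "Module A" (the application): owns the per-thread stack pointer. */
static uint32_t a_stacktop = 0;

static int32_t app_main(void) {
    uint32_t saved = a_stacktop;          /* prologue */
    a_stacktop += 16;                     /* A's own frame */
    /* Cross-module call: hand B the first free user-stack address. */
    int32_t r = lib_sum_squares(a_stacktop, 3, 4);
    a_stacktop = saved;                   /* epilogue */
    return r;
}
```

With this convention only one user stack per thread is needed regardless of how many modules are loaded, at the cost of an extra argument on every cross-module call (or wrapper functions that hide it).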
We also have to design something that'll make it possible for wasm to eventually support:
FWIW I think the design will end up having a safe shadow stack and an untrusted stack. The details will be complicated!
Based on past discussions, the expectation was to do (2) (dynamically-linked modules can import/export functions and globals, thread-local or shared). On a side note, I'm not sure "shadow" is the best adjective to describe globals or stacks. At least in my VM experience, a "shadow stack" was a stack maintained in parallel with the native stack to hold, e.g., just the GC pointers. But here you're calling the native stack the shadow stack. Perhaps we could have the "trusted" stack and the "user" (or "heap" or "aliased") stack? For the same reason, "shadow global" doesn't quite make sense; globals aren't aliasable, but they're not a shadow of anything else.
If there are lingering questions, they'll bubble back up in later discussions.
There's no direct question here, just pondering how things fit together.
For the most part, variables will live in the "shadow stack" (the JS stack, for many implementations). Manipulation of this stack is entirely implicit. Do a function call? New frame on the shadow stack. There will be some cases where the shadow stack is insufficient, however. For example, taking the address of a stack-local variable. To support this, a "user space" stack is needed that lives in the visible heap. OK... so how is this user space stack implemented? Is it explicitly compiled into the program? Or is it supported by the system, with special opcodes?
If the user stack is an explicit part of the program, then how do you implement it? I believe Emscripten uses a global variable: STACKTOP. OK, that's kind of weird if you think about it... it's sort of a "shadow global". Can't take the address of it. Spiritually similar to the stack pointer on CPUs. A virtual register that is not confined to a particular stack frame? OK, how do those virtual registers get initialized? For example, when you launch a thread? I suppose they could be parameters passed in to thread initialization. (Although this raises the question of what the system/user interface for thread creation looks like.) Alternatively, you could just store STACKTOP in the user visible heap. Much simpler, doesn't require a separate concept. (The concept of shadow globals may be desirable. It isn't necessary, however.)
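For illustration, here is roughly what "compiled into the program" could look like for an address-taken local. This is a sketch under assumptions, not what Emscripten actually emits: `heap` is a hypothetical stand-in for linear memory, `stacktop` plays the role of STACKTOP (in a real module it would be a non-addressable global rather than a C variable), and the direction of stack growth is arbitrary.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical linear memory; the user stack occupies [STACK_BASE, STACK_LIMIT). */
enum { HEAP_SIZE = 1024, STACK_BASE = 0, STACK_LIMIT = 256 };
static uint8_t heap[HEAP_SIZE];

/* Plays the role of STACKTOP: an offset into heap, never itself stored in heap. */
static uint32_t stacktop = STACK_BASE;

/* Reserve size bytes (rounded up to 8) on the user stack; returns the address. */
static uint32_t stack_alloc(uint32_t size) {
    uint32_t aligned = (size + 7u) & ~7u;
    assert(aligned <= STACK_LIMIT - stacktop);   /* user stack overflow check */
    uint32_t addr = stacktop;
    stacktop += aligned;
    return addr;
}

static void store_i32(uint32_t addr, int32_t v) { memcpy(&heap[addr], &v, sizeof v); }
static int32_t load_i32(uint32_t addr) {
    int32_t v;
    memcpy(&v, &heap[addr], sizeof v);
    return v;
}

/* A callee that receives the address of its caller's local. */
static void increment(uint32_t addr) { store_i32(addr, load_i32(addr) + 1); }

/* Lowered form of: int f(void) { int x = 41; increment(&x); return x; }
 * x must live in the heap because its address escapes. */
static int32_t f(void) {
    uint32_t saved = stacktop;                   /* prologue: remember frame base */
    uint32_t x = stack_alloc(sizeof(int32_t));   /* address of x in the heap */
    store_i32(x, 41);
    increment(x);
    int32_t result = load_i32(x);
    stacktop = saved;                            /* epilogue: pop the frame */
    return result;
}
```

Locals whose address is never taken would stay out of `heap` entirely, so the prologue and epilogue only appear in functions that need them.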
On the other hand, where do you store the thread-local STACKTOP global in user space? Some implementations may lower shadow globals into the heap anyway, so the question of where the globals are allocated is relevant in multiple situations. If memory allocation is an explicit part of the program... how does the allocator know which parts of memory are safe to use vs. grabbed by some lower-level part of the system? Does there need to be a system-level "page allocation" API that user-level memory allocators build on top of?
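As a thought experiment, the layering asked about here might look something like the following. The "system" side is a made-up `sys_alloc_pages` call that hands out whole pages and keeps page 0 for itself (e.g. for globals lowered into the heap); the user-level allocator is a trivial bump allocator that only ever touches pages it was given. All names, the page size, and the failure value are assumptions for the sake of the sketch, and addresses are plain offsets into an imagined linear memory.

```c
#include <assert.h>
#include <stdint.h>

enum { PAGE_BYTES = 4096, MAX_PAGES = 16 };
#define ALLOC_FAIL UINT32_MAX

/* --- hypothetical system side --- */
/* Page 0 is reserved for the system, so user code never sees offsets below PAGE_BYTES. */
static uint32_t pages_in_use = 1;

/* Hand out n contiguous pages; returns the byte offset of the first one, or ALLOC_FAIL. */
static uint32_t sys_alloc_pages(uint32_t n) {
    assert(pages_in_use <= MAX_PAGES);
    if (n > MAX_PAGES - pages_in_use) return ALLOC_FAIL;
    uint32_t base = pages_in_use * PAGE_BYTES;
    pages_in_use += n;
    return base;
}

/* --- user side: bump allocator built only on sys_alloc_pages --- */
static uint32_t bump_next = 0;  /* next free byte in the current region */
static uint32_t bump_end = 0;   /* one past the end of the current region */

static uint32_t user_malloc(uint32_t size) {
    if (size > PAGE_BYTES * MAX_PAGES) return ALLOC_FAIL;  /* also avoids overflow below */
    uint32_t aligned = size ? ((size + 7u) & ~7u) : 8u;
    if (aligned > bump_end - bump_next) {
        uint32_t pages = (aligned + PAGE_BYTES - 1) / PAGE_BYTES;
        uint32_t base = sys_alloc_pages(pages);
        if (base == ALLOC_FAIL) return ALLOC_FAIL;
        if (base != bump_end) bump_next = base;  /* not contiguous: start a new region */
        bump_end = base + pages * PAGE_BYTES;
    }
    uint32_t addr = bump_next;
    bump_next += aligned;
    return addr;
}
```

The point of the split is that the user allocator never has to guess which memory is safe: anything it did not get from the page API is off limits by construction.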
JF mentioned an example of split stacks being added to LLVM: http://reviews.llvm.org/D6095