Postpone adding globals until dynamic linking #154
Comments
"globals declare immutable pointers (i.e., integers)" sounds an awful lot like a relocation! This proposal sgtm. |
How would this work in the polyfill? |
Since this would only come with dynamic linking:
|
Sorry, I meant, how would existing asm.js code (no linking) be polyfillable, without globals (which are in asm.js)? |
Well, asm.js could still use asm.js. Unless you're asking about asm.js->wasm->asm.js; that seems harder (I can imagine hacky solutions, though, especially if Emscripten were to participate), but that seems ok. |
Was that not the point of your polyfill prototype, asm.js=>wasm=>asm.js? I guess I am surprised to hear that is being given up on? This is the first feature that seems to break that. |
Yeah, that was useful for experimentation purposes, but I don't think it should constrain the overall design. |
It will also be a limitation on using emscripten to emit wasm - as we were intending to go through asm.js in some parts of our pipeline (I discussed details of that at length with @sunfishcode). This would appear to require a complete rethinking of that strategy. I'll think more about that, but in general I think this aspect - breaking asm=>wasm - is something we need to discuss and consider carefully. I was operating under that assumption; removing it would be a big change. |
For Emscripten's case can't you just control codegen not to emit globals (to use heap instead, as is already supported)? |
That is what I am thinking about, sure, but it isn't quite so simple. Not all the globals come from compiled code. |
In the limit, the polyfill can always transform reads/writes to globals to FFI calls that use outside-asm.js storage. I don't think this corner case should influence the much-longer-lived standard, though. |
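As a rough illustration of the FFI fallback mentioned above, here is a minimal sketch in Python. It assumes the polyfill rewrites each global read/write into a call to an imported function whose storage lives outside the heap. The names `make_global_ffi`, `get_global`, `set_global`, `SP` and `alloca` are hypothetical and do not come from any actual polyfill.

```python
def make_global_ffi(initial_values):
    """Return (get_global, set_global) closures backed by storage outside the heap."""
    storage = list(initial_values)  # hypothetical: one slot per global

    def get_global(index):
        return storage[index]

    def set_global(index, value):
        storage[index] = value

    return get_global, set_global


SP = 0  # hypothetical index of the stack pointer global


def alloca(get_global, set_global, size):
    """Toy translated function: every access to sp becomes an FFI call."""
    old_sp = get_global(SP)
    set_global(SP, old_sp + size)
    return old_sp
```

Every access pays for a call across the module boundary, which is why this is described as a non-performant last resort.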
In general I agree the polyfill should not influence long-term plans, and good point that ffis can be a non-performant polyfill, if we need that. Anyhow, getting back to the topic here: I think globals are a very convenient feature for human beings. Just like having the
The stack pointer is an extremely common thing, appearing at the beginning and end of most functions (and also sometimes in the middle). Without globals, our View Source suffers significantly, and in particular regresses significantly compared to asm.js (which is a shame since we win in so many other ways!). And with decreases in source readability come downsides in debuggability, understanding performance profiles, etc. |
With the above proposal, |
I didn't realize that about the proposal, but |
The goal of the text assembly is to accurately reflect the AST; I don't think we should let the tail wag the dog here. Independently, if we can find a performance/semantics argument why we should have a special class of unaliased "registers" (what |
We may not agree on the goal of the text assembly, then. Aside from accurately reflecting the AST, I think another goal is having it be as readable and as debuggable as possible. Again, much like having the Which I think is the issue here. From one side, it's nice for those creating the platform to make it "perfect" and minimalistic in one sense; from another side, the people that are going to use the platform, that will read and debug a lot of code in it, for them it would be nice if the text format were as clear and pleasant as possible. I think my stance here comes from the latter standpoint. Obviously both matter :) Regarding the performance issue, I think that's a very good question. We suspect that in asm.js we have overhead due to using a separate stack, and part of that is due to the stack pointer being just an address in the heap. In theory on a machine with lots of registers and running short method calls (where the stack pointer overhead is highest), seeing that sp cannot be aliased (as is the case with a global) would allow it to be pinned permanently in a register. That sounds intuitively interesting, definitely. I don't have data, but over time Emscripten has optimized our stack pointer code, reducing that overhead as much as possible. I can't estimate how much is left, but I can say that optimizing that code has been important in non-LTO'd programs. |
Regarding the text format, I think you may be alone in that viewpoint. I'd be interested to hear from others on this but I do not think text should motivate features that would not otherwise exist. As for |
If I am alone on this, it might be because almost none of us here represent the userspace developer point of view. But I think it's an important one! :) |
I think globals are important but I don't have enough context to understand why it would be hard for emscripten to use heap instead of JS globals. It sounds like JS globals have many known issues (closure variable limits, etc). Is the problem basically just that it impairs debugging of the generated JavaScript? That is important, but on the other hand we're already way over the cliff and on our way down the waterfall when it comes to ease-of-use and debugging, source maps or no. Do we expect people to be debugging polyfill-generated JS for years to come? I thought one of our primary objectives was to provide a good debugging experience, in which case we can do better than what came before. But I can see how the reality might be that those debugging tools will take years to ship, and in that case maybe we have to compromise our design to produce user-friendly JS on the client. Personally I'd sooner generate getter/setter functions ( |
(The emscripten note was an internal technical issue - for special globals like sp, not general global vars; converting the special ones to heap vars is not trivial, but also not impossible.) I don't think the issue is debugging the generated JS, or the polyfill output, but rather people debugging using the text format of wasm itself. That's a standardized format we're going to have for a long time. In theory we want people to use source maps when they are available, but (1) that's far off, but far more importantly (2) even with source maps, often in my experience you do end up reading the text format (low level hacking, performance profiling, etc. etc.). So I feel strongly that the text format should be as readable and pleasant as possible for people. (The polyfill output will likely not be that readable, and I don't think that's a concern.) |
OK, I see. You're saying that you think it's important for there to be opcodes dedicated to globals, for the purposes of debugging and text format manipulation? I agree with that. |
I don't think it's a good idea to add a semantic feature which adds a whole new memory storage class and has wide ranging specification and implementation concerns (I can tell you from Odin experience this isn't a one-off feature like some random If we want to, we can add a feature to the text language which allows you to write in global notation and for this to be translated by the text-to-binary mapping to the associated That all being said, over the weekend I was thinking that TLS has a stronger performance argument than (shared) globals for having outside-the-heap/unaliased locations. To implement TLS, the engine will somehow (pinned register, segment register (x32), reusing OS TLS ABI) maintain a pointer to an internal, trusted array of thread-local engine state and, for maximum perf, you want the TLS variables (like |
Thinking a bit more about this, if we consider the cartesian product { global, thread-local } x { aliased, unaliased }, the above comments make the case for (global, aliased) (necessary for efficient globals with dynamic linking), (thread-local, aliased) (for efficient aliased Talking more recently with @sunfishcode, he did remind me of a good use case for (global, unaliased): for register-plenty archs (AArch64), it might be a good idea to pin certain unaliased globals to registers. In particular, for single-threaded apps, the heap stack pointer This attractive optimization combined with the general "slightly more optimizable b/c unaliased" argument make me think it's fine to leave globals as-is in the MVP. Closing the issue since I opened it, but feel free to reopen if someone else wants to remove globals from the MVP. |
I actually like this proposal as it is; i.e. make globals pointers into the heap rather than making them what amounts to a special address space (or spaces). It sounds to me like the situation for the MVP is that globals may serve 3 purposes:
Maybe instead of having globals as they are conceived now, what if we make them pointers into the heap (aka relocations) as this proposal suggests, and then additionally have a way to provide hints which designate them as not address-taken. For the MVP, it provides the required way to name objects for export (if the heap is exported, then they can be used directly, otherwise getter/setter thunks could be generated or a separate indirection table or whatever). A dumb implementation can safely ignore the not-address-taken hint, but a smart implementation can pin a location to a register where suitable (If the location is also exported, it can flush when the global is accessed). When we move to dynamic linking we will need a way to export structs and arrays; if we don't want to augment wasm's type system and we don't want to have separate exports for every struct element, then we are essentially down to exporting addresses anyway, perhaps with size designations as in ELF. When we add threads, MVP modules which are not thread-aware can behave as if all of their non-exported non-address-taken globals are thread-local (the use case for SP) and we can design a scheme that augments this one for TLS.
The high-level problem with this scheme is that a not-address-taken declaration of some part of the heap is essentially a promise by the user not to access that region via the heap, and if the user lies, then the result is non-deterministic: either they get whatever was written via the last global reference, or they get whatever was written via the last heap reference, or the initial value. This is more constrained than C-style undefined behavior, but maybe still not what we want in wasm. If we don't like this proposal I still want to highlight that for dynamic linking we will need a way to export aliased objects larger than our wasm types (but not necessarily unaliased objects), and it would also be good to allow users some control over layout for globals to prevent false sharing and group objects that pass between threads. |
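The hazard described above can be modeled with a small Python sketch. It assumes two hypothetical engine strategies: one ignores the not-address-taken hint and always goes through the heap, the other honors it by keeping the value in a simulated register that is never re-read from the heap. The class and function names are illustrative only.

```python
class HeapBackedGlobal:
    """Engine that ignores the hint: every access goes through the heap."""

    def __init__(self, heap, addr):
        self.heap = heap
        self.addr = addr

    def get(self):
        return self.heap[self.addr]

    def set(self, value):
        self.heap[self.addr] = value


class PinnedGlobal:
    """Engine that honors the hint: the value lives in a simulated register."""

    def __init__(self, heap, addr):
        self.reg = heap[addr]  # read the initial value once, then never again

    def get(self):
        return self.reg

    def set(self, value):
        self.reg = value  # the heap copy is not updated


def lying_program(heap, addr, global_var):
    """Writes via the global, then breaks the promise with a raw heap store."""
    global_var.set(1)
    heap[addr] = 2  # access through the heap, despite the hint
    return global_var.get()
```

Under these assumptions the same program observes 2 on the heap-backed engine and 1 on the pinning engine, which is the engine-dependent result the comment warns about.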
Another alternative is to let aliased globals be addresses/relocations (as in this proposal, and perhaps in the future give them sizes to use with dynamic linking), and allow unaliased globals which are also pointers into linear memory for the SP use case. When TLS is added they become thread-local, and globals of this kind could be used for SP and the thread pointer. In the MVP these globals needn't be exportable at all. This scheme would still be pretty good for user-space debuggers and users who want to control layout of globals, and allows optimizations for the most important use cases, while being simpler than having exportable mutable unaliased (and potentially non-thread-local) globals. There was discussion in #337 about whether we need exported mutable globals or not, which would carry over here; this scheme allows that only when the heap itself is exported. I also actually kind of like the opposite idea of letting unaliased globals be arrays; this would allow more interesting uses such as letting users separate things into different address spaces for security reasons, or to allow a more fragmented memory space (making things easier for a VM which lives in an already-crowded browser memory space), or just to take advantage of aliasing information that's better-known than C (Fortran! :) ) |
For the MVP use cases:
I don't think this is necessary until we get to dynamic linking; in the MVP, the compiler knows all the addresses statically and constant-address loads/stores should be fully optimized.
I expect this could be solved in the text format in the same way we've created syntactic name sugar for local/function indices; I don't think this alone would justify a semantic feature addition.
The only real candidate we've seen so far that is hot enough is maybe
This seems difficult to optimize: while it's easy to monitor the loads/stores to constant addresses, it'd be costly to catch the dynamic accesses that happened to access the same location. Also, registers are a precious resource so I'd be surprised if any engines actually wanted to pin anything (rather like the So ultimately, I'm coming back to my original position when filing this issue that the MVP could do just fine without globals and that, since threads and dynamic linking are the only use cases for globals, we should hold off on adding globals until we add those. |
I'm in agreement that we should just punt on globals since there are so many unresolved issues here. I like the idea of having runtime-managed globals outside of the heap, but it seems like a huge can of worms. And for the record, I think that if import/export for globals is implemented it has to be mutable - one reason I'd rather we wait until we can do them right. On the other hand, import/export of mutable globals is much simpler than heap sharing, so maybe that's useful? But it seems like it doesn't really solve anything until we have dynamic linking. |
Reopening, since we seem to be converging back to this :-) |
It seems like we need to split globals into two components: immutable module-scoped bindings (i.e. globals) and mutable memory locations (i.e. variables). A global variable is an immutable pointer to some memory that was allocated by the loader (from the instance address-space or otherwise). If we don't have unaliased variables in the MVP, then we don't need global variables: aliased global variables are just data segments. However, I think we do need immutable module-scoped bindings in the MVP to lay the foundation for dynamic linking: a data segment shouldn't be accessed with a static address, but as an offset from an immutable base address "global". Function "pointers" should also be accessed through such a binding, rather than as static indices for an explicit function table. I think the idea of unaliased global variables is also interesting for things like SP and vtables, but that seems like it's outside the scope of the MVP. |
I'm wary about "laying a foundation for dynamic linking" without actually having dynamic linking implemented; there's a good chance we'll define it slightly wrong. Also, with or without dynamic linking, the main module can position the data segments anywhere it wants, so there is no fundamental need for symbols and thus minimality of the MVP also suggests leaving it out. |
Agreed with @lukewagner's wariness. @dschuff is experimenting with dynamic linking concurrently with us driving to MVP, so it's not like we're totally ignoring it prior to MVP! |
+1 for leaving globals out of the MVP though. |
Yes, and I'm saying that should be changed...
...because I agree that we should try not to put stuff in the MVP that will have to be changed post-MVP, and this will need to be changed sooner or later to support dynamic linking. That it will need to change is uncontroversial, right? We need a way to compile a module so that it can address a data segment that is loaded wherever suits the hosting instance, but the address should be immutable so it can be baked into generated code. Same for function "pointers". |
It would be good to first see the performance of asm.js without the use of globals, across JS implementations. If everyone agrees that TLS will be needed, and that the SP will be in TLS, then why not just define the TLS now for the MVP and have the TLS implemented in asm.js by the asm.js global variables for the MVP? It is quite possible that some MVP code will work in a multi-threaded app in future, but only if the SP is in TLS now. |
Nope: I think the main module (which is all that MVP has) should always be able to absolutely position its data segments (just like on native); it's only dylibs (or whatever we're calling modules that can be dynamically linked) that require dynamically positioned data segments. Concretely, I think this will mean that dylibs have a different type of segment stored in their memory section which doesn't specify a start address and instead declares a global immutable pointer. |
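To make the two kinds of segment concrete, here is a minimal Python sketch of a loader. It assumes a main module whose segments carry an absolute start address and a dylib whose segments carry only a name for an immutable base pointer that the loader binds. `AbsoluteSegment`, `RelocatableSegment`, `load_segments` and `free_base` are hypothetical names, not part of any specified format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AbsoluteSegment:
    start: int   # address chosen by the main module's compiler
    data: bytes


@dataclass(frozen=True)
class RelocatableSegment:
    name: str    # names the immutable base pointer the loader will bind
    data: bytes


def load_segments(heap, segments, free_base):
    """Copy segments into the heap; return the bound base pointers."""
    bases = {}
    cursor = free_base  # hypothetical: first address the loader may hand out
    for seg in segments:
        start = seg.start if isinstance(seg, AbsoluteSegment) else cursor
        end = start + len(seg.data)
        if start < 0 or end > len(heap):
            raise IndexError(f"segment does not fit at {start}")
        heap[start:end] = seg.data
        if isinstance(seg, RelocatableSegment):
            bases[seg.name] = start  # fixed from here on, so code can bake it in
            cursor = end
    return bases
```

The sketch does not check for overlap between absolute and relocated segments; a real loader would need the coordination on linear memory that the thread discusses.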
That works given a distinction between main modules and library modules. But I'm not sure why the distinction of main module is needed as anything other than "module which happened to be loaded into this instance first". A typical use case for WebAssembly seems like the main module will actually be a library of functions that are called by JavaScript, rather than something like an executable with a single entry point. Once there's dynamic linking, should such a module only be usable as the main module of an instance?
IMO this is a legacy of native that doesn't benefit WebAssembly. Once you give guarantees about where data segments will be loaded, you can never take that back. |
@AndrewScheidecker I think the distinction is pretty fundamental as soon as you start thinking about how libc and other shared state would work. It's important to distinguish the use cases for dynamic linking vs. plain MVP modules and imports. In the case of modules and imports, the intention is, like you said, for each module to stand on its own as a collection of functions that can be called from dependent JS or wasm modules; every module in the MVP is a "main" module and thus enables the type of reuse you're talking about. But for dynamic linking, dylibs must carefully coordinate wrt linear memory, ABI (including the shared So either:
So I don't see any downside to starting with absolutely-positioned data segments. |
To be clear, I'm not saying that MVP modules should eventually work flawlessly as dynamically linked modules, just that the MVP should avoid designs that we know will need to change to support threading and dynamic linking. In the MVP, there isn't a main module distinction because every module is a main module. But allowing those modules to give the data segment a static location in memory means that post-MVP that capability must either be removed, or we must distinguish such modules as not dynamically linkable. We know that we eventually need relocatable data segments. It's arguable whether static data segments are useful, and they imply this main module distinction that isn't otherwise implied by anything else in the MVP design. |
Modules that are not specifically compiled to be dynamically linkable are not going to be magically dynamically linkable, no matter what we do with global data sections. If two modules that are dynamically linked into the same instance do not explicitly cooperate on sharing libc (think about the
I think we'll find multiple reasons to make distinctions between main and dylibs when we really dig in (I won't go into it here, but other examples have come up). Regardless, if we did want to avoid any distinction, we could allow all modules to use either static or dynamic data segments. Load-time-linked dylibs could know a static address just as well as a main module (load-time linking is just like a single main module that has been broken up for caching benefits). |
I agree with @AndrewScheidecker that we want main modules to be able to position their data segments more freely, but that discussion should go in #302 :-) @lukewagner makes a point about "old modules" that aren't dynamically linkable, and their non-interaction with "new modules" which are. We can indeed have two worlds that have different capabilities, but I do think that it would be nice to get them to be the same by design! |
I've been thinking recently: one thing I like about globals is that they don't require being contiguous in virtual memory. They stand to significantly reduce virtual memory fragmentation by being easier to move because they're much smaller. That payoff is limited to C/C++ (address-not-taken globals) but may be bigger in other languages (Fortran!). This is pretty important when considering that wasm coexists in the same process as other web things. |
@jfbastien I doubt there would be much memory allocated to the globals, so would not expect them to significantly affect memory fragmentation problems for most code. |
I think we're on the same page: I don't think MVP modules need to be forward compatible with dynamic linking. I was trying to say that non-relocatable data segments imply a distinction between modules that are dynamically linkable and those that are not, and that nothing else in the MVP seems to require that distinction.
I'm not saying that relocatable data segments are the feature needed to make dynamic linking work, only that they're an obvious requirement, and it affects functionality in the MVP. Making sure all the modules in an instance link to the same libc (or whatever language runtime module) is certainly a problem, but I don't think it is dependent on functionality in the MVP.
The bearing on this issue is that if MVP data segments are relocatable they need some module-scoped namespace for immutable values to bind their base address in (similar to what's described in the initial issue). That's a component of global variables, but doesn't require actual variables. Regardless, we can put aliased global variables in data segments for the MVP. Post-MVP we can give data segments a thread-local qualifier to say the runtime should instance them for each thread in TLS. With dynamic linking, we can import/export data segments as untyped variables. Do we need unaliased global variables for the MVP, or after, or never? TBH the stack pointer is the only thing I see that benefits from unaliased global variables, and that may be better implemented as a more specialized state of the runtime. |
@jfbastien Agreed with @JSStats that the savings would be relatively minor in C/C++. Maybe FORTRAN would find a use (esp if we allow array types for globals, as we've discussed already), but I think this line of reasoning puts " @AndrewScheidecker Even after we have full dynamic linking support, some modules will want to absolutely position their global data segments, so starting with only providing absolute offsets makes sense not just in the MVP, but after. Furthermore, I expect absolute addresses will be slightly more efficient when a wasm engine generates relocatable code (using a GOT instead of baking in the address, motivated by machine-code caching/sharing), so apps would be motivated to use absolute offsets when they could (when they were the main module or a load-time-linked module). I don't think we can force modules to be fully relocatable, so we shouldn't try in this one instance, especially when it increases the risks I described earlier that what we specify in MVP will end up being slightly wrong when we do dynamic linking for real. |
@lukewagner said:
Agreed. |
(D'oh, and of course, I meant "unaliased global variables", but I think you understood it as that anyway :) |
This issue will be resolved by #344 (still open for feedback). |
Split off from #139: I was thinking that perhaps globals should be removed from the MVP and instead go in with the dynamic linking feature. The reasoning is that, until we have to worry about dynamic linking, we can simply place globals in the heap and use compiler-chosen static offsets; indeed this is what we'll need to do anyway for any global whose address is taken (and what Emscripten does for most globals anyway, to avoid JS engine number-of-closure-variable limitations). Technically, globals have the advantage that they can't be aliased by loads/stores, but I don't expect this will win much in practice (esp. assuming a C++ compiler has already optimized the code).
Thinking forward to dynamic linking: since we have to deal with aliased globals anyway, it seems simpler and more orthogonal (not requiring a separate set of `LoadGlobal`/`StoreGlobal` ops) to have globals declare immutable pointers (i.e., integers) that would then be used as arguments to plain `LoadHeap`/`StoreHeap` ops (so `LoadGlobal(a)` => `LoadHeap(Global(a))` where `Global(a)` is a const-expr of `int32`/`int64` type). Even trivial backends should have no problem eliminating bounds checks.
With this strategy, the question is of course where the memory pointed to by the global pointers comes from: the engine has no knowledge of how the memory in the [0, sbrk-max) range is being used by `malloc` et al. There is a lot of room for design here, but I think roughly what we need to do is let the application allocate the data and give it to the engine (either directly or by registering a global data allocator with the runtime). The important thing is that we keep allocation (and addresses) deterministic and under application control so that applications can do smart things with their address space (shadow stacks, emulating `MAP_FIXED`, etc).
Lastly, I think this same strategy applies to TLS variables: TLS variables would be pointers into the heap and we'd need a way for the user application to allocate the memory pointed to by these TLS variables (when new threads are created or for each thread when new modules are loaded with TLS variables; that's why I feel like these two issues are related and can have symmetric solutions).
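The lowering described in this proposal can be sketched in Python as follows. This is only a model under stated assumptions: the heap is a `bytearray`, globals are 32-bit little-endian integers, and the application hands out addresses with a toy bump allocator. `GlobalTable` and the `load_*`/`store_*` helpers are hypothetical names standing in for `Global(a)`, `LoadHeap` and `StoreHeap`.

```python
import struct

HEAP_SIZE = 64 * 1024
heap = bytearray(HEAP_SIZE)  # stand-in for the module's linear memory


class GlobalTable:
    """Hypothetical table mapping each global to an immutable heap address."""

    def __init__(self, base):
        self._next = base  # bump pointer owned by the application
        self._addr = {}

    def declare(self, name, size=4):
        # Allocation is deterministic and under application control.
        if name in self._addr:
            raise ValueError(f"global {name!r} already declared")
        addr = self._next
        self._next += size
        self._addr[name] = addr
        return addr

    def address(self, name):
        # Plays the role of Global(a): a constant int32 address.
        return self._addr[name]


def _check(addr):
    if addr < 0 or addr + 4 > len(heap):
        raise IndexError(f"out-of-bounds heap access at {addr}")


def load_heap(addr):
    _check(addr)
    return struct.unpack_from("<i", heap, addr)[0]


def store_heap(addr, value):
    _check(addr)
    struct.pack_into("<i", heap, addr, value)


def load_global(table, name):
    # LoadGlobal(a) => LoadHeap(Global(a))
    return load_heap(table.address(name))


def store_global(table, name, value):
    # StoreGlobal(a, v) => StoreHeap(Global(a), v)
    store_heap(table.address(name), value)
```

Because a global is just an address here, a plain heap store to that address is visible through the global, which is the aliasing behavior the proposal accepts in exchange for dropping dedicated global ops.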