Add data segments to binary format #301
Add a description of data segments, which are a way that the binary module can load initialized data into memory, similar to a .data section.
lgtm. It'd be nice to link to and from Modules.md#initial-state-of-linear-memory. |
IIUC this means no addition to AST semantics, since the toolchain provides the address (or addresses) that the data segment is loaded at. Code then loads directly from that address, without any indirection / relocation / symbol. Correct? |
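To make the point above concrete, here is a minimal Python sketch of what loading data segments could look like. It is an illustration only: `DataSegment`, `instantiate_memory`, and `load_i32` are hypothetical names, not part of the proposal, and the segment layout is assumed to be just an absolute address plus raw bytes.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class DataSegment:
    address: int  # absolute offset in linear memory, chosen by the toolchain (assumption)
    data: bytes   # initial contents copied in at load time


def instantiate_memory(size: int, segments: List[DataSegment]) -> bytearray:
    """Return zero-filled linear memory with every segment copied in."""
    memory = bytearray(size)
    for seg in segments:
        end = seg.address + len(seg.data)
        if seg.address < 0 or end > size:
            raise ValueError(f"segment at {seg.address} does not fit in {size} bytes")
        memory[seg.address:end] = seg.data
    return memory


def load_i32(memory: bytearray, address: int) -> int:
    """Little-endian 32-bit load straight from an address, with no relocation step."""
    return int.from_bytes(memory[address:address + 4], "little")


# The "toolchain" decided that a 32-bit constant lives at 8 and a string at 16.
memory = instantiate_memory(
    64,
    [DataSegment(8, (1234).to_bytes(4, "little")), DataSegment(16, b"hi\0")],
)
value = load_i32(memory, 8)
```

Code only ever sees the addresses the toolchain baked in, which is why nothing needs to change in the AST semantics under this reading.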
In the MVP, that makes sense. With dynamic linking, though, I think we'll need to have global variables that are immutable, load-time initialized pointers into the heap where data sections are loaded (I explained this more in #154). This could be achieved by specifying that, in a shared module (in the sense of |
Yes, these data segments would basically be a way to initialize an area of We could explore program control of loading of data segments as a further On Mon, Aug 17, 2015 at 10:14 PM, JF Bastien [email protected]
|
@titzer If we have the ability to efficiently copy into linear memory from outside memory and from files (
|
@lukewagner are you proposing that main modules be able to have a data section, but not dynamically loaded modules? I'm not sure I'm clear. I agree that this interacts tightly with dynamic linking, and it would be good to have a nicely unified solution. |
@jfbastien Nope: both would be able to have data sections: the difference is that main modules would be able to absolutely position their data sections in linear memory while dynamically-loaded modules would need to rely on const-global-ptrs that were declared by the data section. |
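The asymmetry described above can be sketched roughly as follows. This is a hedged illustration, not a proposed design: `Segment`, `load_module`, and the `next_free` cursor are hypothetical, and the loader's placement policy (simply appending after the last used byte) is an assumption made to keep the example short.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Segment:
    data: bytes
    address: Optional[int] = None      # set by a main module: absolute placement
    global_name: Optional[str] = None  # set by a shared module: name of its const pointer


def load_module(
    memory: bytearray, segments: List[Segment], next_free: int
) -> Tuple[Dict[str, int], int]:
    """Copy segments into memory; return (const global pointers, new next_free)."""
    const_globals: Dict[str, int] = {}
    for seg in segments:
        if seg.address is not None:
            base = seg.address          # main module: the module picks the address
        else:
            base = next_free            # shared module: the loader picks the address
            next_free += len(seg.data)
            const_globals[seg.global_name] = base  # initialized once at load time
        if base + len(seg.data) > len(memory):
            raise ValueError("segment does not fit in linear memory")
        memory[base:base + len(seg.data)] = seg.data
    return const_globals, next_free


memory = bytearray(32)
# Main module places its data at an absolute address.
main_globals, free = load_module(memory, [Segment(b"MAIN", address=0)], next_free=16)
# Shared module learns its data address through a load-time initialized global.
lib_globals, free = load_module(memory, [Segment(b"LIB!", global_name="lib_data")], free)
```

In this sketch the main module's code can hard-code address 0, while the shared module's code would have to read `lib_data` to find its segment.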
In that case it kind of seems like doing |
We could force main modules to do the same thing as shared modules, but that would effectively be strictly taking away useful functionality from main modules:
It does make sense that, for symmetry, we could allow main modules to use symbolic globals to refer to data sections, but until we have dynamic linking, that will be a superfluous feature. |
You may be right. I would however like us to try to avoid designing two features when we know up front we could design one that'll serve both purposes. Could we let dynamically-linked modules decide where their data section is loaded? That would address your first point. Constant folding: relocations and/or patching could take care of this? |
I think there's just fundamental asymmetry between main and shared modules. A main module knows it has the whole [0, For constant folding: I'm thinking compound expression trees that include global addresses at the leaves that could otherwise be folded at compile-to-wasm time. |
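A small sketch of the constant-folding concern, under the assumption that address expressions are simple trees of constants, global addresses, and additions. The tuple encoding and the `fold` helper are hypothetical and exist only for illustration.

```python
def fold(expr, known_addresses):
    """Fold an expression tree, substituting any global whose address is known.

    Nodes are tuples: ("const", n), ("global", name), or ("add", left, right).
    """
    kind = expr[0]
    if kind == "const":
        return expr
    if kind == "global":
        name = expr[1]
        if name in known_addresses:
            return ("const", known_addresses[name])
        return expr  # address only known at load time: stays symbolic
    if kind == "add":
        left = fold(expr[1], known_addresses)
        right = fold(expr[2], known_addresses)
        if left[0] == "const" and right[0] == "const":
            return ("const", left[1] + right[1])
        return ("add", left, right)
    raise ValueError(f"unknown node kind: {kind}")


# &table + (8 + 4), e.g. the address of a field inside a static table.
expr = ("add", ("global", "table"), ("add", ("const", 8), ("const", 4)))

# Main module: the toolchain knows the absolute address, so the tree folds away.
folded_main = fold(expr, {"table": 1024})

# Shared module: the address is a load-time global, so an add survives.
folded_shared = fold(expr, {})
```

With absolute placement the whole tree collapses to one constant at compile-to-wasm time; with a symbolic leaf, part of the tree has to be evaluated (or patched) later.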
The main use case I see is that a module has a complete and efficient On Mon, Aug 17, 2015 at 11:23 PM, Luke Wagner [email protected]
|
@titzer Totally agreed on that use case; maybe I misunderstood what you were asking. To be clear, I think dynamically-linked modules should have their own data segments that are copied into memory when the module is dynamically linked (see discussion with @jfbastien above). I thought you were asking for some sort of API to load data segments at arbitrary times (not just dynamic link time). |
@titzer could you clarify what you mean by "outside linking"? Main and dynamic modules inherently rely on a loader. A few thoughts: Say a user wants some basic ASLR for their in-app data, and only has a single module (no dso). How would they achieve this? IIUC the current proposal is that they'd manually copy the automatically loaded data, and then use it as regular heap memory? How does user code implement the basic allocator for heap space? The allocator has to figure out where the data starts and ends, and stay clear of it? That means the generic allocator we auto-link into user code has to know this. This is resolvable, but I want to make sure we design this knowingly. |
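One way the allocator question could be resolved is for the toolchain to hand the allocator the end of the highest data segment. The sketch below assumes exactly that; `heap_start`, `BumpAllocator`, and the 16-byte alignment are hypothetical choices, and nothing in this thread specifies how the real allocator would learn the bounds.

```python
from typing import List, Tuple


def align_up(value: int, alignment: int) -> int:
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment


def heap_start(segments: List[Tuple[int, int]], alignment: int = 16) -> int:
    """First aligned address past every (address, size) data segment."""
    end = max((addr + size for addr, size in segments), default=0)
    return align_up(end, alignment)


class BumpAllocator:
    """Trivial allocator that never hands out memory below `start`."""

    def __init__(self, start: int, limit: int, alignment: int = 16) -> None:
        self.next = start
        self.limit = limit
        self.alignment = alignment

    def alloc(self, size: int) -> int:
        ptr = align_up(self.next, self.alignment)
        if ptr + size > self.limit:
            raise MemoryError("out of linear memory")
        self.next = ptr + size
        return ptr


# Two data segments occupying [8, 12) and [16, 19).
start = heap_start([(8, 4), (16, 3)])
allocator = BumpAllocator(start, limit=64)
first = allocator.alloc(10)
second = allocator.alloc(4)
```

Here the heap begins at the first aligned address after the data, so allocations cannot overlap the preloaded segments.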
On Tue, Aug 18, 2015 at 5:33 PM, Luke Wagner [email protected]
|
On Tue, Aug 18, 2015 at 6:01 PM, JF Bastien [email protected]
|
I think we're getting into more design than what a PR should contain. Would On Tue, Aug 18, 2015 at 9:17 AM titzer [email protected] wrote:
I wasn't suggesting the wasm engine help for ASLR as much as avoid getting
That's indeed what my question leads to. So how does it figure this out? :-) |
Merging based on above LGTM from @lukewagner |