Add Typed Object support to WebAssembly #1022

Closed


@smvv smvv commented Mar 28, 2017

For interoperability between normal JavaScript and WebAssembly, Typed Objects can be used as a bridge because the object's memory layout is defined by its typed descriptor. Typed Objects are always unboxed and therefore provide fast read and write access from both WebAssembly and normal JavaScript. The "core" design of Typed Objects is mandatory for implementing object support in WebAssembly, while the object-oriented features are not.
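
As a rough illustration of "the layout is defined by the descriptor", here is a minimal sketch in TypeScript. It does not use the Typed Objects API itself; the descriptor format, the scalar kinds, the field names, and the natural-alignment rule are all assumptions made for the example. The point is only that once a descriptor fixes each field's offset and kind, both sides can read and write the same bytes without boxing.

```typescript
// Hypothetical descriptor format: the scalar kinds and field names are
// illustrative and not taken from the proposal text.
type Scalar = "int8" | "uint8" | "int32" | "float64";

const SIZE: Record<Scalar, number> = { int8: 1, uint8: 1, int32: 4, float64: 8 };

interface FieldLayout {
  kind: Scalar;
  offset: number;
}

interface StructLayout {
  size: number;
  fields: Record<string, FieldLayout>;
}

// Assumption: each field is aligned to its own size (natural alignment).
function layoutOf(decl: Array<[string, Scalar]>): StructLayout {
  const fields: Record<string, FieldLayout> = {};
  let offset = 0;
  let maxAlign = 1;
  for (const [name, kind] of decl) {
    const align = SIZE[kind];
    offset = Math.ceil(offset / align) * align; // round up to the alignment
    fields[name] = { kind, offset };
    offset += SIZE[kind];
    maxAlign = Math.max(maxAlign, align);
  }
  // Pad the total size to the largest alignment seen.
  return { size: Math.ceil(offset / maxAlign) * maxAlign, fields };
}

// Unboxed access: the descriptor alone decides which bytes are touched.
function readField(view: DataView, f: FieldLayout): number {
  switch (f.kind) {
    case "int8": return view.getInt8(f.offset);
    case "uint8": return view.getUint8(f.offset);
    case "int32": return view.getInt32(f.offset, true);
    case "float64": return view.getFloat64(f.offset, true);
  }
}

function writeField(view: DataView, f: FieldLayout, value: number): void {
  switch (f.kind) {
    case "int8": view.setInt8(f.offset, value); break;
    case "uint8": view.setUint8(f.offset, value); break;
    case "int32": view.setInt32(f.offset, value, true); break;
    case "float64": view.setFloat64(f.offset, value, true); break;
  }
}

const point = layoutOf([["tag", "uint8"], ["x", "int32"], ["y", "float64"]]);
const storage = new DataView(new ArrayBuffer(point.size));
writeField(storage, point.fields["x"], -7);
writeField(storage, point.fields["y"], 1.5);
```

With these assumptions `tag` lands at offset 0, `x` at offset 4 and `y` at offset 8, so the whole struct occupies 16 bytes.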

Non goals
Support for interoperability between arbitrary JavaScript objects and WebAssembly is not feasible at the moment. Typed Objects provide a way of interfacing with the same memory without worrying about whether a value is stored as its raw value or in a NaN-boxed format.

TODOs

  • Describe how arrays can be used and constructed.
  • Describe how field names of a type are generated / exported to JavaScript.

@AndrewScheidecker

Does this need to bump the binary version? My understanding was that the version could stay the same as long as there aren't changes to existing semantics, and AFAICT this proposal qualifies. If the version stays the same though, it would be good to replace the version tag on the new opcodes, types, etc with some post-MVP feature label.

AFAICT typed objects can't be shared between multiple threads. If that's the case, should WASM disallow obj-kind globals?

I don't understand why the only way to access fields of a typed object assumes the typed object is stored in the WASM linear memory. Isn't the purpose of typed object support to provide access to typed objects that aren't stored in WASM's linear memory?

@smvv
Author

smvv commented Mar 28, 2017

@AndrewScheidecker typed objects do not reside in linear memory. It is not clear to me whether the memory_immediate is on the operand stack or part of the opcode for Memory-related operators. The object operand of get_field should be on the operand stack. The immediates field_index and type_index are part of the opcode.

It is not possible to allocate the object on the WebAssembly heap, since pointers could be leaked. The opcode new_object should allocate the object on the "GC heap", which is separate from the WebAssembly heap.

@AndrewScheidecker

Note that i32.load8_s(get_field(object, type_index, field_index)) should
only be allowed if field_index is actually of type i8. This may be relaxed
to allow any data type to be read regardless of the field type as far as the
field is not of type obj.

This implies that get_field returns an address in WASM linear memory to be used by load8_s. Since the WASM load/store operators can only access WASM's linear memory, they can't be used for accessing typed object fields.

@smvv
Author

smvv commented Mar 28, 2017

@AndrewScheidecker something that is missing from the document is that the return value of get_field cannot escape. It can only be used by load or store opcodes. We think that introducing the same load and store opcodes for addresses returned from get_field is redundant.

Perhaps adding a gc_heap_ptr to the type system that is verified by the validator will fix the ambiguity. That way, pointers cannot escape, and it allows load and store opcodes to use both memory_immediate and gc_heap_ptr.
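
To make the gc_heap_ptr idea a bit more concrete, here is a minimal sketch of the kind of check a validator could perform. Everything in it is hypothetical: the instruction set is cut down to five operators, the immediates (type_index, field_index) are omitted, and the names `Instr` and `validateBody` are made up for the example. It only shows the rule that a gc_heap_ptr may be consumed by a load but may not flow into a local or a return value.

```typescript
// All names here are hypothetical and exist only for this sketch.
type StackType = "i32" | "obj" | "gc_heap_ptr";

type Instr =
  | { op: "get_local"; type: "i32" | "obj" } // pushes a local; locals never hold gc_heap_ptr
  | { op: "get_field" }                       // obj -> gc_heap_ptr (immediates omitted)
  | { op: "i32.load" }                        // i32 | gc_heap_ptr -> i32
  | { op: "set_local"; type: "i32" | "obj" } // pops a value into a local
  | { op: "return" };                         // pops the function result

// Returns null when the body type-checks, otherwise a short error message.
function validateBody(body: Instr[]): string | null {
  const stack: StackType[] = [];
  for (const ins of body) {
    switch (ins.op) {
      case "get_local":
        stack.push(ins.type);
        break;
      case "get_field":
        if (stack.pop() !== "obj") return "get_field expects an obj operand";
        stack.push("gc_heap_ptr");
        break;
      case "i32.load": {
        const addr = stack.pop();
        if (addr !== "i32" && addr !== "gc_heap_ptr") {
          return "i32.load expects an i32 address or a gc_heap_ptr";
        }
        stack.push("i32");
        break;
      }
      case "set_local": {
        const value = stack.pop();
        if (value === "gc_heap_ptr") return "gc_heap_ptr cannot escape into a local";
        if (value !== ins.type) return "set_local type mismatch";
        break;
      }
      case "return": {
        const value = stack.pop();
        if (value === undefined) return "return with an empty stack";
        if (value === "gc_heap_ptr") return "gc_heap_ptr cannot escape via return";
        break;
      }
    }
  }
  return null;
}

// A field read that is immediately consumed by a load is accepted.
const accepted = validateBody([
  { op: "get_local", type: "obj" },
  { op: "get_field" },
  { op: "i32.load" },
  { op: "return" },
]);

// Returning the raw pointer is rejected.
const rejected = validateBody([
  { op: "get_local", type: "obj" },
  { op: "get_field" },
  { op: "return" },
]);
```

Under this rule the existing load opcodes can be reused as-is, because the operand type tells the validator which heap is being addressed.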

@lukewagner
Member

lukewagner commented Mar 28, 2017

I think we do indeed want statically-typed GC objects in wasm that are exposed to JS as Typed Objects (likely after some refinements to that proposal). This will take a large amount of work and a lot of prototyping (e.g., building up a toolchain analogous to Emscripten for a GC'd source language) to thoroughly vet the new design. Subtle design choices will have large implications for the GC algorithms in engines, so a lot of engine experimentation will be necessary as well. All this is to say that it'll probably be hard to get traction on a PR without having this preparatory work done yet to inform analysis of the proposal.

Does this need to bump the binary version?

Agreed with @AndrewScheidecker that we should aim to never bump the binary version: new opcodes, sections, etc should be addable in a backwards-compatible manner although code using new, not-yet-fully-deployed features will need to feature test via WebAssembly.validate.
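
For reference, a feature test along those lines could look roughly like the sketch below. `WebAssembly.validate` is the real API; the probe bytes are an assumption for illustration. They declare a single empty struct type using 0x5f as the type form, which is the encoding from the later GC proposal as far as I know, and is not something this PR defines. The helper name `supportsModule` is made up. The global is reached through `globalThis` so the snippet compiles whether or not WebAssembly typings are loaded.

```typescript
// Header of an empty module: "\0asm" magic followed by version 1.
const EMPTY_MODULE = new Uint8Array([0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00]);

// ASSUMPTION: probe for struct types. Header, then a type section
// (id 0x01, size 3) with one entry: form 0x5f (struct) and zero fields.
const STRUCT_PROBE = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,
  0x01, 0x03, 0x01, 0x5f, 0x00,
]);

// Returns true only if the engine exists and accepts the given bytes.
function supportsModule(probe: Uint8Array): boolean {
  const wasm = (globalThis as any).WebAssembly;
  return typeof wasm === "object" && wasm.validate(probe) === true;
}

const hasWasm = supportsModule(EMPTY_MODULE);
const hasStructTypes = supportsModule(STRUCT_PROBE); // depends on the engine
```

The binary version stays at 1 in both modules; only the probe's content decides whether an engine accepts it.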

@alexp-sssup

building up a toolchain analogous to Emscripten for a GC'd source language

@lukewagner We think that Cheerp (http://leaningtech.com/cheerp/) would serve that purpose. It's a FOSS product that we have developed over the last several years at Leaning Technologies (I am CTO of the company and @smvv is a senior engineer). Cheerp generates fully GC-able JavaScript from C++ by smartly mapping C++ objects to JS objects. With this product we have successfully ported multiple large-scale applications and games to the Web. We are also currently in beta with a Java-oriented product called CheerpJ (http://leaningtech.com/cheerpj/, closed source at this time), which also maps Java objects to JS objects. We have written this draft based on our long experience developing these products and porting applications with them.

We are currently working to introduce WAST support in Cheerp. We chose to emit WAST to make manual tinkering with the output easier; the wast2wasm tool can then be used to get the binary version. We plan to prototype and test our ideas for Typed Objects support on top of WAST output as soon as we possibly can but, given the importance of the topic, we wanted to jumpstart the conversation and begin the consensus-building process as early as possible by releasing this draft.

@lukewagner
Member

Ah thanks for that background and very excited to learn about CheerpJ! That does indeed sound like a great way to test out the design on Java and thanks for starting the conversation early.

@jfbastien
Member

Yes, this is a great way to start. I'd like more than just toolchain implementation experience though: I think at minimum one VM, if not two, need to have implemented this and perform well. I know this sets the bar super high, but it's one of the more complex future features!

FWIW I'm really happy more than one toolchain will try it out! Having just Emscripten would be suboptimal in the same way having just one VM would be.

@smvv
Author

smvv commented Mar 30, 2017

One problem that we see with this proposal is that Typed Objects support the following types:

uint8, int8, uint16, int16, uint32, int32, float32, float64, any, string, object

while this proposal supports exporting the following value types:

i8, i16, i32, i64, f32, f64, any, obj.

The problem is that signedness is not encoded in WebAssembly value types, since the opcodes determine the signedness of the operation. Exporting StructTypes from WebAssembly to JavaScript requires knowing which fields are signed or unsigned, because the JS side needs to know whether the value should be sign-extended or zero-extended.
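
A small illustration of why this matters, using only standard `DataView` calls: the same stored byte reads back as two different numbers depending on whether the reader sign-extends or zero-extends it. The helper names `signExtend8` and `zeroExtend8` are made up for the example; they mirror what i32.load8_s and i32.load8_u do on the WebAssembly side.

```typescript
// One byte with all bits set, as it would sit in an 8-bit field.
const cell = new DataView(new ArrayBuffer(1));
cell.setUint8(0, 0xff);

// JS must pick an interpretation when it reads the field.
const readAsInt8 = cell.getInt8(0);   // sign-extended
const readAsUint8 = cell.getUint8(0); // zero-extended

// The same two interpretations written as 32-bit integer arithmetic.
function signExtend8(byte: number): number {
  return (byte << 24) >> 24; // arithmetic shift copies the sign bit down
}

function zeroExtend8(byte: number): number {
  return byte & 0xff; // upper bits are cleared
}
```

Here `readAsInt8` is -1 while `readAsUint8` is 255, and for bytes below 0x80 the two readings agree, which is why the ambiguity only shows up for some values.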

One solution to this problem would be extending the value_types to:

i8, u8, i16, u16, i32, u32, f32, f64, any, obj.

Note that i64 and u64 are missing from the list, because they are not (yet?) supported in all JS implementations. Strings are missing as well, because they have lower priority for the Typed Object MVP, since we encode strings as typed arrays. The validator should check that str, i64 and u64 are not part of a StructType.

The validator should also check that the locals remain i32 or i64, and the type checker should accept storing an i32 in any of the unsigned field types.
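
Sketched as code, the two checks could look something like this. The list of field types is the extended list above; the function names are hypothetical, and allowing an i32 store into the signed integer fields as well as the unsigned ones is my assumption about the intended rule.

```typescript
// Field types from the extended value_types list above.
const EXPORTABLE_FIELD_TYPES = [
  "i8", "u8", "i16", "u16", "i32", "u32", "f32", "f64", "any", "obj",
];

// Returns the field types of a StructType that the validator would reject
// (for example str, i64 or u64). An empty result means the struct is fine.
function rejectedFieldTypes(fieldTypes: string[]): string[] {
  return fieldTypes.filter((t) => EXPORTABLE_FIELD_TYPES.indexOf(t) === -1);
}

// ASSUMPTION: an i32 on the operand stack may be stored into any integer
// field of 32 bits or fewer, regardless of the field's signedness.
const I32_STORABLE_FIELD_TYPES = ["i8", "u8", "i16", "u16", "i32", "u32"];

function acceptsI32Store(fieldType: string): boolean {
  return I32_STORABLE_FIELD_TYPES.indexOf(fieldType) !== -1;
}
```

For example, `rejectedFieldTypes(["i32", "i64", "str"])` reports `i64` and `str`, while a struct of `["u8", "f64", "obj"]` passes.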

Thoughts?

@smvv
Author

smvv commented Mar 30, 2017

@lukewagner @jfbastien we are already working on a prototype in SpiderMonkey. SM does have support for Typed Objects in the JS shell, and I've submitted many patches to SM but only a few to V8, so SM is our best JS engine candidate for now. I'll report my findings and progress in this bug report.

We have preliminary patches for WABT to encode object support in WAST into the binary format. We use that to test the design in SpiderMonkey.

@lukewagner
Member

Since there are likely to be a lot of independent threads of discussion all related to this one overall proposal, and since we'll probably want to iterate on both the spec/interpreter and spec text changes extensively, how about we create a new CG repo (under github.com/webassembly) with a branch of the spec repo? (Instead of changing docs in the design repo as done in this PR, the proposed changes could instead be in a standalone .md file that described all the changes at a high-level, like #1019 is doing in Threads.md. We've discussed having a /proposals subdir of the spec repo for these before.)

How does that sound?

@lukewagner
Member

Ok, done. Perhaps we can close this PR now and break it down into individual issues/PRs in the new repo?

@lukewagner
Member

Righto, closing. Feel free to move discussion to the new gc repo; looks like it's already started with gc/#1 :)

@lukewagner lukewagner closed this Apr 10, 2017
5 participants