Requirements for JS Interop #235

Closed
RossTate opened this issue Aug 10, 2021 · 10 comments

Comments

@RossTate
Contributor

@littledan brought up the issue of JS Interop during today's meeting. I did find this conversation in the PR for the tentative Requirements document, but in that conversation it's suggested that we flesh this out more. So I'm starting this thread to do so. To get things started, here are some concrete requirements that I believe the MVP should provide (or at least have a thought-out plan for) based on what people compiling to JS already get and on what JS is looking to add (and on what I know is possible).

  1. The ability to declare some (if not all) reference types as being coercible to JS references (i.e. externref), with the performance expectation that these coercions (or subtypings) are no-ops (or at least very cheap).
  2. The ability for coerced wasm references to be accessed from JS like other JS objects. Specifically:
    1. named fields
    2. named methods
    3. indexable like arrays (just integer indices?)
  3. The ability for the JS-shape of these coerced wasm references to be determined and fixed at allocation time (immutable shapes seem to be part of the plan in the works for adding more parallelism to JS)
  4. The ability for a module to downcast externref (or the like?) values to its own reference types (with restrictions, such as using the right rtt)
    1. At the same time, the ability to guarantee that JS (and other wasm modules) can only access the internals of these references through the explicitly provided named fields or methods or other JS decorations (i.e. when I coerce to externref, my references are a black box to everyone else except for the explicit decorations I gave it)
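
As a point of reference for what compiling to JS already provides, here is a minimal TypeScript sketch of requirements 2, 3, and 4.1 using plain JS features only. The `Account` class, its members, and the `balances` map are hypothetical names chosen for illustration; this is not a Wasm GC API, just the JS-side behaviour the list asks interop to match.

```typescript
// Hypothetical stand-in for a wasm-allocated object as seen from JS.
// Internals live in a module-private WeakMap, so outside code can only reach
// the explicitly exposed members (requirement 4.1).
const balances = new WeakMap<object, number>();

class Account {
  readonly owner: string; // named field (2.1)

  constructor(owner: string, initial: number) {
    this.owner = owner;
    balances.set(this, initial);
    // Fix the JS-visible shape at allocation time (requirement 3).
    Object.freeze(this);
  }

  // Named method (2.2); it updates hidden state, not the frozen object itself.
  deposit(amount: number): number {
    const next = (balances.get(this) ?? 0) + amount;
    balances.set(this, next);
    return next;
  }
}

const acct = new Account("ada", 10);
const total = acct.deposit(5);
const visibleKeys = Object.keys(acct);
```

Only `owner` shows up as an own property of `acct`; the balance is reachable solely through `deposit`, and the frozen object cannot gain or lose properties afterwards.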

As I said, those are what I believe we should provide, as they are what I would expect if I were compiling to JS. I'm interested to hear what others believe. After hearing various thoughts, maybe we can develop a variant of #121 for specifically JS interop.

@jakobkummerow
Contributor

My thoughts:
Bare minimum:

  • pass objects back and forth across the boundary
  • read/write struct/array fields
    • accessing array elements by index seems pretty obvious
    • in the absence of a way to name struct fields there'd probably be autogenerated names like my_struct.$field0 or somesuch. That's obviously not very ergonomic for human programmers, but might be sufficient if it turns out that these objects are usually handled by glue code generated by the same toolchain that produced the Wasm module.

Nice to have:

  • a way to provide names for struct fields. (I don't think the name section serves that need, as it's optional, whereas struct field names would be meaningful for JS interop, i.e. if the JS side relies on them, then whatever mechanism provides them is not optional.)
  • a way to specify methods. (In the absence of that: export free functions that take the object as first parameter.)
  • possibly: a way to partially expose objects (e.g.: hide some fields, expose some fields as read-only that the Wasm type considers mutable). Not sure whether that's really useful. Would likely have a performance/memory cost, as it'd mean that the object would effectively have two types associated with it (and hence stored on it): a Wasm type and a JS type.
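
The fallback described above (autogenerated field names plus exported free functions) might look roughly like this from the JS side. Everything here is hypothetical: `RawPoint` stands in for a Wasm struct, the `$field0`/`$field1` names follow the naming guessed at above, and `point_norm1` stands in for a function exported by the module.

```typescript
// Hypothetical shape of a Wasm struct exposed with autogenerated field names.
interface RawPoint {
  $field0: number;
  $field1: number;
}

// Stand-in for an exported free function that takes the object as its first parameter.
function point_norm1(p: RawPoint): number {
  return Math.abs(p.$field0) + Math.abs(p.$field1);
}

// Toolchain-generated glue that gives the fields and functions readable names.
const pointGlue = {
  getX: (p: RawPoint): number => p.$field0,
  getY: (p: RawPoint): number => p.$field1,
  norm1: (p: RawPoint): number => point_norm1(p),
};

const rawPoint: RawPoint = { $field0: 3, $field1: -4 };
const norm = pointGlue.norm1(rawPoint);
```

In this arrangement the unergonomic names never reach hand-written code, which is the situation in which the bare-minimum option could be sufficient.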

Open questions:

  • What about prototypes (on the JS side)?
    • Minimal solution: prototype is always null.
    • But people might want to be able to install methods on the prototype. That raises a bunch of difficult questions though, and needs clarity on the Wasm side type system questions before we can properly dig into those.
  • Passing Wasm objects to JS is easy; passing typed objects back from JS to Wasm is harder: if we want a type check to happen magically, then the RTT must be provided somehow.
    • One possible alternative: a Wasm function consuming objects from JS must always type them as dataref and perform an explicit cast.
    • Another possible alternative: if we switched to nominal types with implicit RTTs, that would solve this as a by-product.
  • Relationship with JS "Typed Objects" proposal/explorations. Same? Compatible/subset?
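
The "explicit cast" alternative can be pictured with a JS analogue: values arriving from outside are typed loosely and must pass a checked downcast before use. In this sketch a module-private `WeakSet` plays the role of the RTT; `Celsius`, `mintCelsius`, and `castToCelsius` are hypothetical names, and nothing here is an actual Wasm instruction or API.

```typescript
// Module-private registry standing in for a runtime type (RTT).
const celsiusBrand = new WeakSet<object>();

interface Celsius {
  readonly degrees: number;
}

// Allocation records the object's runtime type.
function mintCelsius(degrees: number): Celsius {
  const value: Celsius = Object.freeze({ degrees });
  celsiusBrand.add(value);
  return value;
}

// Checked downcast from an untyped incoming reference.
function castToCelsius(value: unknown): Celsius {
  if (typeof value === "object" && value !== null && celsiusBrand.has(value)) {
    return value as Celsius;
  }
  throw new TypeError("cast failed: not a Celsius allocated by this module");
}

const genuine = mintCelsius(21);
const lookalike = { degrees: 21 };
```

`castToCelsius(genuine)` returns the same object, while `castToCelsius(lookalike)` throws even though the two have identical fields, because the check is on allocation identity and not on structure.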

Quick comments on @RossTate 's sketch:
re 1. I don't really see the need to involve externref at all, can you elaborate on why you think that's needed?
re 3. I think fixed shape is pretty much a given: Wasm objects have a statically known shape, exposing them to JS doesn't change that.

@RossTate
Contributor Author

Thanks for the great thoughts! I'll respond to your comments, since one's a question, and the other shows that I need to clarify something:

> re 1. I don't really see the need to involve externref at all, can you elaborate on why you think that's needed?

My understanding is that externref on the web embedder is essentially a JS reference, so wasm-refs-as-js-refs essentially amounts to wasm-refs-as-externrefs-in-the-web-embedder in some form or another (at the very least through ToJSValue and ToWebAssemblyValue in the JS API). That said, I'm not tied to this particular way of expressing the idea; I think I was essentially jumping to the suggestion you made about dataref, in different terms.

> re 3. I think fixed shape is pretty much a given: Wasm objects have a statically known shape, exposing them to JS doesn't change that.

Sorry, it seems I used the wrong term. By shape I meant something more like hidden class, i.e. the description of how field and method accesses work for the given object, rather than the memory layout of the object. My understanding is that, for JS interop in a multithreaded setting (with shareable wasm references), it will be important that shareable JS values have fixed hidden classes so that critical engine optimizations cannot be broken by other threads, and consequently shareable wasm references will need to have fixed hidden classes as well.


One comment on @jakobkummerow's thoughts as it pulls out an assumption I was making and should make explicit:

> possibly: a way to partially expose objects (e.g.: hide some fields, expose some fields as read-only that the Wasm type considers mutable). Not sure whether that's really useful. Would likely have a performance/memory cost, as it'd mean that the object would effectively have two types associated with it (and hence stored on it): a Wasm type and a JS type.

I would expect hiding a field to have no performance or memory cost: you're simply not giving it a name in JS, and so it's not accessible in JS. Also, I expect any ergonomic interop will have two views for any of these coercible reference types: the wasm view (which is very low level, e.g. field offsets and such), and the JS view (which has named fields and methods and such). If we don't do this, then I suspect the two-views thing will likely happen anyways because people will build JS proxies for wasm refs on the JS side in order to get ergonomic interop (with much less efficiency). So, admittedly, I'm implicitly assuming that a good JS-interop story will have some sort of two-views story to it, though that's my assumption and not necessarily others'.
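
To make the proxy concern concrete, here is roughly what a hand-built JS-side view could look like. `RawRef` and `rawCounter` are hypothetical stand-ins for a Wasm reference with positional fields, and `makeView` is an illustrative helper, not a proposed API.

```typescript
// Stand-in for a low-level wasm reference: fields addressed by position.
type RawRef = number[];

// Build a named view over a raw reference. Every call allocates a new Proxy,
// and every property access goes through a handler.
function makeView(raw: RawRef, fieldNames: string[]): Record<string, number> {
  return new Proxy({} as Record<string, number>, {
    get(_target, prop) {
      const index = typeof prop === "string" ? fieldNames.indexOf(prop) : -1;
      return index >= 0 ? raw[index] : undefined;
    },
    has(_target, prop) {
      return typeof prop === "string" && fieldNames.indexOf(prop) >= 0;
    },
  });
}

const rawCounter: RawRef = [7, 100];
const viewA = makeView(rawCounter, ["count", "limit"]);
const viewB = makeView(rawCounter, ["count", "limit"]);
```

Both views read the same underlying data, but `viewA !== viewB`: each crossing costs an allocation, and the wrapper's identity no longer matches the identity of the reference it wraps.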

@takikawa
Contributor

takikawa commented Aug 12, 2021

I'm quite interested in the JS API for the GC proposal as well, and broadly agree with a lot of what's been said here. I agree that it would be great to be able to manipulate GC objects in an ergonomic way with names from JS, so that mixed JS and source-language programs work well. Ideally, casting in interop would also work in both directions.

> The ability to declare some (if not all) reference types as being coercible to JS references (i.e. externref), with the performance expectation that these coercions (or subtypings) are no-ops (or at least very cheap).

Just to clarify this point, is this essentially saying if you export Wasm GC objects in the Wasm→JS direction it should be possible and cheap? That would make sense to me.


Re: nominal types and the JS API, since there's now some discussion of prototyping a minimal addition of nominal types in parallel with trying structural types, I think it'd be interesting to consider how the JS API should look in that case.

In particular, would it be a requirement that the API provide the same abilities no matter which types you use? I could imagine that you get different API features depending on your types. For example, it could be the case that you get automatic downcasts if you use the nominally typed subset (like Jakob noted above), and have to do more work to get interop if you use structural types.

@RossTate
Contributor Author

Thanks for the thoughts, @takikawa!

> Just to clarify this point, is this essentially saying if you export Wasm GC objects in the Wasm→JS direction it should be possible and cheap? That would make sense to me.

Yes, though let me refine that. For "possible", I wouldn't necessarily require all wasm GC objects to cross into JS. I think it would be fine if wasm GC objects had to be explicitly marked/allocated in some fashion in order to cross into JS. For example, there are lots of GC objects that just implement machinery of the language runtime and really have no need to go into JS. (That said, I'm not opposed to having all wasm GC objects be coercible—just noting where I see some flexibility.) For "cheap", based on what we've seen in the gradual-typing language-interop space, my recommendation would be that coercions (in either direction) should not require memory allocation for the typical case. In particular, my recommendation would be to avoid approaches requiring allocating proxies for ergonomic interop (though, of course, modules wanting a proxy-based interop semantics should be able to implement proxies themselves using the lower-level wasm and JS-interop primitives, thereby opting into the associated costs without imposing such costs upon others).

@sjrd

sjrd commented Oct 13, 2021

To be able to compile Scala.js to Wasm+GC, in addition to the capabilities laid out in the first message, we would need the following capabilities:

  • Regarding "The ability for coerced wasm references to be accessed from JS like other JS objects.", in addition to named methods, we would need:
    • named properties with getter/setter
  • Be able to somehow declare that some classes of Wasm objects are for error objects, i.e., objects that would have the [[ErrorData]] internal field, so that they are treated as exception objects (receiving stack traces and other debugging-related niceties).

Together with named methods, named properties are necessary to implement member exports of Scala.js. We can annotate methods and properties of Scala classes with @JSExport, and they become available to JavaScript.

The error-object capability is necessary for us to define the Throwable class and all its subclasses. It is a Scala class (so it would become a Wasm class/object), but it needs to be treated as an error class by JavaScript debuggers for a decent debugging experience. In JavaScript, we do this by making Throwable extend JavaScript's builtin Error class.
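
For reference, this is approximately the JS-side shape the two requirements describe when the target is plain JS. `ThrowableLike` and `Temperature` are hypothetical names for illustration, and the code is a hand-written sketch, not actual Scala.js output.

```typescript
// An error class: extending the built-in Error gives instances the
// [[ErrorData]] internal slot, which is what tools use to recognise errors.
class ThrowableLike extends Error {
  constructor(message: string) {
    super(message);
    this.name = "ThrowableLike";
  }
}

// A named property backed by a getter/setter pair instead of a plain field.
class Temperature {
  private kelvin = 273.15;

  get celsius(): number {
    return this.kelvin - 273.15;
  }

  set celsius(value: number) {
    this.kelvin = value + 273.15;
  }
}

const failure = new ThrowableLike("boom");
const temp = new Temperature();
temp.celsius = 100;
```

A Wasm GC object would need some equivalent of both: a way to be recognised as an error by the host, and a way to run code on property reads and writes.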

@RossTate
Contributor Author

Ah, those both make sense, and I suspect they're also useful for other languages already discussed but were overlooked. Thanks!

@askeksa-google

For dart2wasm, one challenge is weak maps. A way to implement them would be to pass the Wasm objects out to JS and use the JS WeakMap.

This approach requires that as part of the WasmGC/JS interop, it is specified that Wasm objects can be used as keys in the JS WeakMap, and that their identity from the point of view of the map follows the identity of the Wasm object, even if it is passed out to JS multiple times. So this precludes any wrapping by the engine, or at least requires the WeakMap to look inside such wrapping.
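
The identity requirement can be illustrated with ordinary JS objects. In this sketch `wasmObj` stands in for a single Wasm GC object, `passThrough` models a boundary crossing that hands out the same reference each time, and `wrap` models a hypothetical engine that allocates a fresh wrapper on every crossing.

```typescript
const metadata = new WeakMap<object, string>();

// Stand-in for a single Wasm GC object.
const wasmObj = { id: 1 };

// Crossing without wrapping: the same reference is handed out each time.
const passThrough = (o: object): object => o;

// Hypothetical engine behaviour: a fresh wrapper per crossing.
const wrap = (o: object): object => ({ wrapped: o });

// First crossing stores an entry, second crossing looks it up.
metadata.set(passThrough(wasmObj), "attached on first crossing");
const foundDirect = metadata.get(passThrough(wasmObj));

metadata.set(wrap(wasmObj), "attached on first crossing");
const foundWrapped = metadata.get(wrap(wasmObj));
```

The lookup succeeds only in the pass-through case; with per-crossing wrappers the second crossing produces a different key and the entry is not found.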

@RossTate
Contributor Author

Interesting use case. More generally, it seems like JS interop should respect object identity. Thanks!

@askeksa-google

Similarly, Dart will also need the capability to register WasmGC objects (structs in particular) in a JS FinalizationRegistry, such that the finalizer is called when the WasmGC object is reclaimed.
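
On the JS side this uses the standard `FinalizationRegistry` API. The sketch below registers a plain object as a stand-in for a WasmGC struct; `structStandIn`, `reclaimed`, and the held value string are hypothetical, and whether Wasm objects are accepted as targets is exactly the open requirement.

```typescript
const reclaimed: string[] = [];

// The callback receives the held value after the target has been collected.
// When (or whether) it runs is up to the engine's garbage collector.
const registry = new FinalizationRegistry<string>((heldValue) => {
  reclaimed.push(heldValue);
});

// Stand-in for a WasmGC struct passed out to JS.
let structStandIn: object | null = { field0: 42 };
const token = {};

registry.register(structStandIn, "struct #1", token);

// Dropping the last strong reference makes the object eligible for collection.
structStandIn = null;

// A registration can also be cancelled through its unregister token.
const wasRegistered = registry.unregister(token);
```

The final `unregister` call is included only to show that the registration took effect; real code that wants the finalizer to run would leave the registration in place.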

@tlively
Member

tlively commented Nov 1, 2022

We have consensus on a "no-frills" JS interop approach for the MVP (#279), so closing this. Feel free to make PRs adding ideas for richer JS interop to the post-MVP doc.
