-
Notifications
You must be signed in to change notification settings - Fork 73
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
First go at MVP proposal #34
Changes from 7 commits
6806bea
1d5d555
5191faa
544af5b
5161350
3205132
9f2ef14
d7ecdbf
f699117
63431e8
60edd03
34f8650
eca0b46
3c2d415
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,272 @@ | ||
# GC v1 Extensions | ||
|
||
|
||
## Language | ||
|
||
Based on reference types proposal. | ||
|
||
### Types | ||
|
||
#### Value Types | ||
|
||
* `eqref` is a new reference type | ||
- `reftype ::= ... | eqref` | ||
|
||
* `ref <typeidx>` is a new reference type | ||
- `reftype ::= ... | ref <typeidx>` | ||
- `ref $t ok` iff `$t` is defined in the context | ||
|
||
* `int31ref` is a new reference type | ||
- `reftype ::= ... | int31ref` | ||
|
||
|
||
#### Type Definitions | ||
|
||
* `deftype` is the new category of types that can occur as type definitions | ||
- `deftype ::= <functype> | <structtype> | <arraytype>` | ||
- `module ::= {..., types vec(<deftype>)}` | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. There's already a types section containing There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes, There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Ok, makes sense. Perhaps tweak wording to say "generalizes the existing types component of modules"? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Done. |
||
|
||
* `structtype` describes a structure with statically indexed fields | ||
- `structtype ::= struct <fieldtype>*` | ||
|
||
* `arraytype` describes an array with dynamically indexed fields | ||
- `arraytype ::= array <fieldtype>` | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. What about the older idea of giving There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I agree that this would really extend language's abilities to find efficient representations for their data types. Engines can still treat structs without a tail specially. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is enabled by allowing nesting of aggregates (structs/arrays), which is currently left for post-MVP. For MVP it seems okay to use an indirection. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Fair enough |
||
|
||
* `fieldtype` describes a struct or array field and whether it is mutable | ||
- `fieldtype ::= <mutability> <storagetype>` | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Missing definition of There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Mutability is in the spec already. A field can be a valtype, which includes reftypes per the reference types proposal. Nested aggregates are post-MVP, since they imply introducing inner pointers. |
||
- `storagetype ::= <valtype> | <packedtype>` | ||
- `packedtype ::= i8 | i16` | ||
|
||
* Unpacking a storage type yields `i32` for packed types, otherwise the type itself | ||
- `unpacked(t) = t` | ||
- `unpacked(pt) = i32` | ||
|
||
|
||
#### Imports | ||
|
||
* `type <typetype>` is an import description with an upper bound | ||
- `importdesc ::= ... | type <reftype>` | ||
- Note: `type` may get additional parameters in the future | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Could you explain the reasoning behind having the type have a There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The bound is a reftype is that you need to be able to use anyref (or eqref) as a bound. Yes, type imports are in the type index space. Lumping them in with the type section rather than with imports would be inconsistent with how we handle other imports. It would create a mess with both the type index space (which should have imports first) and the "argument list" of a module instantiation (which currently simply reflects the import list). If we want to separate type imports from other types than I think we instead should have a separate type import section. But I don't think it's strictly necessary. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Hm, still not sure reftype makes sense as the bound, but I guess I'll wait to see what you propose for nominal types + type imports. In particular, it seems like the import needs to introduce a new type definition; it can't just reference an existing one. Ah ok, so then type imports are appended to the type section so that (1) signatures can refer to type imports by just using the appropriate index (2) the type section can only be validated together with the import section. Yes? It'd be good to mention these things. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Well, yes, except that type imports are still meant to be prepended to the type index space. (The ordering of binary sections doesn't have to imply the ordering of index spaces. As you said, they need to be validated together anyway.) Added comments. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Regarding the bound: In most cases, it will just be FWIW, I'm not sure I understood what you meant by:
Why not? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Ok, I think I understand the proposal a bit better now. I also see the benefit in that, by using a reference-type bound (instead of using a struct/array bound more-directly), one has the more-expressive capability of declaring whether the import is equality-comparable, nullable, etc. |
||
|
||
|
||
#### Exports | ||
|
||
* `type <typeidx>` is an export description | ||
- `exportdesc ::= ... | type <typeidx>` | ||
- `type $t ok` iff `$t` is defined in the context | ||
|
||
|
||
#### Subtyping | ||
|
||
Greatest fixpoint (co-inductive interpretation) of the given rules (implying reflexivity and transitivity). | ||
|
||
* `eqref` is a subtype of `anyref` | ||
- `eqref <: anyref` | ||
- Note: `int31ref` and `anyfunc` are *not* a subtypes of `eqref`, i.e., those types do not expose reference equality | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Hm, I don't disagree, but how does that relate to this item? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Oops, I put this comment on the wrong line: I meant to comment on the below There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Ah, I see. Yes, typo. |
||
|
||
* `nullref` is a subtype of `eqref` | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Would be good to have a hint somewhere (if not here, then in the GC proposal) why we need There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It's from the reference types proposal. Anyref and anyfunc are nullable, so before you have cast down to a concrete ref type successfully it could still be null. |
||
- `nullref <: eqref` | ||
- Note: `nullref` is *not* a subtype of `anyfunc`, `int31ref`or any concrete reference type, i.e., those types are not nullable | ||
|
||
* `int31ref` is a subtype of `anyref` | ||
- `int31ref <: anyref` | ||
|
||
* Any concrete reference type is a subtype of `anyref` | ||
- `ref $t <: anyref` | ||
- Note: concrete reference types are *not* a supertypes of `nullref`, i.e., nut nullable | ||
|
||
* Any function reference type is a subtype of `anyfunc` | ||
- `ref $t <: anyfunc` | ||
- iff `$t = <functype>` | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Also There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes, that is implied by the previous bullet. |
||
|
||
* Any concrete reference type is a subtype of `eqref` if its not a function | ||
- `ref $t <: eqref` | ||
- if `$t = <structtype>` or `$t = <arraytype>` | ||
- or `$t = type rt` and `rt <: eqref` | ||
- TODO: provide a way to make data types non-eq, especially immutable ones | ||
|
||
* Concrete reference types are covariant | ||
- `ref $t1 <: ref $t2` | ||
- iff `$t1 <: $t2` | ||
|
||
* Structure types support width and depth subtyping | ||
- `struct <fieldtype1>* <fieldtype1'>* <: struct <fieldtype2>*` | ||
- iff `(<fieldtype1> <: <fieldtype2>)*` | ||
|
||
* Array types support depth subtyping | ||
- `array <fieldtype1> <: array <fieldtype2>` | ||
- iff `<fieldtype1> <: <fieldtype2>` | ||
|
||
* Field types are covariant if they are immutable, invariant otherwise | ||
- `const <valtype1> <: const <valtype2>` | ||
- iff `<valtype1> <: <valtype2>` | ||
- `var <valtype> <: var <valtype>` | ||
- Note: mutable fields are *not* subtypes of immutable ones, so `const` really means constant, not read-only | ||
|
||
|
||
#### Defaultability | ||
|
||
* Any numeric value type is defaultable | ||
|
||
* A reference value type is defaultable if it is not of the form `ref $t` | ||
|
||
* Locals must have a type that is defaultable. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Without adding a There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think it could be added, but there is a risk though that people will be lazy and default to using There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I am on the fence here. The main benefit of a nullable ref type over anyref is that a downcast to the non-nullable concrete ref type is gonna be slightly cheaper. But you'd still need it. (Or are you suggesting that all ref instructions should also work with nullable refs directly?) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. A downcast from a There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Okay, added There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Thanks, looks good! There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Oh: perhaps also mention table element kinds (or maybe this is solved by validation constraint on There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think But there is another constraint we need: a table definition of non-zero initial size must have a defaultable element type. Added that. |
||
|
||
|
||
### Instructions | ||
|
||
#### Equality | ||
|
||
* `ref.eq` compares two references whose types support equality | ||
- `ref.eq : [eqref eqref] -> [i32]` | ||
|
||
|
||
#### Functions | ||
|
||
* `ref.func` creates a function reference from a function index | ||
- `ref.func $f : [] -> (ref $t)` | ||
- iff `$f : $t` | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I had been on the edge of whether function references were needed for the GC MVP, but the previous discussion about passing closures to and from JS without wrapping each time seemed to boost the priority. Given that, could we also have the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Oh, but closures are different from mere function references, since they have a different runtime representation. Hence they imply a new type, so I considered them post-MVP. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Ah, I had assumed that they would have the same type and that they would also have the same representation. That is, even without a bound dynamic environment, There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I added it to the question section. |
||
|
||
* `call_ref` calls a function through a reference | ||
- `call_ref : [t1* (ref $t)] -> [t2*]` | ||
- iff `$t = [t1*] -> [t2*]` | ||
|
||
|
||
#### Structures | ||
|
||
* `struct.new <typeidx>` allocates a structure of type `$t` and initialises its fields with given values | ||
- `struct.new $t : [t*] -> [(ref $t)]` | ||
- iff `$t = struct (mut t)*` | ||
|
||
* `struct.new_default <typeidx>` allocates a structure of type `$t` and initialises its fields with default values | ||
- `struct.new_default $t : [] -> [(ref $t)]` | ||
- iff `$t = struct (mut t)*` | ||
- and all `t*` are defaultable | ||
|
||
* `struct.get <typeidx> <fieldidx>` reads field `$x` from a structure | ||
- `struct.get $t i : [(ref $t)] -> [t]` | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. For languages with ubiquitous There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Changed. But note that this relies on subsumption, i.e., subtyping being implicit. If we were to require an explicit instruction for upcasts like we have considered recently then you would have to inversely upcast every ref to optref -- or introduce all ref instructions in two variants, or introduce overloading. |
||
- iff `$t = struct (mut1 t1)^i (mut ti) (mut2 t2)*` | ||
- and `t = unpacked(ti)` | ||
|
||
* `struct.set <typeidx> <fieldidx>` writes field `$x` of a structure | ||
- `struct.set $t i : [(ref $t) ti] -> []` | ||
- iff `$t = struct (mut1 t1)^i (var ti) (mut2 t2)*` | ||
- and `t = unpacked(ti)` | ||
|
||
|
||
#### Arrays | ||
|
||
* `array.new <typeidx>` allocates an array of type `$t` and initialises its fields with a given value | ||
- `array.new $t : [t i32] -> [(ref $t)]` | ||
- iff `$t = array (mut t)` | ||
|
||
* `array.new_default <typeidx>` allocates an array of type `$t` and initialises its fields with the default value | ||
- `array.new_default $t : [i32] -> [(ref $t)]` | ||
- iff `$t = array (mut t)` | ||
- and `t` is defaultable | ||
|
||
* `array.get <typeidx>` reads an element from an array | ||
- `array.get $t : [(ref $t) i32] -> [t]` | ||
- iff `$t = array (mut t')` | ||
- and `t = unpacked(t')` | ||
|
||
* `array.set <typeidx>` writes an element to an array | ||
- `array.set $t : [(ref $t) i32 t] -> []` | ||
- iff `$t = array (var t')` | ||
- and `t = unpacked(t')` | ||
|
||
* `array.len <typeidx>` inquires the length of an array | ||
- `array.len $t : [(ref $t)] -> [i32]` | ||
- iff `$t = array (mut t)` | ||
|
||
|
||
#### Integer references | ||
|
||
* `int31ref.new` creates an `int31ref` from a 32 bit value, truncating high bit | ||
- `int31ref : [i32] -> [int31ref]` | ||
|
||
* `int31ref.get_u` extracts the value, zero-extending | ||
- `int31ref.get_u : [int31ref] -> [i32]` | ||
|
||
* `int31ref.get_s` extracts the value, sign-extending | ||
- `int31ref.get_s : [int31ref] -> [i32]` | ||
|
||
|
||
#### Casts | ||
|
||
* `ref.cast <reftype>` casts a value down to a given reference type | ||
- `ref.cast t : [t'] -> [t]` | ||
- iff `t <: t' <: anyref` | ||
- traps if the operand is not of type `t` at runtime | ||
|
||
Question: Have explicit runtime type representations as cast operands? | ||
|
||
|
||
#### Local Bindings | ||
|
||
* `let <valtype>* <blocktype> <instr>* end` locally binds operands to variables | ||
- `let t* bt instr* end : [t* t1*] -> [t2*]` | ||
- iff `bt = [t1*] -> [t2*]` | ||
- and `instr* : bt` under a context with `locals` extended by `t*` and `labels` extended by `[t2*]` | ||
|
||
|
||
## Binary Format | ||
|
||
TODO. | ||
|
||
|
||
## JS API | ||
|
||
Based on the JS type reflection proposal. | ||
|
||
### Type Representation | ||
|
||
* A `ValueType` can be described by an object of the form `{ref: DefType}` | ||
- `type ValueType = ... | {ref: DefType}` | ||
|
||
* A `ValueType` can be described by the string `eqref` | ||
- `type ValueType = ... | "eqref"` | ||
|
||
* A `DefType` is described by a kind and a definition | ||
- `type DefKind = "func" | "struct" | "array"` | ||
- `type DefType = {kind: DefKind, definition: FuncType | StructType | ArrayType}` | ||
|
||
* TODO: ...`StructType` and `ArrayType`... | ||
|
||
|
||
### Value Conversions | ||
|
||
#### Reference Types | ||
|
||
* Any function that is an instance of `WebAssembly.Function` with type `<functype>` is a subtype of `ref <functype>`. | ||
|
||
* TODO: ...rules for structure and array types. | ||
|
||
|
||
#### Equality Types | ||
|
||
* Any JS object (non-primitive value) or symbol or `null` can be passed as `eqref` to a Wasm function, stored in a global, or in a table. | ||
|
||
|
||
### Constructors | ||
|
||
#### `Global` | ||
|
||
* `TypeError` is produced if the `Global` constructor is invoked without a value argument but a type that is not defaultable. | ||
|
||
#### `Table` | ||
|
||
* The `Table` constructor gets an additional optional argument `init` that is used to initialise the table slots. It defaults to `null`. A `TypeError` is produced if the argument is omitted and the table's element type is not defaultable. | ||
|
||
#### `Type` | ||
|
||
TODO. | ||
|
||
|
||
## Questions | ||
|
||
* Have explicit runtime type representations? | ||
|
||
* Distinguish reference types that are castable (and therefore have RTTI)? | ||
|
||
* Provide a way to make data types non-eq, especially immutable ones? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While I am sure this has its uses, and is nice to have from a "why not" perspective, I feel this is such an implementation specific optimization (it will not apply to languages that guarantee 32-bits integers, and it may not be a great fit for certain engines/hosts either, if they e.g. need more than 1 bit) that I'd think we'd be better off not having it, or guarantee the full 32-bits.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What @stedolan said. Don't think of this as a source-level type, it's just a way to represent tagged pointers/integers, which is a mechanism that many languages need. 32 bit integer references can be implemented in user space using this primitive. See also my recent comment on the respective issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure what comment of @stedolan you are referring to, I don't see him refer to
int31ref
at all?I see "many languages" needing 32-bit, and 31-bit being rather specific to some language implementations. I don't see how wasm as a whole benefits from this bias, especially since there will be some cost to supporting it (there will be places where this bit needs to be tested if integer values are a possibility). The alternative, if
int32ref
is not practical, is to not have integers part ofreftype
at all, since at least that would give engines the benefit of being able to assume areftype
is always a pointer.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was referring to this comment.
To clarify, int31ref is primarily a primitive for making pointer tagging available, in the most general form that we can still guarantee to be directly implementable across all relevant platforms and engines. Such tagging is a mechanism that probably a majority of GC'ed language implementations rely on in some form (almost all dynamic languages, functional languages, logic languages). The GC proposal will effectively be useless to them without it. While certainly limited, the proposed type hopefully is general enough that a rather large subset of these use cases can be mapped to it directly.
Int32ref is in a different category. It is not applicable to the same use cases, since it can involve unpredictable and substantial hidden cost on some engines (allocation and branching in particular). For example, in V8 on 32 bit platforms ;). Also, an int32ref can already be implemented with the current proposal, in at least two ways: either (1) as a ref to an immutable struct with a single int32 field (which an engine should be able to optimise just like a proper int32ref), or (2) as the union of an int31ref and the aforementioned struct, doing the necessary boxing/unboxing of large values in user space (in case you don't trust the engine to do a good enough job).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about cost to languages that don't use it? The mere possibility of int31ref values being present may require engines to emit code that test the bit, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only at the point where they perform a downcast from
anyref
, which is an explicit operation and which in practice will require that check anyway, because existing engines tend to use tagging already (like V8).