Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Incompatibilities with Pointer Tagging #118

Closed
RossTate opened this issue Aug 14, 2020 · 20 comments
Closed

Incompatibilities with Pointer Tagging #118

RossTate opened this issue Aug 14, 2020 · 20 comments

Comments

@RossTate
Copy link
Contributor

RossTate commented Aug 14, 2020

#114 discusses an extension for variants along with a number of pointer-tagging optimizations it would be intended to support. Unfortunately these optimizations are not particularly compatible with the current design.

  • TBD: how this integrates with RTTs

This is a major part of the question. Or more generally, how is downcasting from anyref supposed to work with variants? The answer that seems most consistent with the current MVP and with the suggested instructions for invariants is to have each variant case use an rtt that is an rtt.sub (or something analogous) of the rtt.canon for the variant type. That way there's a way to cast from anyref to a variant, which then provides access to br_on_case and variant.test to further cast to the respective cases.

  • Add a new form of deftype for variants:
    • deftype ::= ... | (variant (case $x <fieldtype>?)*)
    • along with nested-types, the fieldtype can itself be a struct
    • references to such a type can be formed as usual
    • cases with no <fieldtype> can typically be represented as an unboxed integer in an engine

The last note here is problematic. Variants are subtypes of anyref, and anyref already has an unboxed integer value: i31ref. So this means having another bit tag to distinguish unboxed variants from references.

But even then, the bigger problem is that (variant (case) (case anyref)) and (variant (case) (case funcref)) are distinct types and so would have distinct rtts (even though one would ideally be a subtype of the other). So the nullary case would need to be represented by a different unboxed integer for each such type. This problem gets worse if you consider (variant (case) (case (ref $imported_type))), because now each instance of the module using this type would have to use a different unboxed integer for the nullary case. (And I have no idea in general how to make this design for variants work in the presence of parametric polymorphism.)

It is left to the engine to pick an efficient representation for the required tags, and depending on the hardware's word size, the number of tags in a defined type, the presence of parameters to a given tag, and other design decisions in the engine, these tags could either be stored as bits in the pointer, in a shared per-type data structure (a.k.a. hidden class or shape), or in an explicit per-value slot within the heap object.

The above problems with nullary cases are even worse with even unary cases. Because everything is a subtype of anyref, the engine would need to have a cross-module-coordinated bit-tagging scheme in order to provide consistent/sound downcasting from anyref. Given the complexity of the problems above, likely no tagging would be done.


This incompatibility with pointer tagging seems fundamental to the use of a top type (anyref) with downcasting in the current MVP.

@rossberg
Copy link
Member

So this means having another bit tag to distinguish unboxed variants from references.

Ah, but only if you assume that variants may be unboxed. That was not the idea here. Variants would express tagged pointers, not integers. For the latter, you already have i31ref, and if you want a mixed representation then you can roll that in user space. There is no significant advantage in lowering that into the engine.

If somebody wanted to encode a variant with custom cast semantics on nullary cases, then obviously an RTT has to be stored somewhere. So they'd have to pay the boxing price for nullary cases. It's their choice, and may depend on the use case. The proposal only provides the tool box for expressing different choices. And you pay what you ask for.

This incompatibility with pointer tagging seems fundamental to the use of a top type (anyref) with downcasting in the current MVP.

I think there still is some deeper misunderstanding here. Specifically, anyref is a union type, directly reflecting the union of possible representations. As such, it does not (have to) guarantee disjointness or opacity per se. If you want to ensure that your type's representation is distinguishable from others then you can make it so, but it's not automatic or required. Remember this is an assembly language (more or less). Again, you pay what you ask for.

@RossTate
Copy link
Contributor Author

Ah, but only if you assume that variants may be unboxed. That was not the idea here. Variants would express tagged pointers, not integers.

In the text I quoted, you say "cases with no <fieldtype> can typically be represented as an unboxed integer in an engine". You seem to be contradicting yourself.

For the latter, you already have i31ref, and if you want a mixed representation then you can roll that in user space.

This would then require casting from anyref to your non-nullary variant type using an rtt cast.

I think there still is some deeper misunderstanding here. Specifically, anyref is a union type, directly reflecting the union of possible representations.

You don't seem to be realizing the consequences of this.

As such, it does not (have to) guarantee disjointness or opacity per se.

It needs to guarantee that downcasting behaves deterministically and is sound without trusting the application. That's why the first thing I had to do was derive how downcasting would need to work. The implication of that was that all distinct variant types would need to be disjoint, contrary to your claim here.

Remember this is an assembly language (more or less).

The current proposal permanently requires all reference values to be equipped with the JVM's casting model, which bakes in a high-level paradigm and is not particularly assembly-like.

Again, you pay what you ask for.

What I'm illustrating is that other languages will pay the cost of rtt casting even though they aren't asking for it. Any design with a downcastable top type fundamentally imposes the same casting mechanism on all values, fundamentally causing systems to pay for casting features even if they don't ask for it.


Another problem along this line that I forgot to bring attention to regards the following statement:

However, many language implementations use more elaborate tagging schemes in one form or the other. For example, they want [...] to store additional information in pointer values without requiring extra space (e.g., Lisp or Prolog often introduce a special tag bit for cons cells).

Because everything is a subtype of anyref and downcasting is done with an rtt, the current MVP requires that a reference value's rtt is recoverable without any static type information. So if you want to do this space-saving optimization, that means the rtt must be recoverable solely from the pointer's bit flags (since it's not stored in the heap). Since even tiny differences in variant types cause them to have a distinct rtt, there can be an arbitrary number of distinct variants with a cons-like cell, and a few bit flags in a pointer cannot encode all that variation.

This means that engines are either going to not perform this space-saving optimization at all, or they're going to perform it on a first-come first-serve basis. The latter case means that a module's space performance can vary widely just by being compiled later, after all the bits have been reserved for other modules' purposes.

@jakobkummerow
Copy link
Contributor

Speaking from just one engine's perspective: I think it is extremely unlikely that we will ever use any bits in a pointer for any module-specific special-casing. The marking phase of GC is expensive and hence performance-sensitive, and one consequence of that is that we want the heap layout to be as regular as possible. Being a highly complex system already, we also have a strong dislike for avoidable implementation complexity. One bit to distinguish unboxed scalars from regular pointers is fine, anything else will (in my personal prediction) very likely remain hypothetical.

(Of course, if WasmGC ends up specifying module-controlled tagging schemes, we'll have to implement that observable behavior. But we may well implement it via an indirection, e.g. storing the tag as a plain integer somewhere in the "hidden class" descriptor, possibly even more than one hop away from the original pointer.)

@RossTate
Copy link
Contributor Author

@jakobkummerow Yes, and for those reasons I thought @rossberg had consciously decided to never support such functionality. But now the plans in #114 contradict those suspicions. So now I am unsure if @rossberg is unaware of the limitations of his proposal and possibly advocating for it because he doesn't realize it has these problems. Regardless of his reasons, it's also important that the CG is making informed decisions. It's problematic if members of the CG decide to advance a proposal under the expectation that they'll be able to later add functionality that is impossible to add. It's another story if the CG discusses this limitation and consciously decides that they are willing to never support said functionality. So it's important for the documentation in #114 to accurately convey what extensions are and are not possible, and it's important for us to discuss and consciously decide to never support the functionality that is incompatible with the current proposal.

@tlively
Copy link
Member

tlively commented Aug 19, 2020

@RossTate I would really like to understand the issues at play here, but I am finding it difficult.

I think there still is some deeper misunderstanding here. Specifically, anyref is a union type, directly reflecting the union of possible representations.

You don't seem to be realizing the consequences of this.

Speaking for myself, I definitely don't realize the consequences of this, so it would be more helpful for me if you tried to explain them in a different way. There are many different assumptions among folks in this conversation that have already been brought up (e.g. whether variants may be unboxed, whether engines will want to implement various optimizations) and I suspect there are many more. This conversation will be most productive if we can collaborate to figure out what all our respective assumptions are and document them. That requires assuming that disagreements are due to different assumptions rather than a lack of understanding.

That's why the first thing I had to do was derive how downcasting would need to work. The implication of that was that all distinct variant types would need to be disjoint, contrary to your claim here.

It would be helpful to me as a non-expert in these topics if you could more explicitly outline how you arrived at these conclusions since I don't have the expertise to see the connections myself.


Stepping back, it seems like the current GC proposal was crafted with the boots-on-the-ground reality of what web engines will and will not want to implement in mind, whereas @RossTate is thinking about the needs of engines that would be willing to take on more complexity in return for higher performance. I think there will be engines all along this spectrum in the WebAssembly ecosystem, so I personally think we should make it a goal for the GC proposal to allow engines to make different tradeoffs on this axis. If others feel that this proposal should be tailored to minimize unused complexity for Web engines, we should talk about that on a separate issue and come to a common understanding.

@RossTate
Copy link
Contributor Author

@tlively I'm happy to delve deeper into the reasoning; my terseness was an effort to keep writing brief, and I apologize for the unintended effect of it be inaccessible. Hopefully that will help identify different assumptions as you say.

The answer that seems most consistent with the current MVP and with the suggested instructions for invariants is to have each variant case use an rtt that is an rtt.sub (or something analogous) of the rtt.canon for the variant type.

Let me start with explaining how I came to this conclusion. There are three reasons:

  1. The current MVP states "Each reference value has an associated runtime type"
  2. All the instructions in the Post-MVP regarding variants explicitly refer to the entire variant type (i.e. the collection of cases)
  3. Applications will want an efficient way to cast from anyref to their variant type, especially if they also want to make use of i31ref and so need to use anyref for their "base" type. Here we see that a translation of OCaml to the current MVP would likely make use of a variant type with 9 cases. This means casting from anyref by enumerating each of the cases would likely be impractical, and as such there needs to be a combined rtt that you can cast to and which each of the cases then subclasses from.

The implication of that was that all distinct variant types would need to be disjoint, contrary to your claim here.

Now lets consider two variant types:

$t1 = (variant (case funcref) (case i32))
$t2 = (variant (case funcref) (case f64))

then consider the following two values:

$v1 = (variant.new $t1 0 (ref.func $f))
$v2 = (variant.new $t2 0 (ref.func $f))

The values $v1 and $v2 might look similar because they're both a funcref in the 0th case of a variant, and so you might expect that they can be represented simply with a funcref and a bit flag in the pointer. But in fact they are different because they have different RTTs. In particular $v1 can be cast from anyref to $t1 by using the canonical RTT for $t1, whereas $v2 can be cast from anyref to $t2 by using the canonical RTT for $t2, and neither can be cast to the other variant type. If you were to use just a funcref with a bit flag, you would not be able to tell which RTT the value has and so which one of the two variant types the value can be cast to.

Does that better illustrate the reasoning? And does it illustrate how anyref and downcasting are key parts of the problem?

@rossberg
Copy link
Member

@RossTate:

This means casting from anyref by enumerating each of the cases would likely be impractical

Before you continue asserting stuff, can you explain why an OCaml implementation would ever need to do that?

First of all, RTTs are intended for checking downcasts against known target representations, not for switching on representations.

Second, OCaml is typed, so usually never needs to switch on all possible representation. The only exception I know would be a primitive like polymorphic equality. And that would use the i32 header for the actual switch. Overall, that requires two downcasts per node (one to block, to access the header, and one to the actual type). Not ideal, but also not terrible, especially considering that using polymorphic equality in performance relevant code is usually advised against anyway.

The other place making a runtime case distinction on representations is polymorphic array ops, but they only need to distinguish between two possible representation choices.

@RossTate
Copy link
Contributor Author

Before you continue asserting stuff, can you explain why an OCaml implementation would ever need to do that?

Structural equality/comparison/hashing is one example that has been a running theme through many of the discussions on challenges of implementing OCaml efficiently and motivations for variants, including in the discussion I linked to. You name this in your comment and then explicitly discard the concern without checking if the people currently trying to support OCaml on WebAssembly agree.

And that would use the i32 header for the actual switch.

The i32 header is a patch for not having variants. Only 1 of the 9 cases has any use for it besides saying which of the 9 cases it belongs to. So the fix you're suggesting involves making 8 of 9 cases unnecessarily larger just to make up for not having good support for variants.

First of all, RTTs are intended for checking downcasts against known target representations, not for switching on representations.

Yes, and the point of this issue is to illustrate that other use cases are not well served by RTTs and that the use of anyref causes problems for adding other downcasting mechanisms.

I used OCaml as an example because other people had worked through how it should be represented, and these issues came up. Even if we were to eliminate/ignore these issues for OCaml, there are other languages that would still encounter the same problems. Note that this is somewhat related to the bug identified in #110 pointing out that the current design for RTTs for functions makes it impossible to cast from anyref to funcref due to typed functions not having a common inherited RTT.

@rossberg
Copy link
Member

rossberg commented Aug 20, 2020

So let me double-check whether you agree that

  1. in OCaml this is solely relevant for implementing a couple of (fairly rarely used) runtime primitives, and not for its actual compilation,
  2. the extra overhead doesn't need to be more than one additional cast in (each branch of) the implementation of these primitives,
  3. this is perfectly in line with the stated goals of the GC MVP?

The i32 header is a patch for not having variants.

The header is what OCaml's data representation is using today and which would naturally map over to a representation in Wasm by default. AFAICS, the implementation would not win anything right now by changing it just for Wasm, other than requiring bigger changes to the compiler pipeline.

The goal for Wasm is to allow porting existing implementations to the extent possible, not force rewriting them.

Yes, and the point of this issue is to illustrate that other use cases are not well served by RTTs

Sure, but (1) do they need to in the MVP?, and (2) what prevents them from being supported by later extensions if there's need?

@RossTate
Copy link
Contributor Author

  1. this is perfectly in line with the stated goals of the GC MVP?
    ...
    Sure, but (1) do they need to in the MVP?, and (2) what prevents them from being supported by later extensions if there's need?

The issue is about (2). It does not claim that pointer tagging needs to be in the MVP. It is rejecting the claims made in the Post-MVP and illustrating why the feature cannot be added by later extensions.

The header is what OCaml's data representation is using today and which would naturally map over to a representation in Wasm by default. ... The goal for Wasm is to allow porting existing implementations to the extent possible, not force rewriting them.

The header word in OCaml is its version of an RTT. If the OCaml runtime sees that this header word is the magic number corresponding to a double array, then it knows that the reference is a double array. But that's not the case in the current/post MVP. Even if the application knows that invariant, the WebAssembly engine does not, and so the application has to insert an rtt cast. Since every value needs to have an RTT, that means that, for an OCaml double array, this header word provides no additional information than the rtt does—it is redundant. That redundancy holds for 8 of the 9 variants.

This redundancy means that when compiling OCaml to the current/post MVP it no longer makes sense to have each value have such an i32 field. That is, the use of RTTs prompts existing implementations to be redesigned specifically for WebAssembly.

There are existing design techniques that would be able to let OCaml use its header word as it does currently, i.e. they would be able to recognize that if the header word is the magic constant for a double array then the reference is a double array, but unfortunately they too are incompatible with the current MVP.

@rossberg
Copy link
Member

I don't follow what you are arguing. That an OCaml port should be able to remove the field? That may save space, but may require cross-cutting changes to the compiler, so should not be required. (It's also worth noting that the information in the header is exposed to user code via the Obj module; that may not be worth supporting on a Wasm port, though.)

There are existing design techniques that would be able to let OCaml use its header word as it does currently, i.e. they would be able to recognize that if the header word is the magic constant for a double array then the reference is a double array, but unfortunately they too are incompatible with the current MVP.

Now you seem to be arguing that an OCaml port should be able to keep the header field, but without introducing redundancy. That's fundamentally impossible, AFAICS. The header word has multiple bit fields. The biggest one is the object's size. Given that the Wasm GC will not be able to understand that, it will inevitably have its own representation of that information, in the RTT or elsewhere. The only way to avoid the redundancy is by removing the header word in OCaml's data mapping and have it use Wasm's built-in size, see above.

Either way, I see no barrier to adding means to switch on RTTs post-MVP. Care to substantiate your claim?

@tlively
Copy link
Member

tlively commented Aug 24, 2020

I don't follow what you are arguing. That an OCaml port should be able to remove the field? That may save space, but may require cross-cutting changes to the compiler, so should not be required.

@RossTate is arguing that this should be allowed, not required.

The header word has multiple bit fields. The biggest one is the object's size. Given that the Wasm GC will not be able to understand that...

IIUC, @RossTate would prefer not to have to make this assumption, and the fact that the current proposal does make this assumption is the root of the issue he is raising here.

(@RossTate, my apologies if I'm misrepresenting your position)

@rossberg
Copy link
Member

IIUC, @RossTate would prefer not to have to make this assumption

I have a hard time believing that that's truly what @RossTate is suggesting. It hardly seems realistic to expect that a Wasm GC would be able to read off object sizes from arbitrary bit fields in user space instead of using its own. And that that could result in an efficient GC. Not to mention a safe and secure one.

@sabine
Copy link

sabine commented Aug 27, 2020

Criteria to evaluate performance issues:

  1. does it block the language from compiling to WASM GC by repurposing their current compiler implementation?
  2. does it make the resulting code much slower than an ideal implementation?
  3. can it be resolved by modifying the existing compiler, without going into extensive research with uncertain outcome?

Answers for "header-fields of OCaml being redundant with RTTs and WASM GC structs" look to me to be pretty clear:

  1. No 2. No 3. Likely.

It's perfectly fine that the heap struct's header field becomes obsolete when compiling OCaml to WASM+GC. Other languages experience similar, resolvable redundancy when they port compiler backends to WASM.

@RossTate
Copy link
Contributor Author

Thanks @sabine. That's a great rephrasing of the argument I was making!

@tlively
Copy link
Member

tlively commented Aug 27, 2020

Perhaps I'm misunderstanding, but my reading of @sabine's comment is that they would be ok leaving in the redundant header field, but @RossTate, you're arguing that the redundant header should be removed, so you're not arguing the same thing as far as I can tell.

@RossTate
Copy link
Contributor Author

Oh, I read "becoming obsolete" as meaning no longer necessary, and I read "3. Likely" as meaning it would be easy to adapt the compiler to the change. But I do see how it can be read both ways. @sabine, would you mind clarifying what you meant?

@sabine
Copy link

sabine commented Aug 31, 2020

@RossTate I believe that, in the interest of getting a reasonable WASM GC MVP out of the door, it is very important to focus on resolving performance issues that trigger "Yes" answers to the first two criteria. Thus, we're fine with keeping the redundant header field in our WASM compiler backend MVP.

As time progresses, we optimize the compiler implementation based on profiling results. We believe we can get rid of the header field in the WASM backend in the sense that there is no fundamental reason why we can't.

It is likely that removal of the redundant header field is not a low-hanging fruit in terms of cost vs. reward right now. In the very long term, it is possible that we will go through the effort of making the heap model of the middle-end of the compiler more abstract to accommodate the different representation of the heap struct's header on different backends.

I expect that this process of discovering abstractions for the middle end of the compiler will be similar for most, if not all, existing languages, and that this is, in the long run, a healthy development.

@RossTate
Copy link
Contributor Author

Ah, thanks for the nuanced clarification!

@tlively
Copy link
Member

tlively commented Nov 1, 2022

Closing this since it is about post-MVP features and not actionable for this particular proposal. Of course if we work on variant types in post-MVP, we will have to make sure they perform well and fit into the rest of the language, but the discussion will be more productive with a concrete variants proposal.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants