-
Notifications
You must be signed in to change notification settings - Fork 73
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
GC story for arrays? #151
Comments
Potential solution: Keep arrays as is, and introduce a |
Pretty much no memory managers underlying GCs or otherwise can cheaply grow buffers in most cases. Almost all languages that have a resizable array type (vector) implement it by reallocating and copying when the reserved size is exceeded. Luckily, most of those would choose to grow the backing buffer by 1.5x to 2x usually, not on every single new element as you assume. If resizing is indeed such an expensive operation with a variety of choices to be made in terms of growing rate and reserving ahead of time, it is indeed the right choice for something like Wasm GC to leave these implementation choices to the language, rather than to do resizing itself. A vector also implies indirection (since buffer reallocation must not be observable by any references to the vector), whereas a fixed array can be represented directly with a single buffer. Again, the choice of using indirection should be left to the language. Functional languages for example could implement resizing without an indirection since other references can't observe the new element anyway). Java, for example, has no built-in resizable arrays, and each resizable container implements its own resizing logic on top of them. |
Yeah, that's exactly what's not possible currently if an
Not assuming that btw.
And I've implemented several of these, no worries. Mine, based on TLSF, can cheaply grow if the adjacent right free block is large enough.
Right, that's why AssemblyScript has both |
My understanding is that this isn't a problem because producers would simply choose to use nullable reference types to implement vectors, even if the element types were non-nullable in the source language. I can't think of a reason a producer would need to use non-nullable references in its vector implementation, although perhaps there is some reason I haven't thought of. That doesn't address the interoperability concern, though. |
Goes a little into the direction of Cast free arrays, since if the type is nullable but isn't in the source language, you'll end up with superfluous casts again that the other issue is trying to get rid of. cc @RossTate |
Yes, welcome to 50 years of As for arrays needing to contain nulls.. there's a ton of possible data types and algorithms that typically build on top of arrays that may need to have empty elements, or delayed initialization etc. and they're all going to run into this issue, a |
Looks like you are an experienced engineer. How would you tackle the problems of superfluous casts / non-interoperability when doing nothing isn't an option for AS? |
The initial GC design can't solve all things for all languages and still be simple enough such that we can make forward progress. There have been plenty of ideas on how casts can be elided (see @RossTate's alternative GC proposal in particular), but adopting these all, and making sure no language is left with a bad use case, is going to delay things indefinitely. The current GC proposal is from about ~3 years ago, and we're still discussing fundamentals. Best we can hope for at this point is not make decisions that do irreversible damage. As for interop, again, there is only one real solution, and that is Interface Types. Adding special purpose types for the benefit of one language's intended implementation is all incredibly premature.
The GC proposal is trying to serve a LOT of languages, all of which have use cases that don't map 1:1 onto the current proposal. I am not sure why you keep thinking AS should have special status within that group. |
Well, AS isn't really that special tbh. It's much like Java and C# for instance. Furthermore, I guess it is the closest to JS with static types one can get, so solving issues it encounters helps on multiple fronts like for example interop with JS. As such
I think that not talking about these things is already doing irreversible damage. I guess that's exactly what you do not want to happen?
There are certain related and unrelated issues with interface types that I am trying to solve, as you probably know. I agree of course that interface types may be able to help here. How would you design a solution for this particular problem based on interface types? Trim the array at the boundary (alloc+copy->garbage)?
Again not thinking that btw. If you have grievances with me or AssemblyScript, I can offer to talk about the reasons for it in private? Or perhaps use the meeting later today? |
Copy is certainly the default for IT. Not sure if trimming would be required.. dealing with
Then I don't think you see how strongly a statement like |
It is indeed not an option for AS to have non-interoperable arrays with JS. Defeats the whole point of why AS exists / what people want to do with it. (Perhaps being direct like this is a cultural difference? Like, I typically state the facts instead of beating around the bush more than it helps. Some people value that. And to be direct once more, I don't think that you are in a position to cast the first stone here?) |
To repeat what I said in an offline meeting earlier: I'd expect the pre-import idea to work here too, similar to strings. Not exactly the same, in that I'd expect a pre-imported string to be all the strings a program needs, while a pre-imported array would be interesting for JS interop, but not as optimized as a wasm array - which is more like a JS typed array. So both types (pre-imported JS array, and wasm array) would be of use in that case. The nice thing with the pre-import idea is that once we have working code with it, we can actually explore this space, and learn a lot. That can suggest further spec additions, that are not feasible to decide on atm. |
It's worth pointing out that in many languages, particularly Java and C#, arrays are not growable. In order to protect themselves from their arrays' lengths being modified by outside code breaking language invariants, they'd have to wrap their internal arrays so they don't escape. |
It's also worth pointing out that arrays in JavaScript are pretty complicated. One can delete elements from the middle array (where subsequent accesses poke through to the prototype chain), and they can be grown or shrunk from either end. Wasm is not going to adopt such capabilities for arrays by default, so you need to proxy arrays that escape to JavaScript if you want to interoperate properly. |
Hmm, good points. So the use case I have in mind is Wasm code calling Web APIs that expect one or more (object) arrays as arguments (also vice-versa). With a proxy, this can work I guess as long as the API expects an array-like object, or would it fail more often than it'd work? |
It's a difficult situation and I don't envy you :-) Arrays in JavaScript are complicated enough that it might be easier to use JS arrays directly and "proxy" them to Wasm--i.e. import the |
@titzer wrote:
With both strings and arrays, one notable problem is there is no importable host function which will cleanly perform the following operations:
You would have to write JavaScript host functions wrapping non-function operations like the "+" operator for string concatenation, and while you could import JavaScript's Have these things been thought out and explicitly written out with example code yet? |
The design of arrays and the broad picture of JS interop are settled for the MVP, so I'll go ahead and close this issue. PRs adding ideas for future extensions to the post-MVP document would be welcome, though. |
I get that we want to support many languages, but JavaScript is the primary language of the Web (even in the non-JavaScript JavaScript-GC'd DOM APIs written in C++), and we're talking about WebAssembly here. Even if we eliminate JavaScript by allowing non-JS languages to be used via WebAssembly, practically all DOM APIs are implemented as JavaScript APIs with the JS engine's GC, and this includes any APIs that accept or return JavaScript strings and JavaScript arrays. I doubt we are going to be changing all of these DOM APIs to not use JavaScript strings, arrays, and reliance on JavaScript GC. If we do, it will be a huge effort. By not having an interoperable-with-web mentality up front, it seems we're preferring some languages (not JavaScript) in conflict with how the web works, and possibly making web interop more difficult. It seems we can (and should) design WebAssembly with the web in mind first and foremost, while still allowing options that can work for all languages (f.e. a language that wants to handle array growth doesn't use that feature). How do we best support the web, and languages designed for web that compile JavaScript/TypeScript to WebAssembly who wish to have high interoperability with all web and DOM features without inefficient boundaries?Can these facilities for web interop be added later on top of what we now have and plan to have? Does the direction we go in make future web-interoperable interfaces in Wasm more difficult to add? |
This comment was marked as resolved.
This comment was marked as resolved.
Admittedly, many web developers don't know what they need at a lower level, but we must not abuse this fact for purposes that don't align with the existing web. |
As briefly mentioned in #145, arrays have yet unaddressed overhead / interoperability problems:
push
ing to it. Instead it has to grow the array by one element at a time where the element type is not defaultable, producing copying overhead and likely garbage collector pressure over something that is otherwise the ABC of compiler development.When remaining unaddressed, these limitations will effectively prevent AssemblyScript for example from adopting Wasm GC early, losing one potentially valuable source for informing the process. I hence suggest to evaluate solutions in the MVP.
The text was updated successfully, but these errors were encountered: