-
Notifications
You must be signed in to change notification settings - Fork 73
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Dynamic Dispatch #132
Comments
I find the idea of RTT dictionaries interesting, but I am confused about some of the applications. One thing in particular is I am not sure why this would be more efficient than the current MVP. I understand the issue of having an extra word for storing the v-table, but that can be resolved by being able to put information into the RTT (at statically-determinable offsets). So I'm more focusing on time efficiency rather than space efficiency. For class methods, yes, in the current MVP you need to cast the For interface methods, the approach described in the Call Tags proposal (and in the referenced paper) enables the hash-index of a method and the collisions within a class to be statically determined, again enabling what is likely a more efficient implementation strategy than this proposal. So I'm doubtful that this approach would provide better time performance, at least for dynamic dispatch. It seems to me that the opportunity for improvement lies more in providing a mechanism for placing data within RTTs (at statically determinable offsets) to improve space performance. My impression was that design of such a feature was deliberately deferred to after the MVP. Then again, it's not mentioned in the Post-MVP, but that might have just been an oversight or an effort to focus the Post-MVP document on items for which the design strategy was unclear. Anyways, I'm happy to share thoughts there if you're interested. |
Thank you for the interesting points! I agree that for normal virtual dispatch, vtables are probably faster, and embedding them somehow into the RTT would also save the header word without the need for a more complex dispatch mechanism. By making this RTT storage special, we can probably even save the this pointer cast, similar to RTT dictionaries. This doesn't cover interface dispatch though. Comparing RTT dictionaries with the Call Tags proposal, I would say both approaches are essentially hash-tables, just one uses a table in the RTT dictionary and RTTs are keys, while the Call Tags proposal uses a table in the vtable/RTT and the call tags are keys, with the special feature of using jitted code to select the right entry from a hash bucket. Call tags and RTT dictionaries correspond to each other here, so we are comparing two ways to index into the same two-dimensional space. That being said, I agree that there are clear advantages to the second approach: Offsets become static since call sites usually know their call tag / RTT dictionary, which avoids an offset computation as part of the lookup. And the number of methods per class has less variance than the number of overloads per method, as your Java toString example illustrates. Interestingly, I think that RTT dictionaries as described in my doc also allow for this second implementation strategy, without any change to the specification (I wasn't aware of that when writing the doc). What I describe now uses roughly the call tags way to index things, but doesn't use code to select the right function, this is still done in the normal hash-table style: The RTT dictionary just becomes a globally unique identifier instead of a data structure, with the actual data being stored inside the RTT. Every RTT contains a fixed-size table to store RTT dictionary entries, plus an optional second-level storage if the fixed-size table becomes too small. The lookup sequence for an object o and an RTT dictionary d is then:
None of these lines contains more than one load (if RTTs contain their table inline). |
That does seem like an improvement. But I am still concerned that there will be frequent collisions, both due to lots of elements (classes tend to have more virtual methods than interface methods) and due to uncoordinated index assignment (with interface methods, you can typically pick a different index for each method of an interface to avoid them colliding with each other). I worry that this would still be substantial overhead for something like field access (though more reasonable for calls). So I think the improvement is that the revision puts the data in the rtt, but there is still room for improvement by making it possible to ensure that a particular datum is always at a specific location within the rtt. |
True, for virtual dispatch, hashing is clearly less than optimal. With a slight change in specification, one could allow the user to specify a key when creating an RTT dictionary, which is then used to map to offsets, the idea being that for virtual methods, the user can specify the obvious consecutive offsets manually. For interface methods, the user could either resort to random values or pick carefully selected non-colliding offsets. |
We have various ideas to improve dyamic dispatch via custom meta-objects or more direct support for vtables in the post-MVP doc, and PRs would be welcome adding more. Since this issue is not actionable for the MVP, I'll close this issue. |
It's still an open question how dynamic dispatch, in particular virtual and interface method calls, should work post-MVP.
To contribute to the discussion about this, I wrote up a (sketch of a) proposal containing two ideas:
Please have a look and feel free to leave comments in the doc also:
https://docs.google.com/document/d/1v6-BP1mN04cTRGMnDV2t3_CeSmxhtBwgN5UhXQGwHXY/edit
I'm looking forward to comments and alternative proposals.
The text was updated successfully, but these errors were encountered: