-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Don't "inverted per-object mapping" representations for WeakMap/WeakSet still need GC integration? #1657
Comments
Agreed, the inverted representation is not a full implementation of an ephemeron table. It can still have circularity induced leaks. Somewhere in the old tc39 bugzilla data dump there is an issue where I explain this. Some TC39 members and/or engine implementors may still think the inverted approach is good enough. As a old-time GC guy, I consider a leaky GC to be a buggy GC, |
Thanks for catching this. It is a severe bug. It is likely due to an oversimplification of historical controversies, though I don't know the actual history of the text you quote. I have long advocated for the inverted mapping, so that the case that is by far more common could be more efficient: when the weakmap lifetime exceeds the key lifetime. But the obligation is symmetric. Whichever representation you choose, you must do full ephemeron collection for the case not optimized by that representation. Today, zero of the browser engines do the right thing. v8, FF, and JSC all use the non-inverted representation, making WeakMaps so terribly and needlessly expensive that too many people have learned to avoid them. But at least they are correct. Chakra does the inverted representation, making the common case efficient. However, they do not backstop it with ephemeron collection. Since the spec can only mandate conservative collection, this is not observably incorrect, but it does violate expectations. |
Membranes is the use case that motivated both weakmaps and proxies in the first place. Across a membrane, there are two gc issues:
The only way to solve both problems without violating encapsulation or introducing observable non-determinism is with full ephemeron collection. v8, FF, and JSC, using the non-inverted representation, collect the uncommon case: membrane revocation, cheaply. But they do succeed at collecting membrane crossing cycles by ephemeron collection. Chakra, by using the inverted representation, collects membrane crossing cycles cheaply. But because they don't backstop it with ephemeron collection, membrane revocation leaves behind an uncollectable unreachable subgraph. |
Here's the way I think about the underlying symmetry of the problem: Normal GC treats each meet in the reference graph as an OR. If A and B both point at C, then C is reachable if A is reachable OR if B is reachable. WeakMaps introduce AND meets into the reference graph. After const m = new WeakMap([[k, v]]); v is reachable if k is reachable AND if m is reachable. We can invert the representation because AND is commutative. We must still backstop the other case with ephemeron collection because AND is commutative. |
I have long advocated for the inverted mapping, so that the case that is by far more common
could be more efficient:
Mark, do you have something to back up the assumption that it is far more common? And more efficient? Because I’m not sure I’m convinced.
There are disadvantages to the inverted representation. For example, it requires internal mutation of objects and their representations that would otherwise be immutable or even frozen. That can lead to deoptimisations of unrelated code and other costly secondary effects, including more complex implementation.
|
I think mark is saying the common case is that a weakmap typically outlives its keys, which generally matches how people use them to store relationships and/or metadata for instances of a certain type of object that they own. The most common thing I use weakmaps used for is this: const map = new WeakMap();
class MyClass {
constructor() {
map.set(this, my secret metadata);
}
} I can't speak to the more efficient part. |
Oooh, just learning about this, quite interesting. So yeah, as far as I can tell, the cost of ephemeron collection is relative to the populations of the target objects (lots of sources weakreffing a small number of targets is cheaper than the opposite) and to the relative lifetimes of the source and target (short-lived object weakreffing long-lived objects is cheaper than the opposite). So the inverted case (the key weakrefs the weakmap) should be better on both of those axises - programs will usually have a much smaller # of weakmaps than keys, and keys will generally be shorter-lived than the weakmaps (essentially by definition of the core weakmap use-case). There might very well be second-order effects, as @rossberg states, that could overwhelm the cost savings of the inverted representation, but that's something that needs to be measured; we can't determine it outright. |
Thanks @tabatkins and @devsnek . @rossberg , what they said ;) A possible source of evidence is Chakra, which uses the inverted representation. |
Complete agreement with @erights here. Every time I've used a |
Actually, Chakra is able to collect WM values when the WM is collected. It does this by maintaining a list of weak references to the keys; it iterates this list on finalization. The main implementation tradeoff in Chakra is that WM usage can change the internal type of the key object, which can affect caches. In a different universe, we chose to create a parallel API to enable the programmer to explicitly select the inverted representation. Alas, we chose Regardless, the two relevant sentences in the spec (emphasized in the OP) should probably be dropped. |
In https://tc39.es/ecma262/#sec-weakmap-objects, there is a note that says:
(Emphasis mine.)
This would seem to imply that, when using an inverted per-object mapping of weak map instances to keys, there is no need for coordination with the garbage collector. And, if the only type of leak you're worried about is the type characterized here, I guess that's true, but … What if the keys remain accessible longer than the WeakMap/WeakSet objects? Won't we then, in the absence of coordination with the garbage collector, just end up leaking the WeakMap/WeakSet objects until all of their respective keys are collected?
I mean, maybe this is obvious to most readers of the spec, but it's the kind of thing I generally assume isn't actually safe to leave unsaid.
Then again, maybe there's some crazy olegian scheme for implementing weak maps behind the garbage collector's back using delimited continuations, zippers, and comonads, but I sort of doubt it.
The text was updated successfully, but these errors were encountered: