-
Notifications
You must be signed in to change notification settings - Fork 10.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RFC: Optimize AbstractFuture.setFuture #3370
base: master
Are you sure you want to change the base?
Conversation
adafd98
to
dd43c90
Compare
@cpovirk any thoughts on the value of this kind of change? btw the build failure is unrelated and affecting all branches - caused by travis moving to jdk11.0.2 which appears to have an incompatibility with maven-javadoc-plugin (at least until next version 3.1.0 is released). |
Thanks. @lukesandberg, author of most of |
Optimize handling of chained futures w.r.t. both CPU and allocations; allow GC of intermediate futures
AbstractFutureBenchmarks "old" impl changed to match the "current" impl (pre setFuture optimizations)
@lukesandberg @cpovirk curious if you had a chance to take a look at this? |
Not yet, sorry :( Various non-Guava stuff has been eating my time even more than usual lately. |
@cpovirk @lukesandberg I realize that the size of the change is likely off-putting, but I wonder if you could at least read just the overview, results and possibly "behaviour differences" sections above to determine if this might even be of interest. If the answer is no then there's no need to look at the code! |
Chained/recursive async
ListenableFuture
transformations (e.g. viaFutures.transformAsync
/catchingAsync
/scheduleAsync
/... orSettableFuture.setFuture
) currently cause indefinite growth of live objects, with intermediate futures not eligible for collection until the entire chain/tree is completed, even if/when already set asynchronously. This PR is a proof-of-concept experiment to rework thesetFuture
mechanics so that such chains are collapsed on the fly where possible, and makes some related optimizations along the way. It needs some further refinement but should be fully functional.High-level overview
The approach is motivated by the observation that chains (or more generally trees) of
AbstractFuture
s, previously linked viasetFuture
calls, form a disjoint group whose members share a common fate. Although a tree structure is implied by the order in which the futures were combined, only identification of the root has any significance w.r.t. future program state. In other words the group can be considered as a set with no structure apart from the existence of a unique "unset" member. And since listeners and waiters added to any of the futures in the group are interested only in the common outcome, there's no need for them to remain tied to their "local" future instance.If the future overrides
afterDone()
then it must remain reachable, i.e. there is no hope of unpinning it. Unfortunately this applies to most of the internalAbstractFuture
subclasses used to implement theFutures
methods, but it turns out they can be refactored to use a (newly introduced) weaker hook namedafterEarlyCancellation()
. This gets called if and only if the future is cancelled prior to being set asynchronously.AbstractFuture
s which make use ofsetFuture()
but don't overrideafterDone()
are considered to be "passively completable" - they can be manipulated such that from the outside they appear to behave/complete as normal, but no callback references need to be retained to them once they are set asynchronously. This is achieved by relocating/redirecting their Listeners and Waiters elsewhere.Passive completability is currently signalled by overriding a new protected
requiresAfterDoneCallback()
method to return false (it returns true by default). There may be a better approach like a marker interface or new abstractAbstractFuture
subclass or even automatic determination via reflection, though the cost of the latter may be prohibitive.Internal changes
The private
SetFuture
class currently serves as an immutable link to the next future, pointing to both the source and target, and is also registered as a listener on the target. This has been changed to serve a different purpose, now a mutable, shared placeholder set as the value for multipleAbstractFuture
s belonging to the same async-set group. It still has two fields - a fixed (final
) pointer to an arbitrary "delegate" member of the group (chosen when first established), and avolatile
pointer to the group's current unset future*.A new future "joins" the group when an existing member is successfully passed to its
setFuture
method or is passed itself as the argument to thesetFuture
method of the group's current unset target. In either case the group's existingSetFuture
instance becomes the value of the newly joinedAbstractFuture
. If the new future is passively completable, its listener stack is moved to the group's delegate future (replaced by a newMOVED
Listener sentinel), which is guaranteed to be explicitly completed. Waiters in this scenario are released, will recheck their owning future's value finding it is now aSetFuture
, and then just block onSF.delegate.get()
.A new private
Pending
class represents the same unset state asnull
but is set as the value of the group's single unset (innermost) member and holds a pointer to that group'sSetFuture
object. WhensetFuture
is called on this unset future thePending
value is unwrapped, replaced with theSetFuture
object it contains. If trusted, the new future's value is set to the samePending
object (it moves down the chain). The whole operation involves no new allocation if the futures are passively completable.A listener is typically no longer created when passively completable futures are set asynchronously, however the design requires that the assigned "delegate" future for the group is explicitly completed whether or not it is passively completable. This is easily arranged since when the innermost future is finally (non-asynchronously) set, this delegate can be found immediately via the
Pending
object which the terminal value replaces. If the innermost future is async-set to a non-trusted future, thePending
object is discarded and the containedSetFuture
is added as a listener to the target (it continues to implementRunnable
for this reason).Before:
After:
Completion "callbacks" are still required for non-passively completable async-set futures (apart from those which happen to be a
SetFuture
delegate). A new privateCompleter
class is used for this purpose - a simpleRunnable
pointing back to the future in question.As well as benefiting allocations and object retention, this approach reduces the total number of volatile writes and CAS operations required to process a given chain of futures.
(*) For clarity the above explanation ignores the special case of joining two existing async-set groups that occurs if the unset member of a non-singleton async-set group is set to a future already belonging to a different (necessarily disjoint) non-singleton async-set group. Logically the outcome is just a single async-set group formed by their union, whose unset member is that of the second one. But special handling is required in the implementation since we now have two
SetFuture
s, neither of which can be discarded: one of them is "retired" and added as a listener to the other, which means that itstarget
value will no longer be updated when the aggregate async-set group's target changes. However the future which the target is set to will either be updated or (if there are futher group compositions) lead transitively via a chain ofSetFuture
s to one which is updated. Thus the new impl can still result in chains of async futures but they should be much rarer and shorter, since each link in such a chain must have resulted from asetFuture
composition of non-singleton async-set groups of futures.Futures.addCallback()
A small modification is also included to avoid listeners added via
Futures.addCallback()
continuing to hold a ref to their target future once it has been async-set - theFutures$CallbackListener.future
field is changed to be non-final (safe since it's published via CAS to the listeners stack), and is updated to the group's delegate future at the time the listener is redirected or relocated byAbstractFuture
.Results
The following exaggerated example takes more than 2 minutes to fail with an
OutOfMemoryError
after ~14 million iterations when run with the current version and-Xms1g -Xmx1g
. With this PR and-Xms32m -Xmx32m
it completes 100 million iterations successfully in less than 4 seconds:There is negligible if any change to the existing caliper benchmark numbers (both runtime and allocations), but a new benchmark for testing performance of
setFuture
and subsequent completion based on the above snippet shows considerable improvement:Behaviour differences
toString()
output no longer includes intermediate async-set futures. A justification could be that the omitted information is merely "historical" and doesn't pertain to the future's current state or later outcome.cancel(true)
on them will still cancel the future(s) to which their fate is now bound (and any whose fate depends on them), but only the innermost (unset) future will have itswasInterrupted
flag set andinterruptTask()
method invoked (if it's anAbstractFuture
). This behaviour aligns nicely with the existingFutures
internal impls and feels reasonable given that it arguably doesn't make sense to "interrupt" a future which is already set asynchronously since it's now essentially acting as a proxy for the future it was async-set to. It's also the case that although now corrected to be consistent, the current interruption handling is necessarily somewhat arbitrary because there's no way to determine the interrupted state of a third-party cancelled future. Java'sCompletableFuture
s ignoremayInterruptIfRunning
entirely.Further simplifications considered
value
field toUNSET
, defined as a static constantnew Pending(null)
and havenull
represent itself. This elminiates theNULL
constant and simplifies various logic (checking for unset can be justvalue instanceof Pending
rather thanvalue == null | value instanceof Pending
), but negatively impacted construction speed, at least on x86_64 linux as measured by the caliper benchmarks.SetFuture
class by the delegate future itself, with the delegate's value set to the group's current targetListenableFuture
. Use a simple wrapper to differentiate terminal values which areListenableFuture
s themselves (assumed to be less common). This turned out to also negatively impact general performance due to the fact thatinstanceof SomeInterfaceWithManyImpls
(e.g.ListenableFuture
) is greater than an order of magitude more expensive thaninstanceof SomeFinalClass
(e.g.SetFuture
).afterEarlyCancellation
hook, add one calledafterSet
which is called once, after the future is first set (either finally or async). This seems clearer/cleaner API-wise but at least in the case of the internal futures impls, some of the required logic is specific to the early cancellation case (for example this is the only one requiring possible propagation to the delegate) so fewer conditional checks end up being needed.I have not tried this yet but expect further optimization is possible to
Listener
handling, in particular changingListener
to an interface or abstract class which the "internal" callbacks can extend directly to eliminate some allocation and indirection, and bypass the publicaddListener
method when moving listeners betweenTrustedFuture
s.