-
Notifications
You must be signed in to change notification settings - Fork 25k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Consolidate DelayableWriteable #55932
Consolidate DelayableWriteable #55932
Conversation
Pinging @elastic/es-search (:Search/Search) |
*/ | ||
public abstract class DelayableWriteable<T extends Writeable> implements Supplier<T>, Writeable { | ||
public abstract class DelayableWriteable<T extends Writeable> implements Writeable { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was wondering, is it just DelayedWriteable or DelayedReadWriteable ? Because it's not that you can choose to delay its reading, it's the only way that it works?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess so. I figured Writeable
was the name for things that we threw across the wire and calling it Delayed
was enough to imply that it delayed reading because that was sort of the only thing you could delay.
At this point, given how we use it, it might be more correct to call it PotentiallySerializedWriteable
. But that is kind of mouthful. I think the whole "Delayed" bit of the name was perfect for the first PR that built it but it is less and less perfect now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but why Delayed VS Delayable for instance?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because the referencing
implementation isn't delayed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I find that that is a lie :) because writeTo will always do the two steps serialization that will allow for delaying the deserialization on the other side. The point is that referencing is only on the producer side hence you never read into referencing from the stream? Maybe a bit of a philosophical discussion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should rename it to something like SerializedReference
or MaybeSerializedReference
so maybe its not an issue then. I get where you are coming from though. The name is icky now.
@@ -42,7 +55,7 @@ | |||
/** | |||
* Build a {@linkplain DelayableWriteable} that copies a buffer from | |||
* the provided {@linkplain StreamInput} and deserializes the buffer | |||
* when {@link Supplier#get()} is called. | |||
* when {@link #expand()} is called. | |||
*/ | |||
public static <T extends Writeable> DelayableWriteable<T> delayed(Writeable.Reader<T> reader, StreamInput in) throws IOException { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we rename this to readDelayedWriteable ? also depending on the rename proposed above...
} | ||
} | ||
|
||
@Override | ||
public T get() { | ||
public T expand() { | ||
try { | ||
try (StreamInput in = registry == null ? | ||
serialized.streamInput() : new NamedWriteableAwareStreamInput(serialized.streamInput(), registry)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was wondering, is it an option to keep a reference to the original stream input so that we do not need this if and we can avoid exposing the new namedWriteableRegistry getter in StreamInput?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we could keep a reference to the original StreamInput
but we'd still need to read the serialized bytes from it.
I wonder if keeping a reference to the StreamInput
would keep a reference to its underlying storage. If it did then it'd probably be bad to keep the reference.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yea it did sound risky to me as well, and what we do instead looks fine to me.
*/ | ||
public abstract class DelayableWriteable<T extends Writeable> implements Supplier<T>, Writeable { | ||
public abstract class DelayableWriteable<T extends Writeable> implements Writeable { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess so. I figured Writeable
was the name for things that we threw across the wire and calling it Delayed
was enough to imply that it delayed reading because that was sort of the only thing you could delay.
At this point, given how we use it, it might be more correct to call it PotentiallySerializedWriteable
. But that is kind of mouthful. I think the whole "Delayed" bit of the name was perfect for the first PR that built it but it is less and less perfect now.
* {@linkplain Writeable} when it is read from a remote node. | ||
* A holder for {@link Writeable}s that delays reading the underlying object | ||
* on the receiving end. To be used for objects that use a lot of memory hence | ||
* it is desirable to keep them around only for a limited amount of time. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Its more that deserializing it is inefficient compared to leaving it serialized. "Big" objects are fine to deserialize if they are no bigger when you deserialize them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good point, will rephrase.
* Multiple {@link DelayableWriteable}s coming from different nodes may be buffered | ||
* on the receiver end, which may hold a mix of {@link DelayableWriteable}s that were | ||
* produced locally (hence expanded) as well as received form another node (hence subject | ||
* to delayed expansion). When such objects are buffered for some time it is desirable |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/is/may be/ ?
} catch (IOException e) { | ||
throw new RuntimeException("unexpected error expanding aggregations", e); | ||
throw new RuntimeException("unexpected error writing writeable to buffer", e); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this change!
} | ||
} | ||
|
||
@Override | ||
public T get() { | ||
public T expand() { | ||
try { | ||
try (StreamInput in = registry == null ? | ||
serialized.streamInput() : new NamedWriteableAwareStreamInput(serialized.streamInput(), registry)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we could keep a reference to the original StreamInput
but we'd still need to read the serialized bytes from it.
I wonder if keeping a reference to the StreamInput
would keep a reference to its underlying storage. If it did then it'd probably be bad to keep the reference.
I'd be happy to get this in and rename in a follow up. I kind of feel like the right name might have more to do with it potentially being serialized than with that it is delayable. Names are hard though! |
run elasticsearch-ci/bwc |
run elasticsearch-ci/default-distro |
* reduced aggs and reduce it on the fly when someone asks | ||
* for it. This will produce right-ish aggs. Much more right | ||
* than if you don't do the final reduce. Its important that |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI @nik9000 I removed the uncertainty around producing right-ish aggs. I find that if certain aggs don't work it's a bug that we should fix like we did with scripted metric. The approach here around partial aggs is the correct one hence we should not have doubts about it :)
This commit includes a number of minor improvements around `DelayableWriteable`: javadocs were expanded and reworded, `get` was renamed to `expand` and `DelayableWriteable` no longer implements `Supplier`. Also a couple of methods are now private instead of package private.
This PR includes a number of minor improvements around
DelayableWriteable
: mainly javadocs were expanded,get
was renamed toexpand
andDelayableWriteable
no longer implementsSupplier
, and a couple of methods are now private instead of package private.