-
Notifications
You must be signed in to change notification settings - Fork 12
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Drop stream shielding; it was from a legacy api design #230
Conversation
0c6e7ca
to
8a57cb4
Compare
tractor/_streaming.py
Outdated
if self._shielded: | ||
log.warning(f"{self} is shielded, portal channel being kept alive") | ||
return | ||
|
||
# XXX: This must be set **AFTER** the shielded test above! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This XXX note is obsolete now? also, what happens when await self._ctx.send_stop()
below is cancelled? I can't tell locally if that is handled internally, or for that matter if cancelling it is not a problem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahh good catch and good question.
Technically if we're cancelled there's not a lot that can be done anyway since you run into the 2-general's problem in terms of getting back an ack to the stop
msg. So even if we were to wait, how long would we wait, and further if we're cancelled the transport layer may still stay active but the far end task may have already terminated (depending on which side of the stream was the context opener).
One thing we can think about is a stricter ack to the surrounding context cancellation maybe?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah so i think the short answer is, if we're cancelled on .send_stop()
then it's as if it wasn't sent and the surrounding context is responsible for relaying that cancellation back to the other side.
So you aren't really supposed to be able to tell what happens from here, you'd want to look at .open_stream()
i think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a similar call is cancelled there, so it seems like the far side is abandoned IIUC. to maintain your one-shot invariant you'll need a shielded scope with a deadline, and some way to configure that deadline, and then some way to mark the other side as lost (and maybe panic and blow up the whole portal?) when the deadline passes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahh yes sorry so in the bidir case this block isn't invoked on the caller (opener of context) side since self._ctx._portal
will be None
. So if the far end callee task is cancelled while closing, i'm not sure it improves anything to try-to-ensure the stop
is relayed given that we'll get the context cancel bubble across anyway on the caller side. Maybe this is not clear enough or should change though?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So i guess the good question you've made is should there be attempted stream state consistency on cancellation during .aclose()
yes?
If we stick with the an op that is cancelled should be treated like it never happened then i think the way this is implemented is fine right?
If the callee tried to close but was cancelled does that mean it closed or not?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
right and it seems to me that's fine as long as it's consistent with the rest of the system design.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, this was the whole point of Portal.open_context()
being necessary before you can open a stream - exactly to handle the cancellation/failure case (in one place) without having to try and jam it into the stream state tracking.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as in, does any other part of the system need to know if someone wanted to close the stream but couldn't do it in time? probably not but I wouldn't know.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah i don't think so.
As per discussion in chat, if you have a stream state desync there's not a ton can be done without a supervisory strategy to handle the cancellation (could be parent(caller) or child(callee) side) and either way, the stream shouldn't be closed implicitly by cancellation (what this patch changes) so it's up to whoever handles that trio.Cancelled
to determine what to do about a cancellation - in the most default case the stream is closed more thoroughly in the exit step and later in the context closure.
For our current support of only a one shot streaming phase in the dialog i think this is sufficient since we shouldn't be attempting to re-open the same stream without a new context.
The whole origin was not having an explicit open/close semantic for streams. We have that now so this internal mechanic isn't needed and further our streams become more correct by having `.aclose()` be independent of cancellation.
8a57cb4
to
558c44f
Compare
Before we had
async with Portal.open_stream_from()
which has explicit enter/exit semantics, we had a implicit portal-returns-stream-reference thing which required that if astream.receive()
was cancelled, we relayed that via a stream closure to the other side.Since we have the new API now we don't need this shielding (aka we won't
.aclose()
the stream on cancellation) any more.This PR removes all that and the CI should prove that all things are well.
Once CI is clean I'll just remove the first patch draft which comments all the unneeded code.