-
Notifications
You must be signed in to change notification settings - Fork 58
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Waiting for handlers to finish in all exit cases + abort and compensation in a message handler #144
Waiting for handlers to finish in all exit cases + abort and compensation in a message handler #144
Conversation
c566b62
to
9355bc2
Compare
message_passing/message_handler_waiting_compensation_cleanup/workflows.py
Outdated
Show resolved
Hide resolved
message_passing/message_handler_waiting_compensation_cleanup/starter.py
Outdated
Show resolved
Hide resolved
message_passing/message_handler_waiting_compensation_cleanup/starter.py
Outdated
Show resolved
Hide resolved
message_passing/message_handler_waiting_compensation_cleanup/starter.py
Outdated
Show resolved
Hide resolved
message_passing/message_handler_waiting_compensation_cleanup/starter.py
Outdated
Show resolved
Hide resolved
message_passing/message_handler_waiting_compensation_cleanup/workflows.py
Outdated
Show resolved
Hide resolved
message_passing/message_handler_waiting_compensation_cleanup/workflows.py
Outdated
Show resolved
Hide resolved
message_passing/message_handler_waiting_compensation_cleanup/workflows.py
Outdated
Show resolved
Hide resolved
message_passing/message_handler_waiting_compensation_cleanup/workflows.py
Outdated
Show resolved
Hide resolved
message_passing/message_handler_waiting_compensation_cleanup/workflows.py
Outdated
Show resolved
Hide resolved
message_passing/message_handler_waiting_compensation_cleanup/workflows.py
Outdated
Show resolved
Hide resolved
af64650
to
5867812
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Feels to me like there's two main things going on here.
- cleaning up the workflow's handlers
- aborting updates and compensating for them.
Both seem valid. The first feels mainstream, the second power user (though we may start to get feedback telling us otherwise).
I wonder if they should be separate samples, or at least more clearly delineate this one so people can just pick out what they need. (I don't want to scare people off of #1 because of the code for #2.)
I can imagine pretty achievable framework improvements we could do for both scenarios. Given that we've heard a lot more feedback about the first scenario, that'd be my guess as to where to invest for now.
# 👉 "Race" the workflow_exit future against the handler's own application | ||
# logic. Always use `workflow.wait` instead of `asyncio.wait` in | ||
# Workflow code: asyncio's version is non-deterministic. | ||
await workflow.wait( # type: ignore | ||
[update_task, self.workflow_exit], return_when=asyncio.FIRST_EXCEPTION | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This makes sense, and looks like a pain, but I'm not sure how often people want to cancel in-process update handlers in response to a workflow exit as opposed to simply waiting for the handlers to complete.
My guess is that most people are happy to wait because such cases are not particularly latency sensitive.
It may be okay that this is more of a power user struggle.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I agree with this. The point isn't the latency -- it's that if the workflow is exiting due to a failure/cancellation then the update may need to perform a rollback / cancellation. So they can't "simply wait for the handler to complete" because the handler has to learn whether it should proceed to the end of its happy path, or perform rollback / compensation.
message_passing/waiting_for_handlers_and_compensation/workflows.py
Outdated
Show resolved
Hide resolved
) from cast(BaseException, self.workflow_exit.exception()) | ||
# 👉 Catch BaseException since asyncio.CancelledError does not inherit | ||
# from Exception. | ||
except BaseException as e: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't thought deeply, but I'm wondering if using a child workflow would work better for use cases in which there are compensations. Would it be more straightforward? Would it work better, e.g. by letting the workflow exit and letting the child workflow finish up the compensations?
Obviously that's extra latency which some might not be able to afford. And it's a lot of code, though it might be easier than this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting idea and could be worth another sample, at least to investigate and compare how well it works. This sample contains useful techniques for those not using child workflows, so I don't think this work need block merge here.
) from e | ||
raise | ||
|
||
async def my_update_compensation(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It'd be cool if folks could register cancellation handlers with an annotation similar to validators.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting idea! Make a note of it?
raise AssertionError("unreachable") | ||
|
||
|
||
def is_workflow_exit_exception(e: BaseException) -> bool: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 , This feels like a concept worth highlighting (and maybe adding a helper?) that I think lots of users would benefit from. Good to take every opportunity to teach users about which exceptions fail the workflow and the fact that some don't fail it (and that this is a good thing), and then show how that intersects with cleanup code like this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, I think we all agreed on that. temporalio/features#589
d72c2de
to
270830d
Compare
Yes, I made them into two separate samples: # Waiting for message handlers
This workflow demonstrates how to wait for signal and update handlers to
finish in the following circumstances:
- Before a successful return
- On failure
- On cancellation # Waiting for message handlers, and performing compensation and cleanup in message handlers
This sample demonstrates how to do the following:
1. Ensure that all update/signal handlers are finished before a successful
workflow return, and on workflow cancellation and failure.
2. Perform compensation/cleanup in an update handler when the workflow is
cancelled or fails. This should be ready to merge now. |
dce6d7b
to
1ca9796
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nothing blocking
for exit_type in [ | ||
WorkflowExitType.SUCCESS, | ||
WorkflowExitType.FAILURE, | ||
WorkflowExitType.CANCELLATION, | ||
]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a bit complicated for a simple waiting sample IMO, but nothing wrong with it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The purpose of this sample is partly to teach users about what errors they'll see under different sorts of workflow exit. See the README:
workflow exit type: SUCCESS
🟢 caller received update result
🟢 caller received workflow result
workflow exit type: FAILURE
🟢 caller received update result
🔴 caught exception while waiting for workflow result: Workflow execution failed: deliberately failing workflow
workflow exit type: CANCELLATION
🟢 caller received update result
) | ||
|
||
|
||
def is_workflow_exit_exception(e: BaseException) -> bool: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not accurate for all users to copy because users can customize what is a workflow exit exception. If we feel this is a utility users need, we should consider exposing on the SDK
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, it says that in the comment:
# 👉 If you have set additional failure_exception_types you should also
# check for these here.
and we have temporalio/features#589 for the utility.
# 👉 Catch BaseException since asyncio.CancelledError does not inherit | ||
# from Exception. | ||
except BaseException as e: | ||
if is_workflow_exit_exception(e): | ||
await workflow.wait_condition(workflow.all_handlers_finished) | ||
raise |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For simplicity reasons, I wonder if this sample should allow a workflow failure to not wait for a handler. That might be what some users want anyways (to fail the workflow now instead of waiting to fail).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that showing them how to wait for the handler when the workflow fails is useful since it points them towards how to implement graceful cleanup of a handler execution.
Two new samples. From the READMEs: