Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New API to wait for handler executions to complete and warnings on unfinished handler executions #556

Merged
merged 45 commits into from
Jun 26, 2024

Conversation

dandavison
Copy link
Contributor

@dandavison dandavison commented Jun 19, 2024

Closes #538

Implementation ready for review; names are still under discussion.

What was changed

  • Added a new API to wait for update and signal handler executions to complete. Provisionally named workflow.all_handlers_finished().
  • Issue warnings on workflow exit if any handlers are unfinished.
  • Allow warnings to be disabled on a per-handler basis via a signal/update decorator parameter.

Why?

Workflows exiting while handler executions are in progress is an important category of run-time workflow incorrectness.

@dandavison dandavison requested a review from a team as a code owner June 19, 2024 19:49
@dandavison dandavison force-pushed the sdk-2228-unfinished-handlers branch from bea62d5 to 6e9cf4c Compare June 19, 2024 19:54
return with_name(fn.__name__, fn)
return partial(decorator, name, unfinished_handlers_policy)
else:
return decorator(fn.__name__, unfinished_handlers_policy, fn)
Copy link
Contributor Author

@dandavison dandavison Jun 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These changes make the decorator behave consistently: permitting e.g. @update() with the same meaning as @update, and allowing the default value of the handler policy to be passed explicitly as a param if the user desires. That makes sense, especially now that we're adding a 3rd parameter to the decorators which is unrelated to name and dynamic. TODO: I'll need to update the query decorator also for consistency.

Copy link
Member

@cretz cretz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Most of my comments are only on little things.

# signals lack a unique per-invocation identifier, we introduce a sequence number for the
# purpose.
self._handled_signals_seq = 0
self._in_progress_signals: dict[
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unsure if dict is supported in 3.8, we have used Dict (from typing) to be sure to have compatibility

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I was trying to remember to use the 3.8-compatible upper-cased names but it's easy to forget.

@@ -490,6 +539,7 @@ async def run_update() -> None:
f"Update handler for '{job.name}' expected but not found, and there is no dynamic handler. "
f"known updates: [{' '.join(known_updates)}]"
)
self._in_progress_updates[job.id] = (job, defn)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should move to just above the try for clarity (even though it doesn't matter much in practice, was just a bit confusing to have to hunt down the cases when something undone in a finally was not done)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As things stand, it has to be here, since defn is bound inside the try. I agree that the try should be scoped more tightly around the code which is actually raising the exceptions it's catching, but I think I'll leave that change out of this PR.

temporalio/worker/_workflow_instance.py Outdated Show resolved Hide resolved
temporalio/worker/_workflow_instance.py Outdated Show resolved Hide resolved
temporalio/worker/_workflow_instance.py Outdated Show resolved Hide resolved
temporalio/workflow.py Outdated Show resolved Hide resolved
temporalio/workflow.py Outdated Show resolved Hide resolved
temporalio/workflow.py Show resolved Hide resolved
temporalio/workflow.py Outdated Show resolved Hide resolved
tests/worker/test_workflow.py Show resolved Hide resolved
Copy link
Contributor Author

@dandavison dandavison left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the review @cretz, I've taken almost all the suggestions you made.

# signals lack a unique per-invocation identifier, we introduce a sequence number for the
# purpose.
self._handled_signals_seq = 0
self._in_progress_signals: dict[
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I was trying to remember to use the 3.8-compatible upper-cased names but it's easy to forget.

temporalio/worker/_workflow_instance.py Outdated Show resolved Hide resolved
temporalio/worker/_workflow_instance.py Outdated Show resolved Hide resolved
# an error rather than the update result.
ABANDON = 1
# Issue a warning in addition to abandoning.
WARN_AND_ABANDON = 2
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, I had it that way and then changed it to make the comments read better. Changed it back.

@@ -173,6 +173,20 @@ def run(fn: CallableAsyncType) -> CallableAsyncType:
return fn # type: ignore[return-value]


class UnfinishedHandlersPolicy(IntEnum):
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

temporalio/workflow.py Show resolved Hide resolved
temporalio/workflow.py Outdated Show resolved Hide resolved
temporalio/workflow.py Outdated Show resolved Hide resolved
temporalio/workflow.py Outdated Show resolved Hide resolved
pyproject.toml Show resolved Hide resolved
pyproject.toml Show resolved Hide resolved
pyproject.toml Outdated Show resolved Hide resolved
@dandavison dandavison force-pushed the sdk-2228-unfinished-handlers branch from 7e1fc7b to 62d7c03 Compare June 21, 2024 18:17
temporalio/workflow.py Outdated Show resolved Hide resolved
@dandavison dandavison force-pushed the sdk-2228-unfinished-handlers branch 2 times, most recently from 94bf238 to 9346ac1 Compare June 21, 2024 21:07
Copy link
Member

@cretz cretz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This LGTM. Need to disable tests for time-skipping test server and may need to remove an unused import, but overall LGTM.

@dandavison
Copy link
Contributor Author

This PR was missing one thing -- it didn't address cancellation. Discussed with @cretz in a different channel and we agreed that it was a bug that the CancelWorkflowExecution command was not being defined to be a workflow "completion" by the SDK. Recent commits fix that bug and add the test coverage that this PR had been missing regarding cancellation.

@dandavison dandavison force-pushed the sdk-2228-unfinished-handlers branch from 2e39d45 to f59bcac Compare June 25, 2024 15:13
@dandavison dandavison force-pushed the sdk-2228-unfinished-handlers branch from f59bcac to 3f0b11c Compare June 25, 2024 15:26
@dandavison dandavison force-pushed the sdk-2228-unfinished-handlers branch from 42e3fc6 to d6ca246 Compare June 25, 2024 15:50
@dandavison dandavison force-pushed the sdk-2228-unfinished-handlers branch from 54d9ac9 to 6526fb1 Compare June 26, 2024 09:35
@dandavison dandavison force-pushed the sdk-2228-unfinished-handlers branch 2 times, most recently from d64fa77 to b9e07cd Compare June 26, 2024 13:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Warn or error when update handlers dangle across CAN or workflow exit
3 participants