-
Notifications
You must be signed in to change notification settings - Fork 79
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Bug] No clean way to cancel an activity and wait until it's cancelled #700
Comments
Sorry if this report is a bit garbled -- I'd say the real issue is "there does not seem to be a clean, blessed way of doing this". |
This is how Python asyncio cancellation works. It only can issue a cancel at the await point. The snippet you show completes the activity immediately after calling the non-blocking heartbeat, so I would not expect it to fail because it has completed.
Use the
This is the nature of heartbeating. The server only relays cancellation on heartbeating to the worker, so if you're not heartbeating you're not receiving cancellation. You can run heartbeating in a background task or you can do something like sleep for a short period and heartbeat every so often. Also, heartbeats are throttled to within 0.8 times the heartbeat timeout anyways but you are not setting the heartbeat timeout so it uses a really high default. See:
Etc to better understand how heartbeating and activity cancellation work together. |
Right, if |
IIUC the point of the activity issuing a heartbeat is to let the server know it's still healthy. it sounds like your advice is that i could create a separate so IIUC, |
i'd suggest noting in the docs the exact semantics here. these docs make it sound like you can just it sounds like the semantics are something like this, i'd appreciate clarification as well:
|
In Python asyncio, calling cancel on a task doesn't immediately cancel always (or at all), it's up to the implementation for how to handle. Some tasks may swallow cancel, some shield, some may do cleanup first, or in our case we communicate it with a server. But it does work in that the request is sent to the server and put on history (assuming activity is not complete), it's just not necessarily processed immediately. We will see about clarifying in the docs that Python asyncio Task cancellation (and explicit Temporal workflow activity cancellation in all Temporal languages) is actually a cancellation request.
Yes, though there are of course caveats concerning heartbeat throttling so it's not always immediately on heartbeating. |
What are you really trying to do?
Start an activity from a workflow, cancel it, wait for clean cancellation acknowledgement, exit the workflow.
Describe the bug
There appear to be several issues.
handle.cancel()
on an activity handle from the workflow, theCancelledError
will only be raised in the activity after anactivity.heartbeat()
call, followed by anawait something()
call. Otherwise the activity itself doesn't seem to receiveCancelledError
. This kinda sucks -- both that you need to callheartbeat()
for cancellation requests to get through, and the fact that long-running coros do not get interrupted correctly. I don't understand the underlying implementation, but if the reason we need toheartbeat()
is that workers don't want to poll for cancellation, still, when you callactivity.heartbeat()
, the client library could check for cancellation and immediately calltask.cancel()
on the task running the activity, no? Currently the work-around seems to be to litter activity code withasyncio.sleep(0.1)
.ActivityError
, which is a broader exception type than I actually want to catch (presumably there are otherActivityError
s which I don't want to catch).WAIT_CANCELLATION_COMPLETED
, if you try to cancel an activity that's currently awaiting a long-running coro (likeasyncio.sleep(10)
), the activity won't receive aCancelledError
until it sends a heartbeat, which it can't do until the coro it's waiting on finishes, so your workflow won't finish until the activity's long-running coro finishes.Minimal Reproduction
https://github.com/andmis/snippets/tree/temporal-python-sdk-cannot-cleanly-cancel-activities
Using
python run_workflow.py
with no args, the workflow exits promptly, but the activity never receivesCancelledError
, completes, and we get warning log spam:Using
python run_workflow.py -w
, the workflow waits until the activity's long-runningsleep
finishes, despite the activity being cancelled (note timestamps), and the activity completes rather than being cancelled:Using
python run_workflow.py -w -s
, the workflow waits for the long-running coro in the activity to finish, which is bad, and the activity does cancel rather than completing, but-s
sucks:Using
python run_workflow.py -s
results in the workflow exiting cleanly and promptly (since we aren't usingWAIT_CANCELLATION_COMPLETED
), and the activity cancels rather than completing, but still waits on the long-running coro:Environment/Versions
OS and processor: macOS, M1
Temporal version: 1.1.2
Python SDK version: 1.8.0
Are you using Docker or Kubernetes or building Temporal from source? No
The text was updated successfully, but these errors were encountered: