-
Notifications
You must be signed in to change notification settings - Fork 79
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Feature Request] Handle edge case of recursive exceptions in failure converter #697
Comments
I think this is the primary bug. There are situations where recursion and stack overflow can occur in serialization, and the linked PR naively only helps the case of the cause being the same as itself (but if it was two causes deep that recursed, the issue would still appear). We just need to make it clear when it happens (or any other failure serialization happens). If we are indeed not reporting anything, we need to fix that. #685 is related when failing to build a failure. |
You are right that it doesn't solve all cycles. Not sure if your case happens in practice. What needs to happen to have an exception cycled after 2 or 3 steps? Maybe the solution could be just have a max depth of 10 let's say. During serialization you truncate too deep stacks anyway. |
I think we just need to log the failure of serializing this (assume there's a stack depth/overflow exception but is not visible due to #685) |
When handling exceptions in DefaultFailureConverter, to avoid e.cause == e problems caused by reraising exceptions, you can take the following solutions: Avoid using the from statement when catching and rethrowing exceptions: Reraising the exception directly without specifying from avoids the problem of cause pointing to itself. After catching the exception, check the cause attribute and reset it to None if it points to itself. Add logic to DefaultFailureConverter to ensure that cause does not point to itself when handling exceptions. class DefaultFailureConverter: |
Is your feature request related to a problem? Please describe.
It is kind of an easy mistake to do in python:
The re-raised exception will have
e.__cause__
referencing itself resulting in failure to serialize error and python sdk won't report anything resulting in retries and startToClose timeouts.Describe the solution you'd like
Handle a case when
e.__cause__ == e
in DefaultFailureConverter. Created a dummy pull request #696Additional context
https://temporalio.slack.com/archives/CTT84RS0P/p1719632201314119
The text was updated successfully, but these errors were encountered: