-
Notifications
You must be signed in to change notification settings - Fork 5.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[train] fix maximum recursion issue when serializing exceptions #43952
Changes from all commits
eac77df
125854f
38477ec
0448ca0
aff6fbc
03b7a93
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,8 +1,10 @@ | ||
import pytest | ||
|
||
import ray | ||
from ray import cloudpickle | ||
from tblib import pickling_support | ||
from ray.train import ScalingConfig | ||
from ray.air._internal.util import StartTraceback, skip_exceptions | ||
from ray.air._internal.util import StartTraceback, skip_exceptions, exception_cause | ||
from ray.train.data_parallel_trainer import DataParallelTrainer | ||
|
||
from ray.tune import Tuner | ||
|
@@ -47,6 +49,45 @@ def test_short_traceback(levels): | |
assert i == levels - start_traceback + 1 | ||
|
||
|
||
def test_recursion(): | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Any ideas on why Maximum recursion happens iff There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Seems like a bug in Good point on testing. Originally I was going to remove |
||
"""Test that the skipped exception does not point to the original exception.""" | ||
root_exception = None | ||
|
||
with pytest.raises(StartTraceback) as exc_info: | ||
try: | ||
raise Exception("Root Exception") | ||
except Exception as e: | ||
root_exception = e | ||
raise StartTraceback from root_exception | ||
|
||
assert root_exception, "Root exception was not captured." | ||
|
||
start_traceback = exc_info.value | ||
skipped_exception = skip_exceptions(start_traceback) | ||
|
||
assert ( | ||
root_exception != skipped_exception | ||
), "Skipped exception points to the original exception." | ||
|
||
|
||
def test_tblib(): | ||
"""Test that tblib does not cause a maximum recursion error.""" | ||
|
||
with pytest.raises(Exception) as exc_info: | ||
try: | ||
try: | ||
raise Exception("Root Exception") | ||
except Exception as root_exception: | ||
raise StartTraceback from root_exception | ||
except Exception as start_traceback: | ||
raise skip_exceptions(start_traceback) from exception_cause(start_traceback) | ||
|
||
pickling_support.install() | ||
reraised_exception = exc_info.value | ||
# This should not raise a RecursionError/PicklingError. | ||
cloudpickle.dumps(reraised_exception) | ||
|
||
|
||
def test_traceback_tuner(ray_start_2_cpus): | ||
"""Ensure that the Tuner's stack trace is not too long.""" | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note:
with_traceback
is needed so that the traceback shows the original line that errored, rather than this line where the copy is happening.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does making a shallow copy remove the
__context__
so that thenew_exc
has no context?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but more importantly the
__context__
gets set (to theStartTraceback
) after the exception gets raised, so we want to make sure there is no nested__cause__
or__context__
that points back to this exception.