Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug in linecache when raising exceptiongroup that contains ZeroDivisionError + other exception. #101517

Closed
jonathanslenders opened this issue Feb 2, 2023 · 30 comments
Assignees
Labels
3.11 only security fixes 3.12 bugs and security fixes type-bug An unexpected behavior, bug, or error

Comments

@jonathanslenders
Copy link

jonathanslenders commented Feb 2, 2023

Bug report

Consider this code:

from asyncio import run, TaskGroup

async def bad_function():
    raise ZeroDivisionError
async def bad_function2():
    raise KeyError

async def main():
    try:
        async with TaskGroup() as tg:
            tg.create_task(bad_function())
            tg.create_task(bad_function2())

    except* ZeroDivisionError as e:
        breakpoint()


run(main())

Run it on Python 3.11 or 3.12:

--Return--
Traceback (most recent call last):
  File "....py", line 18, in <module>
    run(main())
  File "/home/jonathan/.pyenv/versions/3.11.0/lib/python3.11/asyncio/runners.py", line 190, in run
    return runner.run(main)
           ^^^^^^^^^^^^^^^^
  File "/home/jonathan/.pyenv/versions/3.11.0/lib/python3.11/asyncio/runners.py", line 118, in run
    return self._loop.run_until_complete(task)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jonathan/.pyenv/versions/3.11.0/lib/python3.11/asyncio/base_events.py", line 650, in run_until_complete
    return future.result()
           ^^^^^^^^^^^^^^^
  File ".....py", line -1, in main
  File "/home/jonathan/.pyenv/versions/3.11.0/lib/python3.11/bdb.py", line 94, in trace_dispatch
    return self.dispatch_return(frame, arg)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jonathan/.pyenv/versions/3.11.0/lib/python3.11/bdb.py", line 153, in dispatch_return
    self.user_return(frame, arg)
  File "/home/jonathan/.pyenv/versions/3.11.0/lib/python3.11/pdb.py", line 372, in user_return
    self.interaction(frame, None)
  File "/home/jonathan/.pyenv/versions/3.11.0/lib/python3.11/pdb.py", line 434, in interaction
    self.print_stack_entry(self.stack[self.curindex])
  File "/home/jonathan/.pyenv/versions/3.11.0/lib/python3.11/pdb.py", line 1554, in print_stack_entry
    self.format_stack_entry(frame_lineno, prompt_prefix))
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jonathan/.pyenv/versions/3.11.0/lib/python3.11/bdb.py", line 573, in format_stack_entry
    line = linecache.getline(filename, lineno, frame.f_globals)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jonathan/.pyenv/versions/3.11.0/lib/python3.11/linecache.py", line 31, in getline
    if 1 <= lineno <= len(lines):
TypeError: '<=' not supported between instances of 'int' and 'NoneType'

One of the exceptions has to be a ZeroDivisionError for it to happen as far as I can see.

Your environment

  • CPython versions tested on: 3.11, 3.12
  • Operating system and architecture: Linux, M1, arm64

Linked PRs

@jonathanslenders jonathanslenders added the type-bug An unexpected behavior, bug, or error label Feb 2, 2023
@github-project-automation github-project-automation bot moved this to Todo in asyncio Feb 2, 2023
@gvanrossum
Copy link
Member

Can you produce a repro without asyncio? I suspect you can get this by just raising and catching two errors and then raise an ExceptionGroup. @iritkatriel

@iritkatriel
Copy link
Member

It's something to do with a breakpoint as the last thing in an except block.

Without asyncio:

excs = []
try:
    raise ZeroDivisionError
except Exception as e:
    excs.append(e)

try:
    raise KeyError
except Exception as e:
    excs.append(e)


def main():
    try:
        raise ExceptionGroup("eg", excs)
    except* ZeroDivisionError as e:
        breakpoint()
        # pass    # <-- uncomment this and the problem goes away

main()

@iritkatriel iritkatriel added 3.11 only security fixes 3.12 bugs and security fixes and removed topic-asyncio labels Feb 2, 2023
@AlexWaygood AlexWaygood removed this from asyncio Feb 2, 2023
@iritkatriel
Copy link
Member

Something to do with frame initialisation? @markshannon @brandtbucher

excs = []
try:
    raise ZeroDivisionError
except Exception as e:
    excs.append(e)

try:
    raise KeyError
except Exception as e:
    excs.append(e)

def main():
    try:
        raise ExceptionGroup("eg", excs)
    except* ZeroDivisionError as e:
        #sys._getframe()     # <-- uncomment this and the problem goes away
        breakpoint()


main()

@brandtbucher
Copy link
Member

I don't see the problem going away when I add the sys._getframe() call...

@brandtbucher
Copy link
Member

This does reproduce the issue for me, though:

try:
    raise ExceptionGroup("eg", [KeyError(), ValueError()])
except* KeyError:
    breakpoint()
    # pass    # <-- uncomment this and the problem goes away

@brandtbucher
Copy link
Member

Note that this only reproduces if some exceptions are uncaught (the ValueError, in my above example... turning it into a KeyError makes the issue go away).

@iritkatriel
Copy link
Member

I think it's some kind of heisenbug. I worked around the lineno being None in linecache.py:31, then it failed on None in another place later on, so I added a "print the frame" just before that line, and the problem went away.

@brandtbucher
Copy link
Member

It looks like we're sending the trace function a "return" event, when we're actually in the process of raising the rest of the EG. That's wrong...

@brandtbucher
Copy link
Member

For example, in my example above, if you uncomment the pass (or replace it with anything else), and go to the next line (by typing n) in the debugger, then the same error occurs.

So it's not that the breakpoint() is the last statement, it's just that if you have an active trace function while re-raising the exception group, things break.

@iritkatriel
Copy link
Member

As far as I can tell the lineno=-1 is on the frame for the "raise ExceptionGroup" line. The except* machienery doesn't do anything with tracebacks/frames other than shallow-copy tracebacks from the original exception group to the parts.

@iritkatriel
Copy link
Member

It looks like we're sending the trace function a "return" event, when we're actually in the process of raising the rest of the EG. That's wrong...

Can you reproduce this scenario with a normal exception re-raised in an except?

try:
    raise ValueError
except Exception:
    raise

@brandtbucher
Copy link
Member

Ah wait, it's okay to send a return event. I forgot that raising still traces a return like that.

@markshannon
Copy link
Member

def f():
    try:
        raise ExceptionGroup("eg", [KeyError(), ValueError()])
    except* KeyError:
        breakpoint()
>>> dis.dis(f)
  1           0 RESUME                   0

  2           2 NOP

  3           4 LOAD_GLOBAL              1 (NULL + ExceptionGroup)
             16 LOAD_CONST               1 ('eg')
             18 LOAD_GLOBAL              3 (NULL + KeyError)
             30 CALL                     0
             40 LOAD_GLOBAL              5 (NULL + ValueError)
             52 CALL                     0
             62 BUILD_LIST               2
             64 CALL                     2
             74 RAISE_VARARGS            1
        >>   76 PUSH_EXC_INFO

  4          78 COPY                     1
             80 BUILD_LIST               0
             82 SWAP                     2
             84 LOAD_GLOBAL              2 (KeyError)
             96 CHECK_EG_MATCH
             98 COPY                     1
            100 POP_JUMP_IF_NOT_NONE     2 (to 106)
            102 POP_TOP
            104 JUMP_FORWARD            16 (to 138)
        >>  106 POP_TOP

  5         108 LOAD_GLOBAL              7 (NULL + breakpoint)
            120 CALL                     0
            130 POP_TOP
            132 JUMP_FORWARD             2 (to 138)
        >>  134 LIST_APPEND              3
            136 POP_TOP
        >>  138 LIST_APPEND              1
            140 PREP_RERAISE_STAR
            142 COPY                     1
            144 POP_JUMP_IF_NOT_NONE     4 (to 154)
            146 POP_TOP
            148 POP_EXCEPT
            150 LOAD_CONST               0 (None)
            152 RETURN_VALUE
        >>  154 SWAP                     2
            156 POP_EXCEPT
            158 RERAISE                  0
        >>  160 COPY                     3
            162 POP_EXCEPT
            164 RERAISE                  1
>>> 
list(f.__code__.co_lines())
[(0, 2, 1), (2, 4, 2), (4, 76, 3), (76, 78, None), (78, 108, 4), (108, 134, 5), (134, 146, None), (146, 154, 5), (154, 166, None)]

It looks like the RERAISE 0 at offset 158 has no line number. Either it should be RERAISE 1, or it should have a line number.

The compiler should be propagating the line number from 132 JUMP_FORWARD 2 (to 138) to 138 and then to 144 POP_JUMP_IF_NOT_NONE 4 (to 154) and on to 154, but looks like it doesn't.

@gaogaotiantian
Copy link
Member

As for the unittest itself, will this work?

def test_pdb_issue_gh_101517():
    """See GH-101517

    Make sure pdb doesn't crash when the exception is caught in a try/except block

    >>> def test_function():
    ...     try:
    ...         raise KeyError
    ...     except* Exception as e:
    ...         import pdb; pdb.Pdb(nosigint=True, readrc=False).set_trace()

    >>> with PdbTestInput([  # doctest: +NORMALIZE_WHITESPACE
    ...     'continue'
    ... ]):
    ...    test_function()
    --Return--
    > <doctest test.test_pdb.test_pdb_issue_gh_101517[0]>(None)test_function()->None
    (Pdb) continue
    """

I've tested it on main and it passed. I reverted 366b949 and it failed. Not sure if this is the thing you are looking for.

@iritkatriel
Copy link
Member

As for the unittest itself, will this work?

def test_pdb_issue_gh_101517():
    """See GH-101517

    Make sure pdb doesn't crash when the exception is caught in a try/except block

    >>> def test_function():
    ...     try:
    ...         raise KeyError
    ...     except* Exception as e:
    ...         import pdb; pdb.Pdb(nosigint=True, readrc=False).set_trace()

    >>> with PdbTestInput([  # doctest: +NORMALIZE_WHITESPACE
    ...     'continue'
    ... ]):
    ...    test_function()
    --Return--
    > <doctest test.test_pdb.test_pdb_issue_gh_101517[0]>(None)test_function()->None
    (Pdb) continue
    """

I've tested it on main and it passed. I reverted 366b949 and it failed. Not sure if this is the thing you are looking for.

Yes, perfect. I tried something similar but I didn't put the try-except inside a function so it didn't trigger the issue.

Could you make a PR to commit this test? Just in the comment - call it a try/except* block rather than try/except.

Thanks!

@gaogaotiantian
Copy link
Member

Yes, perfect. I tried something similar but I didn't put the try-except inside a function so it didn't trigger the issue.

Could you make a PR to commit this test? Just in the comment - call it a try/except* block rather than try/except.

Thanks!

Sure, but we should fix pdb right? list command will crash pdb on main now. So do we expect a f_lineno == 0? I mean in theory, from the end-user view, it's reasonable to have the f_lineno set to the line in the except* block.

@iritkatriel
Copy link
Member

Sure, but we should fix pdb right? list command will crash pdb on main now. So do we expect a f_lineno == 0? I mean in theory, from the end-user view, it's reasonable to have the f_lineno set to the line in the except* block.

I think that's due to a bug in the compiler's code emitted for except*. Let's check once I've fixed that.

@gaogaotiantian
Copy link
Member

I think that's due to a bug in the compiler's code emitted for except*. Let's check once I've fixed that.

But this whole thing is due to that bug right? Your fix in 366b949 basically prevent pdb (well bdb) from crashing when lineno is None - and that's the bug we are talking about now? If the bug is fixed, we don't need this protection anymore?

I'm all for the new unittest and this should always be added for the except* syntax. I just think it should be more thorough - to make sure other commands (for example li) work as well when the user enters debug mode. For now it seems like a half-done unittest with a half-fixed debugger to a known bug that we are going to fix.

I can definitely make a PR for the unittest above right now, but should we consider either:

  • Have a temporary patch in all pdb code for f_lineno is None and test it, or
  • Wait until the bug in the compiler is fixed and test more thorouly with the current patch removed?

@iritkatriel
Copy link
Member

But this whole thing is due to that bug right? Your fix in 366b949 basically prevent pdb (well bdb) from crashing when lineno is None - and that's the bug we are talking about now? If the bug is fixed, we don't need this protection anymore?

It's not fixed yet, but I have a fix that I want to push. But I don't have a unit test for this bug which doesn't involve pdb, because the bug is due to some virtual instruction not having a line number assigned to them, and I don't know how to make that visible from python without pdb. The breakpoint make pdb look at the "next instruction", and this is what's failing. So your unit test will be the regression test for my fix, and when I push my fix I will make pdb fail when the lineno is None (but with a better error message than what was there before).

I'm all for the new unittest and this should always be added for the except* syntax. I just think it should be more thorough - to make sure other commands (for example li) work as well when the user enters debug mode. For now it seems like a half-done unittest with a half-fixed debugger to a known bug that we are going to fix.

Indeed.

I can definitely make a PR for the unittest above right now, but should we consider either:

  • Have a temporary patch in all pdb code for f_lineno is None and test it, or
  • Wait until the bug in the compiler is fixed and test more thorouly with the current patch removed?

I'd go for the second option. We include a unit test for the bug that the user reported in this issue, we fix this problem, and then add the other tests that we discovered are missing as followup work.

@gaogaotiantian
Copy link
Member

Okay, so I'll check in the regression test above, which is always valid. Then with your fix for the actual problem, you can remove the protection in bdb. Then I can do more tests on pdb as follow up work. I'll create a PR soon.

@gvanrossum
Copy link
Member

Making bdb/pdb more robust when confronted with corner cases is also good though.

@gaogaotiantian
Copy link
Member

Making bdb/pdb more robust when confronted with corner cases is also good though.

If f_lineno is None is a "corner case", then yes we should check it and make pdb/bdb deal with it. I can work on that.

However, if f_lineno is None is "wrong", or an indicator of a bug in Python, then do we still want to check it?

@iritkatriel
Copy link
Member

Making bdb/pdb more robust when confronted with corner cases is also good though.

I'm on the fence on whether 366b949 makes pdb/bdb more robust, or silences a bug.

@gvanrossum
Copy link
Member

However, if f_lineno is None is "wrong", or an indicator of a bug in Python, then do we still want to check it?

How could pdb know it is "wrong"? It can be a real bummer when the debugger itself crashes, so I just want it to keep working as well as it can. Sometimes the debugger is being used to track down bugs in the compiler.

@gaogaotiantian
Copy link
Member

How could pdb know it is "wrong"? It can be a real bummer when the debugger itself crashes, so I just want it to keep working as well as it can. Sometimes the debugger is being used to track down bugs in the compiler.

Sure we can try to make the debugger work even if python is not working properly. I guess we can generate a warning for f_lineno is None but not crash - so the user know something is wrong, but it'll still work on some commands.

@iritkatriel
Copy link
Member

PR is here: #103550

I didn't use the warnings module, just print like in a few other places in this module (I'm assuming there's a good reason for that).

carljm added a commit to carljm/cpython that referenced this issue Apr 17, 2023
* main:
  Remove `expert-*` from `project-updater` GH workflow (python#103579)
  pythongh-103583: Add codecs and maps to _codecs_* module state (python#103540)
  pythongh-48330: address review comments to PR-12271 (python#103209)
  pythongh-103527: Add multibytecodec.h as make dep for _codecs_* (python#103567)
  pythongh-103553: Improve `test_inspect`: add more assertions, remove unused (python#103554)
  pythonGH-103517: Improve tests for `pathlib.Path.walk()` (pythonGH-103518)
  pythongh-102114: Make dis print more concise tracebacks for syntax errors in str inputs (python#102115)
  pythonGH-78079: Fix UNC device path root normalization in pathlib (pythonGH-102003)
  pythongh-101517: Add regression test for a lineno bug in try/except* impacting pdb (python#103547)
  pythongh-103527: Add make deps for _codecs_* and _multibytecodec (python#103528)
  pythongh-103532: Fix reST syntax in NEWS entry (pythonGH-103544)
  pythongh-103532: Add NEWS entry (python#103542)
carljm added a commit to carljm/cpython that referenced this issue Apr 17, 2023
* superopt:
  update generated cases with new comment
  review comments
  Remove `expert-*` from `project-updater` GH workflow (python#103579)
  pythongh-103583: Add codecs and maps to _codecs_* module state (python#103540)
  pythongh-48330: address review comments to PR-12271 (python#103209)
  pythongh-103527: Add multibytecodec.h as make dep for _codecs_* (python#103567)
  pythongh-103553: Improve `test_inspect`: add more assertions, remove unused (python#103554)
  pythonGH-103517: Improve tests for `pathlib.Path.walk()` (pythonGH-103518)
  pythongh-102114: Make dis print more concise tracebacks for syntax errors in str inputs (python#102115)
  pythonGH-78079: Fix UNC device path root normalization in pathlib (pythonGH-102003)
  pythongh-101517: Add regression test for a lineno bug in try/except* impacting pdb (python#103547)
  pythongh-103527: Add make deps for _codecs_* and _multibytecodec (python#103528)
  pythongh-103532: Fix reST syntax in NEWS entry (pythonGH-103544)
  pythongh-103532: Add NEWS entry (python#103542)
carljm added a commit to carljm/cpython that referenced this issue Apr 24, 2023
* main:
  pythongh-101517: fix line number propagation in code generated for except* (python#103550)
  pythongh-103780: Use patch instead of mock in asyncio unix events test (python#103782)
carljm added a commit to carljm/cpython that referenced this issue Apr 24, 2023
* superopt: (82 commits)
  pythongh-101517: fix line number propagation in code generated for except* (python#103550)
  pythongh-103780: Use patch instead of mock in asyncio unix events test (python#103782)
  pythongh-102498 Clean up unused variables and imports in the email module  (python#102482)
  pythongh-99184: Bypass instance attribute access in `repr` of `weakref.ref` (python#99244)
  pythongh-99032: datetime docs: Encoding is no longer relevant (python#93365)
  pythongh-94300: Update datetime.strptime documentation (python#95318)
  pythongh-103776: Remove explicit uses of $(SHELL) from Makefile (pythonGH-103778)
  pythongh-87092: fix a few cases of incorrect error handling in compiler (python#103456)
  pythonGH-103727: Avoid advancing tokenizer too far in f-string mode (pythonGH-103775)
  Revert "Add tests for empty range equality (python#103751)" (python#103770)
  pythongh-94518: Port 23-argument `_posixsubprocess.fork_exec` to Argument Clinic (python#94519)
  pythonGH-65022: Fix description of copyreg.pickle function (python#102656)
  pythongh-103323: Get the "Current" Thread State from a Thread-Local Variable (pythongh-103324)
  pythongh-91687: modernize dataclass example typing (python#103773)
  pythongh-103746: Test `types.UnionType` and `Literal` types together (python#103747)
  pythongh-103765: Fix 'Warning: py:class reference target not found: ModuleSpec' (pythonGH-103769)
  pythongh-87452: Improve the Popen.returncode docs
  Removed unnecessary escaping of asterisks (python#103714)
  pythonGH-102973: Slim down Fedora packages in the dev container (python#103283)
  pythongh-103091: Add PyUnstable_Type_AssignVersionTag (python#103095)
  ...
iritkatriel added a commit to iritkatriel/cpython that referenced this issue Apr 25, 2023
iritkatriel added a commit that referenced this issue Apr 25, 2023
…xcept* (#103550) (#103816)

Manual backport of #103550.



<!-- gh-issue-number: gh-101517 -->
* Issue: gh-101517
<!-- /gh-issue-number -->

---------

Co-authored-by: Łukasz Langa <[email protected]>
@iritkatriel
Copy link
Member

This was fixed (in 3.11 as well). We will follow up on the more general problem in #103795.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3.11 only security fixes 3.12 bugs and security fixes type-bug An unexpected behavior, bug, or error
Projects
None yet
Development

No branches or pull requests

7 participants