Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

asyncio.TaskGroup corrupts uncancel() stack when the parent task replaces the cancelled error #95289

Closed
graingert opened this issue Jul 26, 2022 · 20 comments
Assignees
Labels
3.11 only security fixes 3.12 bugs and security fixes release-blocker topic-asyncio type-bug An unexpected behavior, bug, or error

Comments

@graingert
Copy link
Contributor

graingert commented Jul 26, 2022

Bug report

import asyncio

class MyException(Exception):
    pass

async def async_fn():
    await asyncio.sleep(0)
    raise MyException

async def main():
    task = asyncio.current_task()

    try:
        async with asyncio.TaskGroup() as tg:
            tg.create_task(async_fn())
            try:
                await asyncio.sleep(1)
            except asyncio.CancelledError:
                pass
    except* MyException:
        print("done!")

    print(f"{task.cancelling()=} should be 0")

asyncio.run(main())
@graingert graingert added the type-bug An unexpected behavior, bug, or error label Jul 26, 2022
@graingert
Copy link
Contributor Author

this is also quite easy to trigger in typical task group code:

import asyncio
import contextlib

class ConnectionClosedError(Exception):
    pass


async def poll_database(database):
    raise ConnectionClosedError


async def close_connection():
    raise ConnectionClosedError


@contextlib.asynccontextmanager
async def database():
    try:
        yield
    finally:
        await close_connection()


async def main():
    task = asyncio.current_task()

    try:
        async with asyncio.TaskGroup() as tg:
            async with database() as db:
                tg.create_task(poll_database(db))
                await asyncio.sleep(1)
    except* ConnectionClosedError:
        print("done!")

    print(f"{task.cancelling()=} should be 0")


asyncio.run(main())

@graingert
Copy link
Contributor Author

cc @ambv who helped me find this

@graingert graingert added 3.11 only security fixes 3.12 bugs and security fixes labels Jul 26, 2022
@graingert
Copy link
Contributor Author

Tagging @pablogsal as potential release-blocker

@YvesDup
Copy link
Contributor

YvesDup commented Jul 27, 2022

Hello, sorry for my question but I have never seen a * after an except statement. I couldn't find an explanation on the web. What does it mean, please?

In your first example, if you replace pass with raise, you have the expected result.
May be this will help you ?

@graingert
Copy link
Contributor Author

graingert commented Jul 27, 2022

Hello, sorry for my question but I have never seen a * after an except statement. I couldn't find an explanation on the web. What does it mean, please?

Hi there - it's the new 3.11 ExceptionGroup feature https://peps.python.org/pep-0654/ it's required to handle an issue where multiple errors can happen concurrently. Eg a connection error canceles a group of concurrent operations, and the cancellation error leads to some teardown that also fails. In 3.10 library authors had to either choose an error to raise or wrap them in a list and raise a generic "MultiError"

@YvesDup
Copy link
Contributor

YvesDup commented Jul 29, 2022

Hi @graingert, thank you for your response about except * feature.

About the issue, I read the documentation about 'Task groups' and IMHO there is a part about the two cases, in the second paragraph below the exemple:

... At this point, if the body of the async with statement is still active (__aexit__() hasn’t been called yet), the task directly containing the async with statement is also cancelled.

Do you think that your both examples match the situation described ?

@ambv
Copy link
Contributor

ambv commented Jul 29, 2022

@YvesDup, you stopped your quote just before the sentence that describes why what Thomas reported is a bug:

The resulting asyncio.CancelledError will interrupt an await, but it will not bubble out of the containing async with statement.

In other words, after leaving async with we want print(f"{task.cancelling()=} should be 0").

@kumaraditya303
Copy link
Contributor

kumaraditya303 commented Jul 29, 2022

Have you tried this on the edgedb's task group 1 implementation from which the current one is derived from?

Footnotes

  1. https://github.com/edgedb/edgedb/blob/master/edb/common/taskgroup.py

@1st1
Copy link
Member

1st1 commented Jul 29, 2022

@graingert I see that you're suppressing CancelledError in your OP. Why? Why is this a release blocker?

@1st1
Copy link
Member

1st1 commented Jul 29, 2022

@ambv @pablogsal can this wait until Monday? I can debug this just not right now.

@pablogsal
Copy link
Member

pablogsal commented Jul 29, 2022

@ambv @pablogsal can this wait until Monday? I can debug this just not right now.

Yep, the release is delayed until next Friday so there is plenty of time

@YvesDup
Copy link
Contributor

YvesDup commented Jul 29, 2022

@YvesDup, you stopped your quote just before the sentence that describes why what Thomas reported is a bug:

The resulting asyncio.CancelledError will interrupt an await, but it will not bubble out of the containing async with statement.

In other words, after leaving async with we want print(f"{task.cancelling()=} should be 0").

@ambv I trust you but this part of the documentation is difficult to understand. Il would be nice to have a more explict description or/and an example.

@gvanrossum
Copy link
Member

I can report that with edb's TaskGroup this also fails (you have to change the except to except Exception).

@gvanrossum
Copy link
Member

The key issue seems to be that we're entering __aexit__() with self._parent_cancel_requested set but an exception that's not CancelledError. So we never take the path that calls uncancel(). I'm not at all sure how to fix this. Always calling uncancel() if that flag is set would seem too easy...

@gvanrossum
Copy link
Member

Here's a remarkably simple patch that seems to fix the issue. I'm not sure of it yet.

diff --git a/Lib/asyncio/taskgroups.py b/Lib/asyncio/taskgroups.py
index 3ca65062ef..7b8886a205 100644
--- a/Lib/asyncio/taskgroups.py
+++ b/Lib/asyncio/taskgroups.py
@@ -62,12 +62,10 @@ async def __aexit__(self, et, exc, tb):
             self._base_error = exc
 
         if et is not None:
-            if et is exceptions.CancelledError:
-                if self._parent_cancel_requested and not self._parent_task.uncancel():
-                    # Do nothing, i.e. swallow the error.
-                    pass
-                else:
-                    propagate_cancellation_error = exc
+            if (self._parent_cancel_requested and
+                    self._parent_task.uncancel() != 0 and
+                    et is exceptions.CancelledError):
+                propagate_cancellation_error = exc
 
             if not self._aborting:
                 # Our parent task is being cancelled:

@gvanrossum
Copy link
Member

gvanrossum commented Jul 30, 2022

Alas, that doesn't fix the original bug report. But here's a variant that does:

diff --git a/Lib/asyncio/taskgroups.py b/Lib/asyncio/taskgroups.py
index 3ca65062ef..0dde87f1ca 100644
--- a/Lib/asyncio/taskgroups.py
+++ b/Lib/asyncio/taskgroups.py
@@ -61,14 +61,12 @@ async def __aexit__(self, et, exc, tb):
                 self._base_error is None):
             self._base_error = exc
 
-        if et is not None:
-            if et is exceptions.CancelledError:
-                if self._parent_cancel_requested and not self._parent_task.uncancel():
-                    # Do nothing, i.e. swallow the error.
-                    pass
-                else:
-                    propagate_cancellation_error = exc
+        if (self._parent_cancel_requested and
+                self._parent_task.uncancel() != 0 and
+                et is exceptions.CancelledError):
+            propagate_cancellation_error = exc
 
+        if et is not None:
             if not self._aborting:
                 # Our parent task is being cancelled:
                 #

I have other things to do, maybe @graingert and/or @kumaraditya303 can check my fix and turn it into a PR with the two examples as new tests?

@gvanrossum
Copy link
Member

Hm, there’s one case where the effect is different from the original: if a CancellationError occurs while _parent_cancel_request is not set. Then the original sets propagate_cancellation_error = exc but the new code does nothing. Does that matter? No tests fail.

@gvanrossum
Copy link
Member

That would affect the case where the parent task is cancelled externally — it should propagate the cancellation but with my code it will exit cleanly. This can be fixed and requires a separate test.

@gvanrossum
Copy link
Member

Oh, I think I understand why that doesn't matter. In this case we end up returning None from __aexit__() and that causes the interpreter to re-raise the original exception that was being handled. (In fact we may use this fact to simplify the code even further?)

kumaraditya303 pushed a commit to kumaraditya303/cpython that referenced this issue Aug 3, 2022
@kumaraditya303 kumaraditya303 self-assigned this Aug 4, 2022
miss-islington pushed a commit to miss-islington/cpython that referenced this issue Aug 4, 2022
…quested (pythonGH-95602)

Co-authored-by: Guido van Rossum <[email protected]>
(cherry picked from commit 2fef275)

Co-authored-by: Kumar Aditya <[email protected]>
miss-islington added a commit that referenced this issue Aug 4, 2022
GH-95602)

Co-authored-by: Guido van Rossum <[email protected]>
(cherry picked from commit 2fef275)

Co-authored-by: Kumar Aditya <[email protected]>
@kumaraditya303
Copy link
Contributor

Fixed by #95602

Repository owner moved this from Todo to Done in Release and Deferred blockers 🚫 Aug 4, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3.11 only security fixes 3.12 bugs and security fixes release-blocker topic-asyncio type-bug An unexpected behavior, bug, or error
Projects
Development

No branches or pull requests

8 participants