-
-
Notifications
You must be signed in to change notification settings - Fork 30.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
asyncio.TaskGroup may silently discard request to run a task #116048
Comments
Related: #115957 |
I don't feel like introducing flags for alternate semantics. When you create a coroutine and immediately throw an exception into it, the coroutine (which hasn't started executing its body yet) will be cancelled without getting a chance to clean up. Those are the semantics of coroutines defined with It's fine to add some documentation for this, but I'm not sure that the various |
I do see your point: if an abstraction has very similar semantics to the layer below, it's often simplest to declare the semantics are exactly the same and be done with it. But I think most of asyncio's users will miss this detail, even with the suggested documentation note. I certainly missed it even though I was specifically looking for race conditions. I'm mainly concerned about task groups because the whole point of them is to protect you from edge cases when there are exceptions. But you can use them to write simple code that looks clearly correct, and could be correct (and should be IMO), but suffers from this issue. Like this (full demo): async def use_connection(conn):
try:
await conn.use() # Might raise exception
finally:
conn.close()
async def process_data(item_iter):
async with asyncio.TaskGroup() as tg:
async for item in item_iter:
conn = create_connection(item)
tg.create_task(use_connection(conn)) Even if you do realise that above code is broken (perhaps by finding out the hard way), there's no really simple fix for it. (Of course you could move connection creation to async def use_connection(conn):
try:
await conn.use() # Might raise exception
finally:
conn.close()
async def use_connection_wrapper(conn, open_connections):
open_connections.remove(conn)
await use_connection(conn)
async def process_data(item_iter):
open_connections = set()
try:
async with asyncio.TaskGroup() as tg:
async for item in item_iter:
conn = create_connection(item)
open_connections.add(conn)
tg.create_task(use_connection_wrapper(conn, open_connections))
finally:
for conn in open_connections:
conn.close() If you look at this through the eyes of an asyncio user that doesn't know every low-level detail, it's hard to see the second snippet as anything other than a hack to work around a bug in asyncio. It's certainly not the natural way to write it. |
Although I've not tested it, I believe this issue also applies to async def handle_connection(reader: asyncio.StreamReader, writer: asyncio.StreamWriter):
try:
await asyncio.sleep(1) # ... use connection ...
finally:
writer.transport.abort()
async def run_server(port):
async with asyncio.TaskGroup() as tg:
def handle_in_taskgroup(r, w):
tg.create_task(handle_connection(r, w))
server = await asyncio.start_server(handle_in_taskgroup, port=port)
tg.create_task(server.serve_forever()) This is not a surprise, as it follows the same pattern as the snippet in my previous comment. It could be worked around in the same way, but again I think that is very non-obvious. It would be convenient, regardless of safety, if you could specify a task group to await asyncio.start_server(handle_connection, port=port, task_group=tg) That could be used as a reason not to fix this issue in general, because this form of |
Let's move the feature request for start_server() to a different issue (and wait until this one has settled). I feel that the best way forward for you, and for others who feel this is a misfeature of coroutine design that asyncio should explicitly work around, is to turn on eager tasks. This is available from 3.12 onward. If eager tasks don't fit your usage pattern, please explain. |
It feels a little wrong to use an optimisation (at least I assumed that's what that was for) that happens to have a semantic side effort as a work around this, but to be fair I can't think of any specific problem with it. The original comment issue in the issue for that, #97696, actually mentioned this specific case, but I think it would be very hard for most asyncio users to figure this out for themselves. My preference would be for a warning in the |
Oh and there's an existing discussion on Discuss of the API change for start_server() so I posted about that there. |
Bug report
Bug description:
As I understand it,
asyncio.TaskGroup
should never silently lose a task. Any time you useTaskGroup.create_task()
, one of these two things happens:TaskGroup
context manager waiting until this happens. Possibly the task is cancelled, maybe due to another task in the group raising an exception, but the coroutine "sees" this (it gets the cancellation exception) so it can always do some handling of it if needed, and the task group still waits for that all to finish.TaskGroup.create_task()
method raises aRuntimeError
because it is not possible to start the task. This happens when the task group is not running (because the context manager hasn't been entered or has already exited), or because it is in the process of shutting down (because one of the tasks in it has finished with an exception).(Disclaimer: My background is as a user of Trio, where these are definitely the only two possibilities. The main difference is that starting a task in an active but cancelled Trio nursery will start a task and cancel it immediately, which allows it to run to its first unshielded
await
, rather than raising an exception fromstart_soon()
. But that's a small design difference. The point is that you are still guaranteed one of the two possibilities.)However, a task can be silently lost if the task group it is in gets shut down before a recently-created task has a chance to get scheduled at all, as in this example:
This snippet seems a bit silly because the task group gets shut down by an exception from the same child task that is spawning a new sibling. But the same situation can happen when an uncaught exception gets thrown by one child at roughly the same time that another child has spawned a sibling. (I came across this issue by launching a task from an inner task group while the outer task group was in the process of shutting down.)
Overall, this follows from this behaviour of asyncio tasks:
This will not run
my_fn()
at all, not even to the firstawait
within it. This is despite the fact that the docs forasyncio.Task.cancel()
say:I looked over old issues to see if this had been reported and found the #98275 which suggested changing the docs to warn about this, but it has since been closed.
Personally, I would say that the behaviour of
create_task
andTask.cancel()
are incorrect, but looking back at that discussion I can see that this is a matter of opinion. However, I think the task group behaviour really does need to be fixed. It's hard to see how this could be reliably done with the current task behaviour, so I think that gives some weight that it really is the undelying issue.Perhaps, as a compromise, there could be a new parameter
asyncio.create_task(..., run_to_first_await=False)
, which can be set toTrue
byTaskGroup
and other users wanting robust cancellation detection?CPython versions tested on:
3.12
Operating systems tested on:
Windows
The text was updated successfully, but these errors were encountered: