Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Using ContextVar to hold connections does not guarantee they are always unique to tasks. #230

Closed
trvrm opened this issue Jul 27, 2020 · 5 comments
Labels
bug Something isn't working

Comments

@trvrm
Copy link

trvrm commented Jul 27, 2020

I think there's a fairly fundamental problem with the way that database connections are managed using ContextVar.

Database.connection() is designed to return a Connection object that is unique to an asyncio Task, but because of the way that context is copied during Task creation, this is not always the case.

Consider the following code:

import asyncio
from databases import Database
database = Database("postgresql://example:password@localhost/example")

async def t1():
    await database.connect()
    c1 = database.connection()

    async def inner():
        c2 = database.connection()
        assert c1 is not c2 # this fails
    await asyncio.gather(inner())


if __name__ == "__main__":
    loop = asyncio.get_event_loop()
    loop.run_until_complete(t1())

This raises an assertion, because when gather creates a new task, it copies the current context into the newly created coroutine.

Tasks support the contextvars module. When a Task is created it copies the current context and later runs its coroutine in the copied context.

This is the fundamental problem behind #134 and possibly several other issues. In general it makes it impossible to safely use this library in any code that uses gather or create_task.

Unfortunately I'm not sure what the best way of fixing this is. The only idea I've had so far is storing the connection as an attribute of asyncio.current_task() rather than in a ContextVar, but that feels a bit hacky.

@vmarkovtsev
Copy link
Contributor

This is (almost) a duplicate of #176 I posted some workarounds there. I am opening new connections in my prod and that works very well; my opinion is that this ContextVar feature is harmful and should be removed. It does not save much performance (I py-spy regularly) and leads to many problems, some of them are performance-critical.

@aminalaee
Copy link
Member

@trvrm This is probably fixed by #328 . Please re-open if it still exists.

@trvrm
Copy link
Author

trvrm commented Aug 27, 2021

@aminalaee unfortunately this does not fix this issue. The sample code above still fails.

The problem is still that although the code comments in core.py say "Connections are stored as task-local state", the way that python contextvars work is that when you create a new task, the context variables from the parent task are copied to the child.

Unfortunately I don't think I have permission to re-open this issue - is that something you can do?

@aminalaee
Copy link
Member

@trvrm Thanks, I'll take a look into it.

@zevisert
Copy link
Contributor

Thanks for the great write up @trvrm - I agree about how connections should not be inherited to child tasks, but rather new connections should be acquired. In #546, I think I have a solution. Each database instance stores a WeakKeyDictionary[asyncio.Task, databases.core.Connection] so that each task can have it's own connection, and we still get a neat bit of cleanup by using weakrefs on the tasks. Hopefully this helps address the concurrent woes that keep coming up across quite a few issues in this repo.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

5 participants