Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix potential deadlock during migrations caused by Celery Beat #75

Merged
merged 3 commits into from
Apr 28, 2023
Merged

Conversation

edg956
Copy link
Contributor

@edg956 edg956 commented Apr 20, 2023

When using a subclass of TenantAwareSchedulerMixin, there's a risk that very frequent (or just unfortunately timed) celery tasks might cause a deadlock when running migrations for a new tenant. This is specially true for bigger projects with a lot of migrations and new tenants (in our setup, migrations can last more than 30mins).

In order to mitigate this, we need a flag in the tenant model that indicates whether or not the tenant is ready for usage (at least for beat), but since django-tenants doesn't provide any, the user of the library must configure their projects to change the filtering behaviour of the scheduler based on their own implementation.

The proposed solution adds a configurable filtering dictionary that's picked up by the scheduler and used to filter the queryset of tenants for which it will run migrations.

Adding new tenants then becomes a matter of creating them, running their migrations (either synchronously or asynchronously) and finally setting the flags to start counting them as ready. I.E:

# project/settings.py
TENANT_READINESS_FILTER = dict(ready=True)

# somewhere else, i.e: a shell
Tenant = get_tenant_model()

# Creates tenant and runs migrations
new_tenant = Tenant.objects.create(name="Foo", schema_name="foo", ready=False)

# Tenant is created and the schema is ready, but celery beat still doesn't run tasks for it
...

# Mark the tenant as ready
tenant.ready = True
tenant.save()

# Celery beat will start scheduling tasks for that tenant. 

@maciej-gol
Copy link
Owner

maciej-gol commented Apr 24, 2023

Sorry for the delay, I will have a deeper look into this later on today. Currently, what I suggest is maybe dropping the filters from the settings and taking a more flexible approach via a method?

So the TenantAwareSchedulerMixin would look like this:

class TenantAwareSchedulerMixin:
    def filter_available_tenants(self, tenants: List[object]) -> List[object]:
        return tenants

    def apply_entry() -> ...:
        ...
        tenants = get_tenant_model().objects.all()

        tenants = self.filter_available_tenants(tenants)
        ...

This would allow people to define their own semantics of "ready". Whether its stored in the model or somewhere else. This would also not be limited to ready tenants only, supporting other usecases.

WDYT?

@edg956
Copy link
Contributor Author

edg956 commented Apr 25, 2023

I think it makes sense. This implementation was a quick attempt to generalise our use case, where ready is defined as a method on the manager that basically applies filters.

I like your approach and I can update the PR based on your comments. I'd maybe suggest to name the whole thing get_queryset(), to be consistent with django. Something like this:

TenantModel = get_tenant_model()

class TenantAwareSchedulerMixin:
    def get_queryset(self) -> Iterable[TenantModel]:
        """Override this method to provide your own definition of `ready` tenants"""
        return TenantModel.objects.all()

    def apply_entry() -> None:
        tenants = self.get_queryset()
        ...

@maciej-gol
Copy link
Owner

I thought of naming it in the likes of get_tenants and pass raw lists around. Using raw lists would make it more decoupled from the actual implementation. But after giving it a second thought, I think we can go with get_queryset and allow people to implement any optimizations directly into the queryset.

@maciej-gol
Copy link
Owner

Looks great, will release this later today! Loving all your contributions - da true MVP! Your attention to the README and documentation is also stellar!

@edg956
Copy link
Contributor Author

edg956 commented Apr 27, 2023

Awesome! Thanks. Glad to contribute to this project 👍

@maciej-gol maciej-gol merged commit 40712c6 into maciej-gol:master Apr 28, 2023
@maciej-gol
Copy link
Owner

Released in 2.2.0.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants