-
Notifications
You must be signed in to change notification settings - Fork 48
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
TargetList load time for non-superusers #598
Comments
This should be for non-superusers. In our testing, a target list of ~60,000 loads in ~5 seconds for a superuser but takes over a minute for other users. |
In case this is relevant, I'm using |
@griffin Can you share some DB info with me? |
11 Users, 17399500 Targets, 1306771 ReducedDatums. We can probably set you up to access our database directly. DM me if that would be helpful. |
From Pivotal Tracker:
|
I'm checking in on this issue after having a conversation with @griffin-h today. It is apparently still an issue. Thanks! |
It is. |
Hi @dsand and @griffin-h I started by loading up a fresh tom with 6 million targets. So far, the target list page still loads fast for both superuser and non-superuser accounts (about 300ms). These targets don't have any associated data or extras, so I'll look at that next. In the meantime, I have some questions:
I suspect that this is a n+1 problem but the reports of the performance varying wildly between normal and super users makes it slightly more interesting. There are extra models to support user->target permissions, so it's definitely worth looking at further. Thanks! |
@Fingel, great to hear from you! Answers to your questions:
but if that were the issue, I don't see how it could be so fast for superusers. I always suspected the problem was just that checking the view permissions for all the targets was slow, and if you were a superuser you could skip all that. But I never had the time or expertise to look into it more deeply. |
We currently sort on the created field of BaseTarget, which causes performance issues when there are a large number of BaseTargets. In a database with 6 million rows, this reduced the time for the SQL count(*) from ~800ms to ~100ms, significantly speeding up page load time. Note that this does NOT solve #598 but it does help a little.
Hi @griffin-h This is certainly an issue with the way the permissions database works. Can you run these queries so we can see how large these tables are?
I've assigned a single group permissions to 184,453 targets. This already slows down the page load time to almost 2 seconds. If your database utilizes multiple groups, and those groups are assigned permissions to many targets, this query will quickly get out of control! You can see that query there taking ~1 second. It's selecting all the permissions models that the user belongs to, then using that set to select all the targets. I'm going to try this to see if it helps: https://django-guardian.readthedocs.io/en/stable/userguide/performance.html But the reality is that using intersection tables for this might not be sustainable. We may have to move to a model where we store what groups have access to a target directly on the target itself. |
So basically three permissions (view, edit, delete) for each of the 60M targets = 180M group object permissions just to make them all public. |
OK, that is definitely not going to work! I'm very glad you mentioned this:
If you were to remove all the Public group permissions, how many targets do you think would require object permissions? I ask because I would think we could implement Public targets in a different way that wouldn't require utilizing row level permissions. It was probably done that way out of convenience. And if the remaining number of groups -> target permissions is relatively small, that could solve the performance problems. We would still run into this problem if other groups were created that needed access to a large % of the targets in the DB (almost public but not quite) I'm thinking Snex here... |
Let me first clarify what I mean by "public," because I think it may generally mean something different in Django. For my purposes, I actually don't care about permissions at all. If a user is logged in, I want them to see everything in the TOM. That's what those 180M permissions do. There's a separate question of whether I want anything to be world-public, i.e., visible to people that are not logged in. I don't want the world to see all my targets for now, although I might eventually want the ability to make certain ones world-public. But as you say, this is not the model all TOMs use, e.g., SNEx. SNEx wants to maintain group permissions on each individual target and data point. In practice this is actually not used anymore (except maybe for some old data), but it's probably important to keep it in the TOM Toolkit. And as soon as you want to make a distinction between permissions for two groups that both have access to the TOM, you end up with at least 3 * (N_targets + N_dataproducts + N_reduceddatums) rows in that permissions table, assuming each item is only visible to one group. |
@Fingel are you thinking something like a "public" tag in the target model? |
@jchate6 yes exactly. We could make it configurable whether that tag means public to authenticated users (what @griffin-h is trying to achieve) or public to the internet (non-authenticated users). That should remove the need for millions of rows in that database. What do you think? |
I have 2 concerns: 2nd, There is the scenario where you want limited permissions to the public group, say I want all authenticate users to be able to see a target and update it, but not delete it. I think we could address the first one by having the tag be a choice field (public = ['private', 'authenticated', 'global']) or something. I'm ok with not allowing for something to be available to unauthenticated users while unavailable to authenticated users. I don't know of a clean way to address the 2nd point on a per-target basis though. (One solution is that we don't, and if a target is public, an authenticated user has full permissions, while we just restrict un-authenticated users from deleting things universally, but that's a conversation with Rachel and tom admins to see if that would work for them.) |
@griffin-h I am exploring a solution here: #1175 It does solve the speed issue, but there's a bit of work to make sure it will work for all cases. I'll keep you updated. |
Thanks @Fingel, let me know if/when I can help with testing on our database. |
Will do, I'll be able to focus on this Wednesday. |
For non-superusers, some pages like the TargetList taks a long time to load, perhaps b/c of permissions checking
This is related to django-guardian checking.
https://django-guardian.readthedocs.io/en/stable/userguide/performance.html
The text was updated successfully, but these errors were encountered: