[native-filter][performance] Dashboard load slower with native-filter components #14421

Closed
3 tasks done
graceguo-supercat opened this issue Apr 30, 2021 · 2 comments
Labels
#bug:performance Performance bugs dashboard:native-filters Related to the native filters of the Dashboard

Comments


graceguo-supercat commented Apr 30, 2021

I compared two dashboards with the same set of charts, one using filter_box and the other using native filter components. The dashboard using filter components loads much slower than the filter_box version.

How to reproduce the bug

  1. Let's use the World Bank's Data dashboard as an example.
  2. Disable the native filter component and open the dashboard.
  3. Make a copy of the original dashboard, enable the filter component, remove filter_box, and add 2 filter components to the dashboard. Check the requests in the browser's dev console:

Screenshots

Screen Shot 2021-04-26 at 1 00 42 AM

  1. In the filter_box version, one filter box can have multiple filter fields, and the dashboard sends only 1 HTTP request to get the option lists for all filter fields.
  2. In the filter component version, each filter field is a separate query, and they are all triggered before the dashboard starts to load charts. As the screenshot above shows, the filter components send their data queries first.

Dashboard users may not need to change filter values every time they load a dashboard, and some scoped filters may not apply to the charts currently visible, but users still have to wait for all the filter components to finish before the charts load.

  3. Another important case is time-related filter types. In filter_box, time-range, time-column and time-granularity live in a single filter_box; with filter components they are 3 different filter types. Once added to a dashboard, they generate 3 extra query requests.

This is a simple example, and the performance might not look that bad. During my test with an Airbnb dashboard, which has ~10 filter fields (or even more), the dashboard loaded really slowly and the charts kept spinning for a very long time.

Environment

latest master.

Checklist

Make sure to follow these steps before submitting your issue - thank you!

  • I have checked the superset logs for python stacktraces and included it here as text if there are any.
  • I have reproduced the issue with at least the latest released version of superset.
  • I have checked the issue tracker for the same issue and I haven't found one similar.
Member

villebro commented May 3, 2021

  1. In the filter_box version, one filter box can have multiple filter fields, and the dashboard sends only 1 HTTP request to get the option lists for all filter fields.

While the client only sends one HTTP request to the backend for filter box select filters, the backend breaks it down into multiple chart data queries, each satisfied either from the analytical database or from cache. This means that an uncached filter box data request with multiple selects might take relatively longer to resolve, as it has to complete all queries serially before it can return any results. So the time saved by limiting the number of client-side requests can be offset by major delays caused by serial resolution of the individual queries in the filter box.
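
To make the trade-off concrete, here is a rough latency model rather than a measurement. It compares one request whose queries run serially with separate requests that run concurrently. The function names and the `max_parallel=6` default (a commonly cited per-host connection limit for HTTP/1.1 browsers) are assumptions for illustration, not anything taken from Superset.

```python
import heapq


def serial_latency(query_seconds):
    """One HTTP request whose queries the backend runs one after another."""
    return sum(query_seconds)


def concurrent_latency(query_seconds, max_parallel=6):
    """Separate requests, with at most `max_parallel` in flight at a time.

    `max_parallel` is an assumed client-side limit, not a Superset setting.
    """
    # Each slot holds the time at which that connection becomes free.
    slots = [0.0] * max_parallel
    heapq.heapify(slots)
    finish = 0.0
    for duration in query_seconds:
        start = heapq.heappop(slots)   # earliest free connection
        end = start + duration
        finish = max(finish, end)
        heapq.heappush(slots, end)
    return finish
```

With uncached queries taking 2, 1 and 3 seconds, the serial path takes the sum of the durations, while the concurrent path is bounded by the slowest query as long as enough connections are free.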

If limiting the number of client requests is seen as a major concern, we could introduce functionality that groups together all chart data requests originating from non-legacy plugins into one request (the /api/v1/chart/data endpoint supports requesting data for multiple queries in one request). However, this would likely introduce other bottlenecks, especially in cases where data is not cached.
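
As a sketch of what such grouping could look like, the helper below merges per-filter query objects into one payload per datasource. The `datasource` and `queries` field names follow my understanding of the chart data payload and should be checked against the API schema. The helper name, the input shape, and the per-query `columns` field are hypothetical, and the sketch assumes one payload targets a single datasource.

```python
from collections import defaultdict


def group_queries_by_datasource(requests):
    """Merge individual query requests into one payload per datasource.

    `requests` is a list of dicts shaped like
    {"datasource": {"id": ..., "type": ...}, "query": {...}}.
    This input shape is hypothetical and used only for illustration.
    """
    grouped = defaultdict(list)
    for req in requests:
        ds = req["datasource"]
        grouped[(ds["id"], ds["type"])].append(req["query"])
    # One payload per datasource, each carrying several queries.
    return [
        {"datasource": {"id": ds_id, "type": ds_type}, "queries": queries}
        for (ds_id, ds_type), queries in grouped.items()
    ]
```

Each resulting payload would be sent as a single request, so every query in it shares that request's fate: nothing is returned until the slowest query has finished.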

To address the limitations the browser imposes on concurrent API requests, the Global Async Query (GAQ) framework (#9190), which is nearing completion, should help with some of the bottlenecks caused by multiple chart data requests. It also introduces other benefits, such as query deduplication and not being limited by webserver timeout restrictions for long-running queries, that will likely outweigh the benefits achieved by grouping chart data queries together. Therefore, I'd recommend investing in the completion of GAQ rather than grouping chart data requests together.
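
For readers unfamiliar with query deduplication, the idea can be sketched as follows: identical query payloads map to the same key, so the query runs only once. This is not GAQ's actual implementation. `query_key` and `DedupingRunner` are hypothetical names, and the in-memory dict stands in for whatever shared store a real system would use.

```python
import hashlib
import json


def query_key(payload):
    """Stable key for a query payload, independent of dict key order."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class DedupingRunner:
    """Runs each distinct payload once and reuses the result afterwards."""

    def __init__(self, execute):
        self._execute = execute   # callable that actually runs a query
        self._results = {}        # key -> result (illustrative in-memory store)
        self.executions = 0

    def run(self, payload):
        key = query_key(payload)
        if key not in self._results:
            self.executions += 1
            self._results[key] = self._execute(payload)
        return self._results[key]
```

Two components that ask for the same column on the same datasource would then trigger a single execution, regardless of the order in which their payload keys were built.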

  2. In the filter component version, each filter field is a separate query, and they are all triggered before the dashboard starts to load charts. As the screenshot above shows, the filter components send their data queries first.

We're working on prioritizing chart requests so that they trigger before native filter data requests. See #14443 for the PR that implements this. This should make the actual charts on the dashboard load more quickly than before, as the filter box requests are carried out in parallel with regular chart data requests.
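
The ordering idea can be illustrated with a small priority queue in which chart requests always drain before native filter requests, whatever order they were enqueued in. This is only a sketch of the concept, not the code in #14443. `RequestQueue` and the two priority constants are hypothetical.

```python
import heapq
from itertools import count

# Lower value drains first (illustrative constants).
CHART = 0
NATIVE_FILTER = 1


class RequestQueue:
    """Orders pending requests by kind, then by arrival order."""

    def __init__(self):
        self._heap = []
        self._arrival = count()   # tie-breaker keeps FIFO order within a kind

    def push(self, kind, name):
        heapq.heappush(self._heap, (kind, next(self._arrival), name))

    def drain(self):
        """Return request names in the order they would be dispatched."""
        ordered = []
        while self._heap:
            ordered.append(heapq.heappop(self._heap)[2])
        return ordered
```

Dispatching in this order means charts no longer queue behind filter option queries when the number of concurrent connections is limited.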

@villebro
Member

The performance issues should all now be resolved.
