Speed up the dashboard #38

Open
egrace479 opened this issue Aug 7, 2023 · 3 comments
Labels
enhancement New feature or request

Comments

@egrace479
Member

egrace479 commented Aug 7, 2023

Currently, the dashboard starts to slow down with datasets of about 1.6 MB; the slowdown becomes more pronounced by 3.3 MB (a CSV with about 12,000 rows by 15 columns).

The slowdown is most pronounced in:

  1. Initial processing of the dataset (hence we have a loading indicator).
  2. Selecting different color-by options for the map distribution; the change in the map has a pronounced delay.

The specific source (or sources) of the slowdown still needs to be diagnosed.
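
One way to start narrowing this down is to time each stage separately and compare totals. The sketch below is only an illustration: `timed`, `report`, `parse_csv`, and `build_figure` are hypothetical names, and the two stage functions are stand-ins for the dashboard's real processing and graphing code.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

# Wall-clock durations per labelled stage (hypothetical helper, not dashboard code).
timings = defaultdict(list)


@contextmanager
def timed(label):
    """Record how long the enclosed block takes under `label`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[label].append(time.perf_counter() - start)


def report():
    """Return (label, calls, total_seconds) rows, slowest stage first."""
    rows = [(label, len(values), sum(values)) for label, values in timings.items()]
    return sorted(rows, key=lambda row: row[2], reverse=True)


# Stand-ins for the real stages; replace with the actual processing and graphing calls.
def parse_csv(text):
    return [line.split(",") for line in text.splitlines()]


def build_figure(rows):
    return {"n_points": len(rows)}


csv_text = "\n".join("a,b,c" for _ in range(12_000))

with timed("parse"):
    rows = parse_csv(csv_text)
with timed("figure"):
    fig = build_figure(rows)

for label, calls, total in report():
    print(f"{label:<8} calls={calls} total={total:.4f}s")
```

Wrapping the body of each callback in `with timed("...")` would show whether the time goes to data processing, figure construction, or somewhere outside the Python code (for example, serialization or the network).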

The Dash documentation provides various suggestions under their performance section, and the dash-extensions package also has some options:

  1. Flask caching. Since the processing is only done once, this would only have potential to speed up the graphing components, though initial testing didn't seem to show improvement.
    a. A similar solution is background caching.
    b. Dash-extensions has a package for server-side caching: ServersideOutput Transform. This removes the need for JSON serialization between callbacks and should speed things up, though I'm not sure how effective it was (I tried an implementation on the server-store branch). This would also require a regular clean-up of the cached files (as with the previous two).
  2. Dash Patch class for updating the map (and potentially other graphs as well). Plotly Express is fast, but starts to slow down around 15K points (per the Dash performance docs). I attempted to implement this with the map, but my initial attempt was unsuccessful.
  3. Using orjson. I'm unclear on precisely how this potential solution works; it seems that simply having the package installed allows Dash to serialize JSON with orjson instead. Errors are thrown if you try to use it directly for the serialization, because orjson produces bytes rather than a string (which is also where the speedup comes from).
  4. Clientside Callbacks. To implement this, the graphing portion would have to be translated into JavaScript.
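
On the orjson point, the bytes-versus-string mismatch can be seen in isolation. The sketch below is not how Dash wires orjson in internally (that part is not confirmed here); it only shows that `orjson.dumps` returns `bytes` and must be decoded before being passed to code that expects a `str`. The `to_json_str` helper and the sample payload are hypothetical, and the code falls back to the standard library when orjson is not installed.

```python
import json

try:
    import orjson  # optional third-party dependency
except ImportError:
    orjson = None


def to_json_str(obj):
    """Serialize `obj` to a JSON string, using orjson when it is available."""
    if orjson is not None:
        # orjson.dumps returns bytes, not str, so decode before returning.
        return orjson.dumps(obj).decode("utf-8")
    return json.dumps(obj)


# Hypothetical payload shaped like a few map columns.
payload = {"lat": [40.0, 41.5], "lon": [-83.0, -82.1], "species": ["a", "b"]}

text = to_json_str(payload)
print(type(text).__name__, len(text))
```

Passing the undecoded bytes to something that expects text is a likely source of the errors mentioned above, though that has not been verified against the dashboard code.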
@egrace479 egrace479 added the enhancement New feature or request label Aug 7, 2023
@egrace479
Member Author

Note: It runs slower on the AWS server than when run locally. Could this have to do with nginx settings?

@egrace479
Member Author

Update on the note above: the slower performance on the AWS server turned out to be a result of the AWS hardware, not the nginx settings. This was determined by running the dashboard on a Hugging Face Space, which sped up the processes to match running locally.

@egrace479
Member Author

Image sampling can be quite slow on larger datasets; perhaps switching to polars could offer a speed-up for some of these features? It integrates with plotly-dash and offers many of the same features as pandas, with considerably faster DataFrame manipulations.
