-
Notifications
You must be signed in to change notification settings - Fork 152
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Compress each file in a ThreadPool #484
Conversation
A previous version of this patch used |
@evansd @adamchainz Apologies for pinging you. May I get some feedback on this approach? |
It would be great to have this merged. I've patched in a Docker build, where it was made the build much faster, but it would be best to have it built in instead. |
Some bench for my project. Tested on i9-11900 (8 cores, 2 threads per core). Without ThreadPool:
With ThreadPool:
In CI it goes from 213s to 38s 🚀 |
This comment was marked as spam.
This comment was marked as spam.
Thank you very much @rik . I eventually got round to checking this. I did some refactoring, particularly to avoid adding extra methods that are only needed for this compression step. Using inner functions avoids bloating the classes :) It would be nice to avoid the duplication between the different sites, but I don't have the energy to figure that out right now. I checked that current tests would fail if any of the compression didn't work. I don't think there's an easy way to check the executor is used or works correctly, but at least the basics seem to pass. Merging and releasing now. |
32e2611
to
511b57e
Compare
@adamchainz A lot of cleanup for this PR occurred on EDIT: Looks like similar solutions were taken in your clean up, so can be skipped. |
Bumps [whitenoise](https://github.com/evansd/whitenoise) from 6.7.0 to 6.8.1. <details> <summary>Changelog</summary> <p><em>Sourced from <a href="https://github.com/evansd/whitenoise/blob/main/docs/changelog.rst">whitenoise's changelog</a>.</em></p> <blockquote> <h2>6.8.1 (2024-10-28)</h2> <ul> <li> <p>Raise any errors from threads in the <code>whitenoise.compress</code> command.</p> <p>Regression in 6.8.0. Thanks to Tom Grainger for the spotting this with a <code>comment on PR [#484](evansd/whitenoise#484) <https://github.com/evansd/whitenoise/pull/484#discussion_r1818989096></code>__.</p> </li> </ul> <h2>6.8.0 (2024-10-28)</h2> <ul> <li> <p>Drop Django 3.2 to 4.1 support.</p> </li> <li> <p>Drop Python 3.8 support.</p> </li> <li> <p>Support Python 3.13.</p> </li> <li> <p>Fix a bug introduced in version 6.0.0 where <code>Range</code> requests could lead to database connection errors in other requests.</p> <p>Thanks to Per Myren for the detailed investigation and fix in <code>PR [#612](evansd/whitenoise#612) <https://github.com/evansd/whitenoise/pull/612></code>__.</p> </li> <li> <p>Use Django’s |FORCE_SCRIPT_NAME|__ setting correctly. This reverts a change from version 5.3.0 that added a call to Django’s |get_script_prefix() method|__ outside of the request-response cycle.</p> <p>.. |FORCE_SCRIPT_NAME| replace:: <code>FORCE_SCRIPT_NAME</code> __ <a href="https://docs.djangoproject.com/en/stable/ref/settings/#std:setting-FORCE_SCRIPT_NAME">https://docs.djangoproject.com/en/stable/ref/settings/#std:setting-FORCE_SCRIPT_NAME</a></p> <p>.. |get_script_prefix() method| replace:: <code>get_script_prefix()</code> method __ <a href="https://docs.djangoproject.com/en/stable/ref/urlresolvers/#django.urls.get_script_prefix">https://docs.djangoproject.com/en/stable/ref/urlresolvers/#django.urls.get_script_prefix</a></p> <p>Thanks to Sarah Boyce in <code>PR [#486](evansd/whitenoise#486) <https://github.com/evansd/whitenoise/pull/486></code>__.</p> </li> <li> <p>Compress files using a thread pool. This speeds up the compression step up to four times in benchmarks.</p> <p>Thanks to Anthony Ricaud in <code>PR [#484](evansd/whitenoise#484) <https://github.com/evansd/whitenoise/pull/484></code>__.</p> </li> </ul> </blockquote> </details> <details> <summary>Commits</summary> <ul> <li><a href="https://github.com/evansd/whitenoise/commit/bfc5dae69aad3abac5ff763421a4b4e2ae8ab378"><code>bfc5dae</code></a> Version 6.8.1</li> <li><a href="https://github.com/evansd/whitenoise/commit/6bbec0fe2be3814a38f3c7585414174053b414ea"><code>6bbec0f</code></a> Raise errors from threads in whitenoise.compress (<a href="https://redirect.github.com/evansd/whitenoise/issues/615">#615</a>)</li> <li><a href="https://github.com/evansd/whitenoise/commit/0b054e5b9706c2c02b561e3f9ab78e8700c09b69"><code>0b054e5</code></a> Version 6.8.0</li> <li><a href="https://github.com/evansd/whitenoise/commit/54c464a48fec7e0a153f5243d71a25be5be98e12"><code>54c464a</code></a> Upgrade and clarify Django quickstart docs (<a href="https://redirect.github.com/evansd/whitenoise/issues/548">#548</a>)</li> <li><a href="https://github.com/evansd/whitenoise/commit/d5caf8daed1a488999d92d33d3b91778003d1e6a"><code>d5caf8d</code></a> Compress each file in a ThreadPool (<a href="https://redirect.github.com/evansd/whitenoise/issues/484">#484</a>)</li> <li><a href="https://github.com/evansd/whitenoise/commit/9494ff362264b03fb5882dcd059766eaea896ae5"><code>9494ff3</code></a> Use settings.FORCE_SCRIPT_NAME correctly (<a href="https://redirect.github.com/evansd/whitenoise/issues/486">#486</a>)</li> <li><a href="https://github.com/evansd/whitenoise/commit/c42e93cd5805b0f057eefa72d99e4e10e0ae4f12"><code>c42e93c</code></a> Make sure SlicedFile is closed properly (<a href="https://redirect.github.com/evansd/whitenoise/issues/612">#612</a>)</li> <li><a href="https://github.com/evansd/whitenoise/commit/f8dff5026750df7e793a2bc7bd3aaf18f3065a85"><code>f8dff50</code></a> Drop Django 3.2 to 4.1 support (<a href="https://redirect.github.com/evansd/whitenoise/issues/614">#614</a>)</li> <li><a href="https://github.com/evansd/whitenoise/commit/6450820afc42a59e15924d5fc6efcbaa0ab626f6"><code>6450820</code></a> Drop Python 3.8 support (<a href="https://redirect.github.com/evansd/whitenoise/issues/613">#613</a>)</li> <li><a href="https://github.com/evansd/whitenoise/commit/946a95eadb27edd475d64bda93c69cf70a887025"><code>946a95e</code></a> [pre-commit.ci] pre-commit autoupdate (<a href="https://redirect.github.com/evansd/whitenoise/issues/611">#611</a>)</li> <li>Additional commits viewable in <a href="https://github.com/evansd/whitenoise/compare/6.7.0...6.8.1">compare view</a></li> </ul> </details> <br /> [![Dependabot compatibility score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=whitenoise&package-manager=pip&previous-version=6.7.0&new-version=6.8.1)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores) Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`. [//]: # (dependabot-automerge-start) [//]: # (dependabot-automerge-end) --- <details> <summary>Dependabot commands and options</summary> <br /> You can trigger Dependabot actions by commenting on this PR: - `@dependabot rebase` will rebase this PR - `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it - `@dependabot merge` will merge this PR after your CI passes on it - `@dependabot squash and merge` will squash and merge this PR after your CI passes on it - `@dependabot cancel merge` will cancel a previously requested merge and block automerging - `@dependabot reopen` will reopen this PR if it is closed - `@dependabot close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually - `@dependabot show <dependency name> ignore conditions` will show all of the ignore conditions of the specified dependency - `@dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself) - `@dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself) - `@dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself) </details>
Bumps [whitenoise](https://github.com/evansd/whitenoise) from 6.7.0 to 6.8.1. <details> <summary>Changelog</summary> <p><em>Sourced from <a href="https://github.com/evansd/whitenoise/blob/main/docs/changelog.rst">whitenoise's changelog</a>.</em></p> <blockquote> <h2>6.8.1 (2024-10-28)</h2> <ul> <li> <p>Raise any errors from threads in the <code>whitenoise.compress</code> command.</p> <p>Regression in 6.8.0. Thanks to Tom Grainger for the spotting this with a <code>comment on PR [#484](evansd/whitenoise#484) <https://github.com/evansd/whitenoise/pull/484#discussion_r1818989096></code>__.</p> </li> </ul> <h2>6.8.0 (2024-10-28)</h2> <ul> <li> <p>Drop Django 3.2 to 4.1 support.</p> </li> <li> <p>Drop Python 3.8 support.</p> </li> <li> <p>Support Python 3.13.</p> </li> <li> <p>Fix a bug introduced in version 6.0.0 where <code>Range</code> requests could lead to database connection errors in other requests.</p> <p>Thanks to Per Myren for the detailed investigation and fix in <code>PR [#612](evansd/whitenoise#612) <https://github.com/evansd/whitenoise/pull/612></code>__.</p> </li> <li> <p>Use Django’s |FORCE_SCRIPT_NAME|__ setting correctly. This reverts a change from version 5.3.0 that added a call to Django’s |get_script_prefix() method|__ outside of the request-response cycle.</p> <p>.. |FORCE_SCRIPT_NAME| replace:: <code>FORCE_SCRIPT_NAME</code> __ <a href="https://docs.djangoproject.com/en/stable/ref/settings/#std:setting-FORCE_SCRIPT_NAME">https://docs.djangoproject.com/en/stable/ref/settings/#std:setting-FORCE_SCRIPT_NAME</a></p> <p>.. |get_script_prefix() method| replace:: <code>get_script_prefix()</code> method __ <a href="https://docs.djangoproject.com/en/stable/ref/urlresolvers/#django.urls.get_script_prefix">https://docs.djangoproject.com/en/stable/ref/urlresolvers/#django.urls.get_script_prefix</a></p> <p>Thanks to Sarah Boyce in <code>PR [#486](evansd/whitenoise#486) <https://github.com/evansd/whitenoise/pull/486></code>__.</p> </li> <li> <p>Compress files using a thread pool. This speeds up the compression step up to four times in benchmarks.</p> <p>Thanks to Anthony Ricaud in <code>PR [#484](evansd/whitenoise#484) <https://github.com/evansd/whitenoise/pull/484></code>__.</p> </li> </ul> </blockquote> </details> <details> <summary>Commits</summary> <ul> <li><a href="https://github.com/evansd/whitenoise/commit/bfc5dae69aad3abac5ff763421a4b4e2ae8ab378"><code>bfc5dae</code></a> Version 6.8.1</li> <li><a href="https://github.com/evansd/whitenoise/commit/6bbec0fe2be3814a38f3c7585414174053b414ea"><code>6bbec0f</code></a> Raise errors from threads in whitenoise.compress (<a href="https://redirect.github.com/evansd/whitenoise/issues/615">#615</a>)</li> <li><a href="https://github.com/evansd/whitenoise/commit/0b054e5b9706c2c02b561e3f9ab78e8700c09b69"><code>0b054e5</code></a> Version 6.8.0</li> <li><a href="https://github.com/evansd/whitenoise/commit/54c464a48fec7e0a153f5243d71a25be5be98e12"><code>54c464a</code></a> Upgrade and clarify Django quickstart docs (<a href="https://redirect.github.com/evansd/whitenoise/issues/548">#548</a>)</li> <li><a href="https://github.com/evansd/whitenoise/commit/d5caf8daed1a488999d92d33d3b91778003d1e6a"><code>d5caf8d</code></a> Compress each file in a ThreadPool (<a href="https://redirect.github.com/evansd/whitenoise/issues/484">#484</a>)</li> <li><a href="https://github.com/evansd/whitenoise/commit/9494ff362264b03fb5882dcd059766eaea896ae5"><code>9494ff3</code></a> Use settings.FORCE_SCRIPT_NAME correctly (<a href="https://redirect.github.com/evansd/whitenoise/issues/486">#486</a>)</li> <li><a href="https://github.com/evansd/whitenoise/commit/c42e93cd5805b0f057eefa72d99e4e10e0ae4f12"><code>c42e93c</code></a> Make sure SlicedFile is closed properly (<a href="https://redirect.github.com/evansd/whitenoise/issues/612">#612</a>)</li> <li><a href="https://github.com/evansd/whitenoise/commit/f8dff5026750df7e793a2bc7bd3aaf18f3065a85"><code>f8dff50</code></a> Drop Django 3.2 to 4.1 support (<a href="https://redirect.github.com/evansd/whitenoise/issues/614">#614</a>)</li> <li><a href="https://github.com/evansd/whitenoise/commit/6450820afc42a59e15924d5fc6efcbaa0ab626f6"><code>6450820</code></a> Drop Python 3.8 support (<a href="https://redirect.github.com/evansd/whitenoise/issues/613">#613</a>)</li> <li><a href="https://github.com/evansd/whitenoise/commit/946a95eadb27edd475d64bda93c69cf70a887025"><code>946a95e</code></a> [pre-commit.ci] pre-commit autoupdate (<a href="https://redirect.github.com/evansd/whitenoise/issues/611">#611</a>)</li> <li>Additional commits viewable in <a href="https://github.com/evansd/whitenoise/compare/6.7.0...6.8.1">compare view</a></li> </ul> </details> <br /> [![Dependabot compatibility score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=whitenoise&package-manager=pip&previous-version=6.7.0&new-version=6.8.1)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores) Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`. [//]: # (dependabot-automerge-start) [//]: # (dependabot-automerge-end) --- <details> <summary>Dependabot commands and options</summary> <br /> You can trigger Dependabot actions by commenting on this PR: - `@dependabot rebase` will rebase this PR - `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it - `@dependabot merge` will merge this PR after your CI passes on it - `@dependabot squash and merge` will squash and merge this PR after your CI passes on it - `@dependabot cancel merge` will cancel a previously requested merge and block automerging - `@dependabot reopen` will reopen this PR if it is closed - `@dependabot close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually - `@dependabot show <dependency name> ignore conditions` will show all of the ignore conditions of the specified dependency - `@dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself) - `@dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself) - `@dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself) </details> Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Strange, after yesterday's:
I have it 6x slower on my project: 1211 static files copied to '/app/staticfiles', 4971 post-processed. I've tested also the newest release (6.8.1) and older 6.7.0 (before merge) with same results. No gains. |
@petrprikryl I'm curious if it's related to the way the executor is written. It's written a bit differently on If so then I can PR the fixes upstream. Additionally, what operating system are you executing on? |
I was also wondering if the default of up to 32 threads could lead to detrimental performance. Can you try with 4 or 8 threads on your project? Add |
Good thought - probably makes more sense for it to be |
Just double checked standard lib, the default isn't 32. Default is |
I wonder if the problem is materializing , 4971 futures all in one go and waiting on them all. |
Creating the future itself should be an O(1) operation since it just enqueues it within the executor. Would be a bit strange if doing the equivalent to appending objects to a list had a significant performance impact, but I've been proven wrong before 😅 |
After my testing the number of threads isn't related to this. Looks like problem are generators: This is fast:
This is 6x slower:
Tested on i9-11900 (8 cores, 2 threads per core). |
Yes, sorry, was taking a shortcut. That works out as 32 on most machines (for now!). |
That's >>> import os
>>> min(32, (os.cpu_count() or 1) + 4)
16 |
fix #148
I've tested this on a 2020 M1 MacBook Air (4 big + 4 small cores). In milliseconds.
Commands to test yourself
time python -m whitenoise.compress --quiet --no-brotli staticfiles
time python -m whitenoise.compress --quiet staticfiles