Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixer: improve performance of generateDiff() #355

Merged
merged 2 commits into from
Feb 24, 2024

Conversation

jrfnl
Copy link
Member

@jrfnl jrfnl commented Feb 20, 2024

Description

Fixer::generateDiff(): add tests

These tests test the functionality and document the behaviour of the Fixer::generateDiff() method, with a particular focus on (trailing) whitespace handling in the diff as this was noted in the method itself as a potential point of failure.

Related to #146

Fixer: improve performance of generateDiff()

Similar to #351 and #354, this is a change intended to make the test suite faster, in particular when running on Windows, as a quality of life improvement for contributing developers.

However, this change has the added benefit that the performance improvement will also be noticeable for users running PHPCS with the diff report, which also uses the Fixer::generateDiff() method.

On its own, this change makes running the test suite ~50% faster on Windows.
In combination with the other two changes, the difference is smaller, but still ~10%.

👉🏻 Request for testing:

Based on the test runs in CI, this change should make no significant different on *nix systems, but I'd love to have some people test it and post their with and without timing results.

Suggested changelog entry

Performance improvement for the "Diff" report. Should be most notable for Windows users.

@jrfnl jrfnl added this to the 3.9.x Next milestone Feb 20, 2024
@jrfnl jrfnl changed the title Feature/fixer generate diff performance improvement Fixer: improve performance of generateDiff() Feb 20, 2024
@jrfnl jrfnl force-pushed the feature/fixer-generate-diff-performance-improvement branch from 25d3c34 to de1c3f2 Compare February 21, 2024 09:35
@jrfnl
Copy link
Member Author

jrfnl commented Feb 21, 2024

I've updated the first commit with four extra tests (diff of blank lines/whitespace at start/end of file)

@jrfnl jrfnl force-pushed the feature/fixer-generate-diff-performance-improvement branch from de1c3f2 to 905e157 Compare February 21, 2024 09:38
@rodrigoprimo
Copy link
Contributor

Based on the test runs in CI, this change should make no significant different on *nix systems, but I'd love to have some people test it and post their with and without timing results.

I tested this on my Ubuntu machine using the command vendor/bin/phpunit tests/AllTests.php --no-coverage, and I saw a small increase in the execution time of the tests. I did only a few runs on the main branch and on this branch. On the main branch, it takes around 26.5 seconds for the command to finish. On this branch, it takes around 27.5 seconds. I would be happy to check this further if you would like me to.

On a quick check, I believe the same happened in the CI runs. On the last few runs that I checked on the main branch, it took around 13 seconds to run the tests (https://github.com/PHPCSStandards/PHP_CodeSniffer/actions/runs/7977678082/job/21781108952#step:9:56 and https://github.com/PHPCSStandards/PHP_CodeSniffer/actions/runs/7975902826/job/21775216017#step:9:56). On this branch, it took around 15 seconds (https://github.com/PHPCSStandards/PHP_CodeSniffer/actions/runs/7986938143/job/21808386831?pr=355#step:9:56).

Since the sample size is very small, it is very possible that this variation in the execution time is not directly related to the changes in this branch.

@jrfnl
Copy link
Member Author

jrfnl commented Feb 21, 2024

Thanks for testing @rodrigoprimo ! Could it be the new tests I added (which specifically trigger this "slow" code) which cause the extra run time ? Could be checked by checking out the branch and (temporarily) reverting the changes from the first commit.

@rodrigoprimo
Copy link
Contributor

@jrfnl, now that I ran the tests a few more times, I'm starting to think that the first data that I got is not very reliable. I'm seeing a significant variation in execution times between runs. A few times, the execution time using this branch is faster than using the main branch and sometimes it is not.

I tried running all the tests without the first commit and didn't notice much of a difference (but again, the execution time varies a lot anyway). Then I used --log-junit test-results.xml and parsed the results to get the execution time for each of the new tests, and they don't seem particularly slow:

PHP_CodeSniffer\Tests\Core\Fixer\GenerateDiffTest.testGenerateDiffNoFile: 0.011799 seconds
PHP_CodeSniffer\Tests\Core\Fixer\GenerateDiffTest.testGenerateDiff with data set "whitespace diff at end of file": 0.002502 seconds
PHP_CodeSniffer\Tests\Core\Fixer\GenerateDiffTest.testGenerateDiff with data set "no difference": 0.002197 seconds
PHP_CodeSniffer\Tests\Core\Fixer\GenerateDiffTest.testGenerateDiffColoured: 0.002005 seconds
PHP_CodeSniffer\Tests\Core\Fixer\GenerateDiffTest.testGenerateDiffDifferentLineEndings: 0.001895 seconds
PHP_CodeSniffer\Tests\Core\Fixer\GenerateDiffTest.testGenerateDiff with data set "line removed": 0.001693 seconds
PHP_CodeSniffer\Tests\Core\Fixer\GenerateDiffTest.testGenerateDiff with data set "line added": 0.001689 seconds
PHP_CodeSniffer\Tests\Core\Fixer\GenerateDiffTest.testGenerateDiff with data set "blank lines at end of file": 0.001648 seconds
PHP_CodeSniffer\Tests\Core\Fixer\GenerateDiffTest.testGenerateDiff with data set "whitespace diff at start of file": 0.001544 seconds
PHP_CodeSniffer\Tests\Core\Fixer\GenerateDiffTest.testGenerateDiff with data set "blank lines at start of file": 0.001512 seconds
PHP_CodeSniffer\Tests\Core\Fixer\GenerateDiffTest.testGenerateDiff with data set "trailing whitespace removed": 0.001454 seconds
PHP_CodeSniffer\Tests\Core\Fixer\GenerateDiffTest.testGenerateDiff with data set "var name changed": 0.001442 seconds
PHP_CodeSniffer\Tests\Core\Fixer\GenerateDiffTest.testGenerateDiff with data set "tab replaced with spaces": 0.001417 seconds

For comparison, here are the slowest tests for this run:

PHP_CodeSniffer\Standards\Generic\Tests\WhiteSpace\ScopeIndentUnitTest.testSniff: 0.59299 seconds
PHP_CodeSniffer\Standards\Squiz\Tests\Arrays\ArrayDeclarationUnitTest.testSniff: 0.340927 seconds
PHP_CodeSniffer\Standards\Squiz\Tests\ControlStructures\ForLoopDeclarationUnitTest.testSniff: 0.224787 seconds
PHP_CodeSniffer\Standards\Squiz\Tests\Commenting\FunctionCommentUnitTest.testSniff: 0.186899 seconds
PHP_CodeSniffer\Standards\Squiz\Tests\WhiteSpace\OperatorSpacingUnitTest.testSniff: 0.15944 seconds

@jrfnl
Copy link
Member Author

jrfnl commented Feb 21, 2024

@rodrigoprimo Thanks for doing that. The main thing I wanted to verify is that this doesn't cause a performance regression on non-Windows systems and if the timings are around the same time each time, with small differences, I don't think it does, as that's the normal behaviour.

@jrfnl
Copy link
Member Author

jrfnl commented Feb 21, 2024

Just to give you some idea of the difference this makes on Windows (on a monster of a desktop) - when running just the GenerateDiff tests, I consistently see a difference of 2 seconds in the test run (~3 seconds without the patch), ~1 second with the patch):

Timing without the patch (first commit only):

testGenerateDiffNoFile: 0.240376 seconds
testGenerateDiff with data set "no difference": 0.240557 seconds
testGenerateDiff with data set "line removed": 0.226245 seconds
testGenerateDiff with data set "line added": 0.270723 seconds
testGenerateDiff with data set "var name changed": 0.226196 seconds
testGenerateDiff with data set "trailing whitespace removed": 0.226120 seconds
testGenerateDiff with data set "tab replaced with spaces": 0.226315 seconds
testGenerateDiff with data set "blank lines at start of file": 0.315879 seconds
testGenerateDiff with data set "whitespace diff at start of file": 0.226627 seconds
testGenerateDiff with data set "blank lines at end of file": 0.225380 seconds
testGenerateDiff with data set "whitespace diff at end of file": 0.240838 seconds
testGenerateDiffColoured: 0.226026 seconds
testGenerateDiffDifferentLineEndings: 0.226189 seconds

Timing with the patch:

testGenerateDiffNoFile: 0.077598 seconds
testGenerateDiff with data set "no difference": 0.075645 seconds
testGenerateDiff with data set "line removed": 0.076428 seconds
testGenerateDiff with data set "line added": 0.076219 seconds
testGenerateDiff with data set "var name changed": 0.074596 seconds
testGenerateDiff with data set "trailing whitespace removed": 0.075834 seconds
testGenerateDiff with data set "tab replaced with spaces": 0.075706 seconds
testGenerateDiff with data set "blank lines at start of file": 0.074717 seconds
testGenerateDiff with data set "whitespace diff at start of file": 0.075771 seconds
testGenerateDiff with data set "blank lines at end of file": 0.074711 seconds
testGenerateDiff with data set "whitespace diff at end of file": 0.075682 seconds
testGenerateDiffColoured: 0.075340 seconds
testGenerateDiffDifferentLineEndings: 0.075768 seconds

@rodrigoprimo
Copy link
Contributor

I agree that there don't seem to be any performance regressions introduced by this PR.

Thanks for the additional details regarding the difference this PR makes in the execution time when running on Windows. I got curious and decided to check the execution times of the new tests without the patch on Linux, and there is an improvement as well.

Without the patch (just the first commit):

PHP_CodeSniffer\Tests\Core\Fixer\GenerateDiffTest.testGenerateDiffNoFile: 0.011705 seconds
PHP_CodeSniffer\Tests\Core\Fixer\GenerateDiffTest.testGenerateDiff with data set "tab replaced with spaces": 0.002201 seconds
PHP_CodeSniffer\Tests\Core\Fixer\GenerateDiffTest.testGenerateDiff with data set "blank lines at start of file": 0.002173 seconds
PHP_CodeSniffer\Tests\Core\Fixer\GenerateDiffTest.testGenerateDiff with data set "no difference": 0.00211 seconds
PHP_CodeSniffer\Tests\Core\Fixer\GenerateDiffTest.testGenerateDiff with data set "trailing whitespace removed": 0.001728 seconds
PHP_CodeSniffer\Tests\Core\Fixer\GenerateDiffTest.testGenerateDiff with data set "line added": 0.001649 seconds
PHP_CodeSniffer\Tests\Core\Fixer\GenerateDiffTest.testGenerateDiff with data set "line removed": 0.001616 seconds
PHP_CodeSniffer\Tests\Core\Fixer\GenerateDiffTest.testGenerateDiff with data set "blank lines at end of file": 0.001616 seconds
PHP_CodeSniffer\Tests\Core\Fixer\GenerateDiffTest.testGenerateDiff with data set "whitespace diff at start of file": 0.001575 seconds
PHP_CodeSniffer\Tests\Core\Fixer\GenerateDiffTest.testGenerateDiffDifferentLineEndings: 0.001547 seconds
PHP_CodeSniffer\Tests\Core\Fixer\GenerateDiffTest.testGenerateDiffColoured: 0.001528 seconds
PHP_CodeSniffer\Tests\Core\Fixer\GenerateDiffTest.testGenerateDiff with data set "whitespace diff at end of file": 0.001439 seconds
PHP_CodeSniffer\Tests\Core\Fixer\GenerateDiffTest.testGenerateDiff with data set "var name changed": 0.001392 seconds

With the patch:

PHP_CodeSniffer\Tests\Core\Fixer\GenerateDiffTest.testGenerateDiffNoFile: 0.002191 seconds
PHP_CodeSniffer\Tests\Core\Fixer\GenerateDiffTest.testGenerateDiff with data set "no difference": 0.001942 seconds
PHP_CodeSniffer\Tests\Core\Fixer\GenerateDiffTest.testGenerateDiff with data set "line removed": 0.001874 seconds
PHP_CodeSniffer\Tests\Core\Fixer\GenerateDiffTest.testGenerateDiff with data set "line added": 0.001625 seconds
PHP_CodeSniffer\Tests\Core\Fixer\GenerateDiffTest.testGenerateDiffColoured: 0.001599 seconds
PHP_CodeSniffer\Tests\Core\Fixer\GenerateDiffTest.testGenerateDiff with data set "blank lines at end of file": 0.001595 seconds
PHP_CodeSniffer\Tests\Core\Fixer\GenerateDiffTest.testGenerateDiff with data set "trailing whitespace removed": 0.001578 seconds
PHP_CodeSniffer\Tests\Core\Fixer\GenerateDiffTest.testGenerateDiff with data set "var name changed": 0.001555 seconds
PHP_CodeSniffer\Tests\Core\Fixer\GenerateDiffTest.testGenerateDiff with data set "whitespace diff at end of file": 0.001554 seconds
PHP_CodeSniffer\Tests\Core\Fixer\GenerateDiffTest.testGenerateDiff with data set "whitespace diff at start of file": 0.001546 seconds
PHP_CodeSniffer\Tests\Core\Fixer\GenerateDiffTest.testGenerateDiffDifferentLineEndings: 0.001503 seconds
PHP_CodeSniffer\Tests\Core\Fixer\GenerateDiffTest.testGenerateDiff with data set "tab replaced with spaces": 0.001345 seconds
PHP_CodeSniffer\Tests\Core\Fixer\GenerateDiffTest.testGenerateDiff with data set "blank lines at start of file": 0.001342 seconds

@jrfnl jrfnl force-pushed the feature/fixer-generate-diff-performance-improvement branch from 905e157 to 2fdbcc4 Compare February 22, 2024 01:55
@jrfnl
Copy link
Member Author

jrfnl commented Feb 22, 2024

Made one very tiny tweak still to the code to account better for the different return types of shell_exec() vs stream_get_contents():

-        if ($diff === null) {
+        if ($diff === false || $diff === '') {

These tests test the functionality and document the behaviour of the `Fixer::generateDiff()` method, with a particular focus on (trailing) whitespace handling in the diff as this was noted in the method itself as a potential point of failure.

Related to 146
Similar to 351 and 354, this is a change intended to make the test suite faster, in particular when running on Windows, as a quality of life improvement for contributing developers.

However, this change has the added benefit that the performance improvement will also be noticeable for users running PHPCS with the `diff` report, which also uses the `Fixer::generateDiff()` method.

On its own, this change makes running the test suite ~50% faster on Windows.
In combination with the other two changes, the difference is smaller, but still ~10%.
@jrfnl jrfnl force-pushed the feature/fixer-generate-diff-performance-improvement branch from 2fdbcc4 to 851bda2 Compare February 24, 2024 09:54
@jrfnl
Copy link
Member Author

jrfnl commented Feb 24, 2024

Rebased without further changes. Merging once the build passes.

@jrfnl jrfnl merged commit 45a76c3 into master Feb 24, 2024
44 checks passed
@jrfnl jrfnl deleted the feature/fixer-generate-diff-performance-improvement branch February 24, 2024 10:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants