Issue 629 - Use a generator for Cells::getAllCacheKeys to improve performance #822
Using a generator reduces memory usage and improves performance
when loading large spreadsheets.
This is:
Checklist:
Why this change is needed?
PHPSpreadsheet currently uses a lot of memory when loading large spreadsheets (see #648, #629). All of the coordinates in the spreadsheet are copied into a new array when the method `Cells::getAllCacheKeys` is called. Because each coordinate is concatenated with another string, a new string is also created in memory for every coordinate.
The result of this method is only ever passed to `Psr\SimpleCache\CacheInterface::getMultiple` and `Psr\SimpleCache\CacheInterface::setMultiple`. Since both of these methods accept an `iterable`, we can use a generator instead. Using a generator means we don't need to build the entire array in memory and can instead return one value at a time as needed.
I benchmarked this with a ~100K row spreadsheet and saw a 16% improvement in run time and a 20% improvement in memory usage. You can view a comparison here.
You can also view the individual profiles here:
This is the benchmark script I used:
I also attached the xlsx if you would like to run the benchmark yourself.
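For readers who want to see the array-versus-generator difference in isolation, here is a minimal sketch. It is not the diff from this PR: the class `CellIndexSketch`, its properties, and the helper `consumeKeys` are hypothetical names made up for illustration, and the prefix value is arbitrary. It only assumes that cache keys are built by concatenating a prefix with each coordinate, as described above.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical stand-in for a cell collection. This is NOT the
 * PhpSpreadsheet Cells class, only an illustration of the idea.
 */
final class CellIndexSketch
{
    /** @var array<string, bool> coordinate => true */
    private array $index;

    private string $cachePrefix;

    /** @param string[] $coordinates e.g. ['A1', 'B2'] */
    public function __construct(array $coordinates, string $cachePrefix)
    {
        $this->index = array_fill_keys($coordinates, true);
        $this->cachePrefix = $cachePrefix;
    }

    /**
     * Eager version: every prefixed key is created and held in one
     * array before the caller sees the first key.
     *
     * @return string[]
     */
    public function allCacheKeysAsArray(): array
    {
        $keys = [];
        foreach ($this->index as $coordinate => $unused) {
            $keys[] = $this->cachePrefix . $coordinate;
        }

        return $keys;
    }

    /**
     * Lazy version: each prefixed key is created only when the
     * consumer asks for the next value.
     */
    public function allCacheKeys(): \Generator
    {
        foreach ($this->index as $coordinate => $unused) {
            yield $this->cachePrefix . $coordinate;
        }
    }
}

/**
 * Hypothetical consumer typed against iterable, so it accepts
 * either an array or a generator.
 */
function consumeKeys(iterable $keys): int
{
    $count = 0;
    foreach ($keys as $key) {
        $count++;
    }

    return $count;
}
```

One trade-off to keep in mind: a generator can only be iterated once, so a caller that needs the keys a second time has to call the method again rather than reuse the returned value.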