Improve product export performance #1602
Merged
The Bug
I discovered this bug in EE 1.14 while investigating it for a client. The same problem exists in Magento 2.
It boils down to the fact that, as the number of categories and attributes increases, the vanilla Magento product export gets significantly slower, all due to the bottleneck of the PHP `array_intersect` function.

The `array_intersect` function has remained largely unchanged from PHP 5.3 to 5.5, so this problem will likely affect most versions of Magento.

Based on my reading on Stack Overflow and elsewhere, I tried replacing `array_intersect` with a combination of `array_combine` and `array_intersect_key`, and saw that it greatly improved performance. The full product export for the client dropped from ~10 minutes to ~2 minutes.

Replication
My (not very scientific) method of testing and replicating this locally was to install `magento/module-sample-data` at `1.0.0-beta`.

As this sample data was not enough to trigger the `array_intersect` bottleneck, I programmatically added another 2500 categories, giving me a grand total of 2540 categories. See commit 99b944e for the script that generated them.
Profiling
I used xhprof to profile the export method; see commits c6342a8 and cb4a42b.

In the following xhprof screenshots you can easily see that `array_intersect` takes first place as the bottleneck. Once the code change is applied, the biggest bottleneck sensibly goes back to `PDOStatement::execute`.
Using `array_intersect`

Using `array_combine` and `array_intersect_key`
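
For readers who want to see the swap in isolation, here is a minimal sketch of the replacement described under "The Bug". It is not the code from this pull request: the function names `intersectByValue` and `intersectByKey` and the sample category IDs are hypothetical, and it assumes PHP 5.4+ and that the values being intersected are integers or strings (so they can be used as array keys).

```php
<?php
// Hypothetical helpers for illustration only; not taken from the Magento code base.

/**
 * Baseline: intersect two lists by comparing values.
 */
function intersectByValue(array $a, array $b)
{
    // array_intersect preserves the keys of $a, so re-index for comparison.
    return array_values(array_intersect($a, $b));
}

/**
 * Variant: use each value as its own key, then intersect on keys.
 * Assumes values are integers or strings. Duplicate values in $a collapse
 * to one entry, which differs from array_intersect.
 */
function intersectByKey(array $a, array $b)
{
    // Guard empty input: array_combine on empty arrays fails on PHP < 5.4.
    if (empty($a) || empty($b)) {
        return [];
    }
    $aByValue = array_combine($a, $a); // e.g. [3 => 3, 17 => 17, ...]
    $bByValue = array_combine($b, $b);

    return array_values(array_intersect_key($aByValue, $bByValue));
}

// Made-up sample data: category IDs on a product vs. a list of known IDs.
$productCategoryIds = [3, 17, 42, 2540];
$knownCategoryIds   = range(1, 100);

$byValue = intersectByValue($productCategoryIds, $knownCategoryIds);
$byKey   = intersectByKey($productCategoryIds, $knownCategoryIds);

// Both approaches should agree on duplicate-free integer input.
var_dump($byValue === $byKey);
```

The sketch only demonstrates that the two approaches return the same result on this kind of input; the timings quoted above come from the profiling runs, not from this snippet. If the input can contain duplicates or non-scalar values, the key-based variant is not a drop-in replacement.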