Remove array indexing by entityId #11086
Merged
Description
Remove the indexing of the `$values` array. It is unnecessary and introduces an assumption that the `entity_id` and `row_id` are the same in the `catalog_category_entity` table. When the `entity_id` and `row_id` differ, the `$values` array can contain indexes for an `entity_id` that will be inserted into the indexer tmp table. When a row whose `row_id` equals that `entity_id` is later inserted, it causes a SQL constraint violation on the tmp table.

On line 384 the index of the `$values` array is overwritten by the `row_id` of the row. If the `entity_id` and `row_id` are not the same, some array indexes will not be "filled" by a row. These unfilled indexes create rows in the tmp table at insert time, which actual `row_id`s from real rows will later try to fill.

e.g. Category Flat Data indexer process unknown error:
SQLSTATE[23000]: Integrity constraint violation: 1062 Duplicate entry '501' for key 'PRIMARY', query was: INSERT INTO `catalog_category_flat_store_2_tmp` (`row_id`, `entity_id`, `created_in`, `updated_in`, `attribute_set_id`, `parent_id`, `created_at`, `updated_at`, `path`, `position`, `level`, `children_count`, `store_id`, `all_children`, `automatic_sorting`, `available_sort_by`, `children`, `custom_apply_to_products`, `custom_design`, `custom_design_from`, `custom_design_to`, `custom_layout_update`, `custom_use_parent_settings`, `default_sort_by`, `description`, `display_mode`, `filter_price_range`, `fredhopper_category_id`, `image`, `include_in_menu`, `include_vat`, `is_active`, `is_anchor`, `landing_page`, `meta_description`, `meta_keywords`, `meta_title`, `name`, `page_layout`, `path_in_store`, `url_key`, `url_path`) VALUES ...

This seems to be possible when the chunk size is less than the total categories in the tree for a store (we have ~1300) because the same `row_id` gets used twice in 2 different inserts.

Fixed Issues (if relevant)
Could not find any open issues and bug exists in
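
To make the failure mode from the Description concrete, here is a minimal Python model of it. This is a sketch, not the Magento PHP code: `collect_values`, `find_duplicate_keys`, and `row_id_for` are hypothetical names, and the model assumes that every key of the per-chunk array becomes a primary key in the tmp table, that entity ids run from 1 to 1300, and that each `row_id` is its `entity_id` plus 1.

```python
from itertools import islice

CHUNK_SIZE = 500  # chunk size the description cites for $entityIds


def chunks(ids, size):
    """Yield consecutive slices of `ids`, each at most `size` long."""
    it = iter(ids)
    while chunk := list(islice(it, size)):
        yield chunk


def collect_values(entity_chunk, row_id_for, pre_index):
    """Model one chunk: return {key: data} destined for the tmp table."""
    values = {}
    if pre_index:
        # The step this PR removes: reserve one slot per entity_id.
        for entity_id in entity_chunk:
            values[entity_id] = {}
    for entity_id in entity_chunk:
        # Real attribute rows are keyed by row_id, not entity_id.
        values[row_id_for(entity_id)] = {"name": f"category-{entity_id}"}
    return values


def find_duplicate_keys(entity_ids, row_id_for, pre_index):
    """Insert chunk by chunk; return keys that would hit the primary key twice."""
    seen, duplicates = set(), []
    for chunk in chunks(entity_ids, CHUNK_SIZE):
        for key in collect_values(chunk, row_id_for, pre_index):
            if key in seen:
                duplicates.append(key)
            seen.add(key)
    return duplicates


def row_id_for(entity_id):
    """Assumed mapping: row_id differs from entity_id by 1."""
    return entity_id + 1


entity_ids = list(range(1, 1301))  # roughly 1300 categories, as in the report

print(find_duplicate_keys(entity_ids, row_id_for, pre_index=True))   # [501, 1001]
print(find_duplicate_keys(entity_ids, row_id_for, pre_index=False))  # []
```

Under these assumptions the first colliding key is 501: the first chunk inserts it as a real `row_id`, and the second chunk inserts it again as an unfilled `entity_id` slot. A real database would abort at that first collision rather than collect them all. Without the pre-indexing step, each key comes only from a real row and no collisions occur.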
Manual testing scenarios
1. Use a database with differing `row_id` and `entity_id` values (recreated where these values differed by 1) and with more than 500 categories in the tree for any 1 store. (500 is the chunk size for the `$entityIds` array.)
2. Run the `catalog_category_flat` indexer and get a SQL constraint error (see above).
3. Remove the indexing of the `$values` array and re-run the indexer.

Contribution checklist