-
Notifications
You must be signed in to change notification settings - Fork 21
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix incorrect product statistics count #1728
Fix incorrect product statistics count #1728
Conversation
product statistics
So it'd be easier to write unit tests.
…g-product-statistics
There are two problems: 1. `$product_id_map` is an array map with the format like `[ 'online:en:TW:gla_5981' => 5981 ]` while the method `find_delete_product_ids` accepts an array of product IDs. 2. Even though we fix the above issue by using `array_values( $product_id_map)`, WCProductQuery couldn't get the product as well because the product is already trashed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left some comments as I went through the code. When testing I was unable to get the two amounts to match up (I've got quite the mix of products).
For the syncable product count I get back:
{
"count": 301
}
And for the product statistics I get:
{
"timestamp": 1668615754,
"statistics": {
"active": 0,
"expiring": 0,
"pending": 0,
"disapproved": 205,
"not_synced": 111
},
"scheduled_sync": 0
}
I'll take another look tomorrow to see if I can pull up some exact product ID's and see why they aren't matching up.
src/Product/ProductHelper.php
Outdated
* | ||
* @return WP_Post | ||
*/ | ||
public function get_wc_product_by_wp_post( int $product_id ): WP_Post { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After we call this function we double check to make sure the product has a value (is not null), but the return type of this function does not allow null. Do we need a nullable return type?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated in 6c84398.
src/Jobs/DeleteProducts.php
Outdated
|
||
// Exclude any ID's which are not ready to delete. | ||
$product_id_map = array_intersect( $product_id_map, $ready_ids ); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By removing these lines we are reverting the PR #1238
I can see the flaws in the filter, but I don't think we should revert it completely otherwise we'll reintroduce the endless retry upon failure. I can see the logic of adding the status equals trash, but if we look at the sync hooks. It's hooked into both when a product is trashed and when it's deleted completely. In the latter case it won't be able to query the product data at all. Which means:
- we should still try the initial
delete_by_batch_request
- we should not retry a product that is not available in the database
In that case maybe we should move the filtering from process_items to the code block where the retries are scheduled. If we filter the ID map at that point then we can remove any no longer existing products and ones that retried too many times.
Although I would move this to a separate issue/PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, my brain wasn't working properly yesterday, I actually forgot to add back the deleted code so even though I added "status equals trash" logic in find_delete_product_ids
method from ProductRepository.php
, there is no one calling this method. So if I added it back like below It would try to find the product with status trash
then exclude the ones which are not ready to delete. Note that array_intersect( $product_id_map, $ready_ids );
can work as array_intersect
is working through the array values.
--- a/src/Jobs/DeleteProducts.php
+++ b/src/Jobs/DeleteProducts.php
@@ -37,6 +37,12 @@ class DeleteProducts extends AbstractProductSyncerJob implements StartOnHookInte
* @throws ProductSyncerException If an error occurs. The exception will be logged by ActionScheduler.
*/
public function process_items( array $product_id_map ) {
+ $product_ids = array_values( $product_id_map );
+ $ready_ids = $this->product_repository->find_delete_product_ids( $product_ids );
+
+ // Exclude any ID's which are not ready to delete.
+ $product_id_map = array_intersect( $product_id_map, $ready_ids );
+
$product_entries = BatchProductIDRequestEntry::create_from_id_map( new ProductIDMap( $product_id_map ) );
$this->product_syncer->delete_by_batch_requests( $product_entries );
}
but if we look at the sync hooks. It's hooked into both when a product is trashed and when it's deleted completely. In the latter case it won't be able to query the product data at all.
Correct me if I'm wrong, from the WooCommerce admin products page I can only delete a trash
product completely. Which means based on the above code, we would call delete_by_batch_request
when a product status is becoming trash
, and we would not call delete_by_batch_request
again when the product is deleted completely. Is it correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A product can still be force deleted (without going to trash) either through WP-CLI, API or making use of a plugin.
In that case we have the ID mapping scheduled for removal, but we can't load any product details when the scheduled task actually starts running. So I imagine in that case the filter wouldn't work and the product would get excluded from the batch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, I think in this case it would be better to move this into different issue/PR so we don't block this PR. Do you think I should revert the commits 3f7a2cc and 7dfd462 then create an issue, or adding a new commit including the change in my previous comment and create an issue?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think fixing it temporarily by changing the status to trash (as long as we aren't reverting previous behaviour) is a good option. And then create a separate followup issue for filtering later, since that's covering behaviour that was already broken.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The follow up issue is created in #1785.
So I collected the arrays of product ID's and did some comparisons. For counting syncable products I should have had 300 instead of 301. The extra one was because I have a variable subscription product, but the Subscription plugin was deactivated, in which case it detects the variations as a regular variable product. (This might be acceptable to leave as an edge case) For the counts I still had 16 products too many in the unsynced count, it seems this was from products that were synced once upon a time but the channel visibility was later toggled to "do not sync and show". Modifying the meta query for unsynced products seems to resolve that: 'meta_query' => [
[
'key' => ProductMetaHandler::KEY_SYNC_STATUS,
'compare' => '!=',
'value' => SyncStatus::SYNCED,
],
[
'key' => ProductMetaHandler::KEY_VISIBILITY,
'compare' => '=',
'value' => ChannelVisibility::SYNC_AND_SHOW,
],
], That gets me closer but I'm still off by one product. Which is a product that previously synced but I set the catalog visibility to hidden, which means it becomes unsyncable. Not sure what the best way to resolve that is. Should either the sync status or the channel visibility be reset in such a case? |
Thanks @mikkamp, those are very valuable results. I will update the meta query for unsynced products.
When setting a product's |
If I recall correctly there was a reason why we didn't update the visibility. In PR #880 is where it was implemented not to sync products hidden from the catalog. This PR also mentions products out of stock, I don't think I had any test products for this, but the same thing would probably apply. So we might want to test how that scenario works, what we don't want is the following scenario:
Step number 3 is not being done right now, so we'll have to be careful we don't introduce unwanted side effects. Maybe it's those cases that we want to have an extra sync status available like TEMPORARILY_UNSYNCED, or just delete the sync status (which throws it in the group of unsyncable products). |
Thanks for the detailed explanation on the history of why we made it. I think removing the sync status for such case (catalog visibility becomes hidden) so it falls into the group of unsyncable products is a good idea, as the product becomes unsyncable for sure. I'm looking for where is the best place to remove the sync status, at first I found google-listings-and-ads/src/Product/SyncerHooks.php Lines 191 to 193 in a9825c1
Setting catalog visibility to hidden the synced product becomes unsyncable (not sync ready) so the code above will trigger delete products job to delete a product from GMC. However adding the code to remove sync status here would be overridden by the code in the
There are some options we could do:
|
I don't think just checking
if ( $remove_sync_status && $this->is_sync_ready( $product ) {
$this->meta_handler->delete_sync_status( $product );
} else {
$this->meta_handler->update_sync_status( $product, SyncStatus::NOT_SYNCED );
} We might not even need the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the additional changes. I rechecked the calculations and I'm now getting matching numbers.
I just found one additional problem with it querying non-trash products. I added a suggestion to fix that and we can continue adding some tests for it in the followup PR.
This PR looks good to go then, so I'll approve it.
@@ -163,7 +164,8 @@ public function find_sync_ready_products( array $args = [], int $limit = - 1, in | |||
* @return array | |||
*/ | |||
public function find_delete_product_ids( array $ids, int $limit = - 1, int $offset = 0 ): array { | |||
$results = $this->find_by_ids( $ids, $limit, $offset ); | |||
$args = [ 'status' => 'trash' ]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is now working for items that need to be deleted because they are in the trash. But for items that get deleted from Merchant Center for other reasons are not found/updated. For example I set a product to catalog visibility = hidden, which will trigger it to be deleted from the Merchant Center, but it's status remains "published".
Without any status it was previously querying the default WP statuses so it seems we should include trash
with the default array.
$args = [ 'status' => 'trash' ]; | |
$args = [ 'status' => [ 'draft', 'pending', 'private', 'publish', 'trash' ] ]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for catching this, I've updated the issue #1785 so it will be handled in the next PR. Merging this PR first.
Changes proposed in this Pull Request:
Project post: pcTzPl-1a9-p2
This is a follow up for #1713, which is part of the project #1610.
The number returned by product statistics API are not aligned with the syncable products count because it includes the variation products. There are two parts when getting product statistics: counting
synced
products andnot_synced
products, we need to group the variations as one for the same parent for both parts.1. Group variation products when counting
synced
products.The solution is inspired by @mikkamp's comments in the PR 1713 by using two existing data
$this->product_data_lookup
and$this->product_statuses
fromMerchantStatuses.php
, the format of both data are:product_statuses
product_data_lookup
The modified code loop through
product_statuses
and checkproduct_data_lookup
to see if a product has a parent. From the above data we can get twosynced
products:589
and604
, where589
is a variable product that has two variations602
and603
.@mikkamp also suggested in pcTzPl-I1-p2#comment-1839 that we should consider a case where one variation is
approved
and the other isdisapproved
. So we add a new dataproduct_statistics_priority
to assign each status a priority (in this casedisapproved
has higher priority) then we can get the correct status for each product. The new data's format looks like this:2. Get syncable but not synced products
wp_postmeta
with meta key_wc_gla_mc_status
and has statues:wp_postmeta
with meta key_wc_gla_sync_status
and has statuses:The original implementation to get the
not_synced
products is: finding all thesimple
andvariable
products where their GMC status isnot_synced
. While it already excludes thevariation
products but the result would contains unsyncable products. Our goal is to get the syncable but not synced products, to achieve this we could simply get all thesimple
andvariable
products where theirSyncStatus
is notsynced
. The reason why we can useSyncStatus
is, only the syncable products would have a status recorded inSyncStatus
.3. Fix a problem that "trashing" a product would never really trigger delete batch requests
In DeleteProducts.php, the format of
$product_id_map
is:However in
ProductRepository.php
the methodfind_delete_product_ids()
accepts an array of WC product IDs. But even though we usearray_values
on$product_id_map
to pass the array of IDs, the product is already intrash
status so WC_Product_Query still couldn't get the product unless we set the args'status' => 'trash'
in the query.Detailed test instructions:
Use an existing Merchant Centre account
SyncStatus
andMCStatus
from the DB.Set up your accounts
, choose an existing Merchant Centre account.Complete your campaign
you will seeYour product listings are ready to be uploaded N products
.Skip this step for now
, as we don't need the Ads account for the test./wp-admin/admin.php?page=wc-status&tab=action-scheduler&s=update_all_products&action=-1&action2=-1&orderby=schedule&order=desc&paged=1
GET /mc/product-statistics/refresh
or go to GLA product feed page, the total number should be identical to theN
products in step 4.Create a new Merchant Centre account
Follow the above steps but in step 3 create a new Merchant Centre account.
Add, update, or delete a product
GET /mc/product-statistics/refresh
should be changed accordingly.Changelog entry