Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce ILO_URL_Metrics_Group and ILO_URL_Metrics_Group_Collection classes #1010

Merged

Conversation

westonruter
Copy link
Member

@westonruter westonruter commented Feb 23, 2024

Summary

This was a sub-PR of #988. It has now been merged.

This implements these aspects of #933:

  • Introduce class to represent a grouped URL metrics collection by breakpoint (e.g. ILO_Grouped_URL_Metric_Collection).
    • The class could receive a list (indexed array) of ILO_URL_Metric instances and then take care of grouping them accordingly. The grouping internally could just be a map of breakpoint => ILO_URL_Metric[].
    • The class should probably implement Traversable, so that it can be iterated just like an array (over the breakpoints).
    • The class could have a method unshift( ILO_URL_Metric $metric ) that replaces the ilo_unshift_url_metrics() function and integrates the given metric properly, while discarding excess metrics beyond the configured sample size at the same time.
    • Some of the grouping "configuration" (e.g. breakpoints, sample size) may also be useful to provide via constructor.
  • The ilo_group_url_metrics_by_breakpoint() function could then use these two classes.
    • It seems the module currently always uses the breakpoints from ilo_get_breakpoint_max_widths() and the sample size from ilo_get_url_metrics_breakpoint_sample_size(). If so, we could probably avoid those function calls elsewhere and only call them in ilo_group_url_metrics_by_breakpoint(). In other words, only the group class constructor would take them as parameters (for flexibility and testing), but the wrapper function to instantiate the group class wouldn't, it would just use the default ones, to avoid having to pass those parameters around in several places.
  • Any functions that currently expect a $grouped_url_metrics array should be updated to expect an instance of the new group class. If there are particular usages where another method on that class would be helpful, it could be added based on need.
  • Functions that currently expect a (non-grouped) $url_metrics array but then immediately group them without doing anything else to the original array should be modified to expect an instance of the new group class as well, avoiding another call to the grouping function. This also helps separating responsibilities.

Relevant technical choices

  • A new ILO_Grouped_URL_Metrics class is introduced which now has methods which were previously functions. URL metrics which are added to instances of this class are initially-grouped and kept grouped as new URL metrics are added. The method names are improved, for example ilo_unshift_url_metrics is now just add. The constructor takes a list of the URL metrics, the breakpoints, the sample size, and the freshness TTL.
  • A new method are_all_groups_populated is added which internalizes the logic which was previously in ilo_optimize_template_output_buffer(). This eliminates the need to understand the internal data structure.
  • Another new method flatten is introduced which returns a list of ILO_URL_Metric instances. This is used when obtaining the URL metrics to serialize into the post type, as well as internally when calling the add method.

Checklist

  • PR has either [Focus] or Infrastructure label.
  • PR has a [Type] label.
  • PR has a milestone or the no milestone label.

@westonruter westonruter added [Type] Enhancement A suggestion for improvement of an existing feature [Focus] Images no milestone PRs that do not have a defined milestone for release [Plugin] Optimization Detective Issues for the Optimization Detective plugin labels Feb 23, 2024
@westonruter westonruter changed the title Create initial draft of ILO_Grouped_URL_Metrics class Introduce ILO_Grouped_URL_Metrics class Feb 23, 2024
@westonruter westonruter marked this pull request as ready for review February 28, 2024 01:19
Copy link

github-actions bot commented Feb 28, 2024

The following accounts have interacted with this PR and/or linked issues. I will continue to update these lists as activity occurs. You can also manually ask me to refresh this list by adding the props-bot label.

If you're merging code through a pull request on GitHub, copy and paste the following into the bottom of the merge commit message.

Co-authored-by: westonruter <[email protected]>
Co-authored-by: felixarntz <[email protected]>

To understand the WordPress project's expectations around crediting contributors, please review the Contributor Attribution page in the Core Handbook.

Base automatically changed from add/url-metric-class to feature/image-loading-optimization February 29, 2024 01:05
Copy link
Member

@felixarntz felixarntz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@westonruter Overall this looks good. I think there are a few things we can tidy up and simplify in terms of the methods the class exposes.

* @param float $current_time Current time, defaults to `microtime(true)`.
* @return array<int, array{int, bool}> Array of tuples mapping minimum viewport width to whether URL metric(s) are needed.
*/
public function get_needed_minimum_viewport_widths( float $current_time = null ): array {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be great to find a simpler name for this.

Some ideas:

  • get_incomplete_minimum_viewport_widths()
  • get_incomplete_viewport_widths()
  • get_incomplete_group_identifiers() (which should be clear enough given that the groups are always by minimum viewport width)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think needed is fine because it's relevant not only if the group doesn't have the sample size but also if any of the URL metrics in the group are stale.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I spent some time thinking about this some more. I just pushed up some more changes. Now instead of get_needed_minimum_viewport_widths() the method is called get_viewport_group_lacking_statuses(). Not any less verbose. But maybe more clear?

* @access private
*
* @param float $current_time Current time, defaults to `microtime(true)`.
* @return array<int, array{int, bool}> Array of tuples mapping minimum viewport width to whether URL metric(s) are needed.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think a simpler API, which would also be better in line with the method name, would be to return a simple indexed array of the minimum viewport widths.

I'm not sure there is a need for this to return all minimum viewport widths with a boolean indicator.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It took me longer than I wished to prove this out but I believe I've finally done so. In db8ae4a I added a test which:

  1. Establishes a single breakpoint for which there are two viewport groups.
  2. Fully populates the group for the wider viewports and leaves the group empty for narrower viewports.
  3. Attempts to over-populate the wider viewport group with yet another URL metric.

When switching to only return the simple indexed array of the minimum viewport widths needing URL metrics (as I did in 794061d), the REST API endpoint fails to properly block attempts to overpopulate the wider viewport group. The reason is that the narrower viewport group is encountered first when iterating which then sets $last_was_needed to true:

$last_was_needed = false;
foreach ( $needed_minimum_viewport_widths as $minimum_viewport_width ) {
if ( $viewport_width >= $minimum_viewport_width ) {
$last_was_needed = true;
} else {
break;
}
}
if ( ! $last_was_needed ) {
return new WP_Error(

It never even considers the wider viewport group since it isn't returned by get_needed_minimum_viewport_widths(). Contrast this with the previous working code:

$last_was_needed = false;
foreach ( $needed_minimum_viewport_widths as list( $minimum_viewport_width, $is_needed ) ) {
if ( $viewport_width >= $minimum_viewport_width ) {
$last_was_needed = $is_needed;
} else {
break;
}
}
if ( ! $last_was_needed ) {
return new WP_Error(

Note that in this case it iterates over all viewport widths ensuring that the $last_was_needed can be switched back to false to block a submission.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you see my comment #1010 (comment)? Wouldn't something like that address the problem in a more elegant way? We don't need to get a list that includes all viewports to find out whether the viewport we already know (as specified in the request) is full.

Looking at my previous suggestion again, I think I missed one thing though, the minimum viewport width is of course not necessarily the exact viewport width provided. So we need a method that, for a given viewport width, returns the appropriate breakpoint / minimum viewport width that identifies the relevant group. Still, I think we could simplify this via a method in the class that checks for a given viewport width whether its information is needed.

For even more separation of concerns, this could work as follows:

/*
 * Get the correct breakpoint / minimum viewport width that the given viewport width belongs to.
 * For example, for breakpoints 0, 600, and 1200:
 * - 599 would return 0
 * - 1199 would return 600
 * - 1200 would return 1200
 * - 8000 would return 1200
 */
$minimum_viewport_width = $grouped_url_metrics->get_minimum_viewport_width( $viewport_width );

/*
 * If the group for that breakpoint / minimum viewport width is already complete/full
 * (i.e. as large as the sample size), fail with an error.
 */
if ( $grouped_url_metrics->is_group_complete( $minimum_viewport_width ) ) {
	return new WP_Error( '...' );
}

This would abstract away the logic so we wouldn't need to have a complicated foreach in here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By the way, I wonder whether in terms of naming we should use "breakpoint" instead of "minimum viewport width". They are not exactly the same technically, but "breakpoint" is just a lot more concise for all those method and variable names. Plus no confusion with a viewport (i.e. a concrete viewport that isn't just the "minimum viewport" for a group).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the logic can be encapsulated into a method to avoid having to iterate over the values in the REST API handler. I wanted to first sanity check why get_needed_minimum_viewport_widths() can't just return an array of integers for the viewport widths that actually need URL metrics.

Regarding the naming to use breakpoint, it can get a bit tricky because the breakpoints are the division between the groups, whereas the minimum viewport widths correspond to the number of groups themselves. The breakpoint count is always one less than the number of minimum viewport widths.

Let me think about how this method might be named a bit more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Breakpoints are the maximum widths before entering the next group. Given 3 breakpoints 480, 600, and 782 then there would be 4 groups for the following viewport ranges:

  • 0-480
  • 481-600
  • 601-782
  • 782-∞

So there is always one more group than there are breakpoints. I organized around "minimum viewport widths" since it makes sense to anchor off of the zero value, rather than to work from the other end at infinity (or PHP_INT_MAX). I don't have a problem with "minimum viewport widths" as a name, as it is much more explicit at what it is talking about. Another name like "group ID" or "breakpoint group" is more general and you'd have to remind yourself what it is referring to.

// Convert breakpoint max widths into viewport minimum widths.
$minimum_viewport_widths = array_map(
static function ( $breakpoint ) {
return $breakpoint + 1;
},
$this->breakpoints
);
$grouped = array_fill_keys( array_merge( array( 0 ), $minimum_viewport_widths ), array() );

Separately, WDYT about #1010 (comment)? Looking at b7d0819#diff-4369a8ab8272ee7760fdc9ce4442c633c520114a2d593c859b7807e7ddb806dcR163, I still don't understand why we have to call that method here. IMO it would be more intuitive to have one method to get the correct "breakpoint group ID" or "minimum viewport width" for the group that the given viewport width falls into, and another method to check whether that group is filled.

It could be refactored to access the data in a different way specifically for the purposes of how the data is needed by the REST API here, to specifically look up one single viewport width to see if its group is filled or not. However, the data is needed as well in ilo_optimize_template_output_buffer() in two places:

$needed_minimum_viewport_widths = $grouped_url_metrics->get_needed_minimum_viewport_widths();
// Whether we need to add the data-ilo-xpath attribute to elements and whether the detection script should be injected.
$needs_detection = in_array(
true,
// Each array item is array{int, bool}, with the second item being whether the viewport width is needed.
array_column( $needed_minimum_viewport_widths, 1 ),
true
);

if ( $needs_detection ) {
$head_injection .= ilo_get_detection_script( $slug, $needed_minimum_viewport_widths );
}

In the first, it needs to look at all the groups together to see if any of them need URL metrics. This is because at this point we don't know the user's viewport width. So it doesn't make sense here to look up a specific group since there is no specific viewport yet known.

In the second instance, we export the data to detect.js so that the client logic (which does know the viewport width) can decide whether or not to send the URL metrics to the server.

So in both of these cases, we need the full data returned by get_needed_minimum_viewport_widths and this is re-used in the REST API endpoint via this new is_group_filled method.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So in both of these cases, we need the full data returned by get_needed_minimum_viewport_widths and this is re-used in the REST API endpoint via this new is_group_filled method.

With the latest changes, these methods are now called get_viewport_group_lacking_statuses and is_viewport_group_lacking, respectively.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I've been commenting above, I've been working on refactoring this further. I've introduced a ILO_URL_Metrics_Group_Status which contains both the minimum_viewport_width and the is_lacking data. This object is then passed around which I think makes it more clear and readable.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@westonruter

So in both of these cases, we need the full data returned by get_needed_minimum_viewport_widths and this is re-used in the REST API endpoint via this new is_group_filled method.

The full list is really only needed in the ilo_get_detection_script() call. For the $needs_detection check a more intuitive solution would be to just call a method on the class, see #1010 (comment).

With the latest changes, these methods are now called get_viewport_group_lacking_statuses and is_viewport_group_lacking, respectively.

Why? See #1010 (comment), a positive term would be more intuitive. Whether it's "full" or "filled" doesn't really matter from my perspective, but either is IMO easier to understand and shorter. Negative method names are more complicated, particularly as they may be used with negation.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I've been commenting above, I've been working on refactoring this further. I've introduced a ILO_URL_Metrics_Group_Status which contains both the minimum_viewport_width and the is_lacking data. This object is then passed around which I think makes it more clear and readable.

See #1010 (comment) for basically a response to this.

modules/images/image-loading-optimization/optimization.php Outdated Show resolved Hide resolved
@westonruter westonruter requested a review from felixarntz March 6, 2024 15:46
Copy link
Member

@felixarntz felixarntz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@westonruter Thanks for the iterations, this looks great!

A few minor comments for consideration, please take a look, mostly nit-picks or suggestions.

Comment on lines 31 to 37
/**
* Breakpoints in max widths.
*
* @var int[]
* @phpstan-var int<1, max>[]
*/
private $breakpoints;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this is for instance 480, 800, and ... (what's used for the maximum?), which would mean the groups are:

  • 0-480
  • 481-800
  • 800-infinite

Do I understand that correctly?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If two breakpoints were provided: 480 and 800, then there would be 3 groups with viewport ranges:

  1. 0-600
  2. 601-800
  3. 801-∞

A breakpoint must be greater than zero, because it is not possible to have a viewport that is zero pixels wide. A breakpoint wouldn't in practice be the maximum because that would mean there could be a viewport at maximum + 1. In this way, the inclusion of max in the typing is not accurate. It should rather be int<1, max-1>[] but I don't think that's supported.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Further improved typing in 34e3d06 to clarify the range of values allowed for breakpoints.

Comment on lines 68 to 70
$breakpoints = array_values( array_unique( $breakpoints, SORT_NUMERIC ) );
foreach ( $breakpoints as $breakpoint ) {
if ( $breakpoint <= 0 ) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See above, how is the maximum / infinite value handled? The last breakpoint group would not really have a maximum. Does that value need to be provided or is it handled automatically? I just wonder whether additional validation is needed.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The last group has a maximum width value set PHP_INT_MAX. As such, an actual breakpoint wouldn't be able to have that value, since then it would try to create an even bigger breakpoint group which has a minimum viewport width of PHP_INT_MAX + 1 but still a maximum viewport width of PHP_INT_MAX which would be smaller than the minimum viewport width.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess I was thinking for the last group you wouldn't add the +1. That would mean every breakpoint represents a group, while right now there's always one less breakpoint than there are groups.

I guess both makes sense from different perspectives, so the current approach WFM.

Comment on lines 111 to 113
if ( ! $this->is_viewport_width_in_range( $url_metric->get_viewport()['width'] ) ) {
return false;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than checking this here and returning a boolean, a more explicit API would be to check whether is_viewport_width_in_range in the URL_Metrics_Group_Collection class before calling add_url_metric, with the method here throwing an exception if the given metric doesn't fit in the group.

This prevents mistakes in a way the developer will for sure find out about it. Returning a success/failure value from a method that does something other than check for a specific thing isn't very intuitive and often may go ignored. Think about all the register_post_type() calls in plugins and how uncommon it is to check for the return value - even though there is one.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I've implemented this in c362a6f. Nevertheless, a URL metric should always fit into a group contained in a collection because when the collection creates the groups it will ensure the groups span all non-negative integers.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right. This is just to make a clean API, should someone try to do something custom with it. Plus for any potential future iterations within the plugin code itself.

@westonruter westonruter merged commit 23ba5bf into feature/image-loading-optimization Mar 7, 2024
28 checks passed
@westonruter westonruter deleted the add/grouped-url-metrics-class branch March 7, 2024 09:35
@westonruter westonruter changed the title Introduce ILO_Grouped_URL_Metrics class Introduce ILO_URL_Metrics_Group and ILO_URL_Metrics_Group_Collection classes Mar 7, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
no milestone PRs that do not have a defined milestone for release [Plugin] Optimization Detective Issues for the Optimization Detective plugin [Type] Enhancement A suggestion for improvement of an existing feature
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants