-
Notifications
You must be signed in to change notification settings - Fork 108
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add storage for URL metrics to optimize image loading #878
Add storage for URL metrics to optimize image loading #878
Conversation
array( | ||
'methods' => 'POST', | ||
'callback' => 'image_loading_optimization_handle_rest_request', | ||
'permission_callback' => '__return_true', // Needs to be available to unauthenticated visitors. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it's going to be public, this probably needs some other kind of protection to prevent being flooded with entries.
Years ago I built something similar where multiple requests were blocked by storing the IP address in a transient. Maybe it's useful here:
https://github.com/wearerequired/rest-likes/blob/8c522be82ca468622d287c7a8539ea1dc72e45e9/classes/Controller.php#L251-L253
https://github.com/wearerequired/rest-likes/blob/8c522be82ca468622d287c7a8539ea1dc72e45e9/classes/Controller.php#L282-L353
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you! Yes, I was also intending to do something similar following the same pattern as WP comments.
'methods' => 'POST', | ||
'callback' => 'image_loading_optimization_handle_rest_request', | ||
'permission_callback' => '__return_true', // Needs to be available to unauthenticated visitors. | ||
'args' => array( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the URL also passed as an argument?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point! 😊 However, this could also obtained via the Referer
HTTP request header. But probably better to be explicit, even though perhaps Referer
would be a bit safer to rely on. If nothing else, a custom validation callback could be used for the url
arg to make sure it matches the Referer
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added url
in b4e29d6
…on' into add/image-loading-optimization-storage * origin/feature/image-loading-optimization: Ignore webp-uploads/fallback.js from linting for now Undo turning off no-console eslint rule Add IMAGE_LOADING_OPTIMIZATION_VERSION constant Remove accounting for /wp-* symlinks in local dev
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking good!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@westonruter This looks great. A few additional thoughts, but I think this is almost ready for merge.
* @return int TTL in seconds. | ||
*/ | ||
function ilo_get_page_metric_storage_lock_ttl(): int { | ||
|
||
/** | ||
* Filters how long a given IP is locked from submitting another metric-storage REST API request. | ||
* | ||
* Filtering the TTL to zero will disable any metric storage locking. This is useful during development. | ||
* | ||
* @param int $ttl TTL. | ||
*/ | ||
return (int) apply_filters( 'ilo_metrics_storage_lock_ttl', MINUTE_IN_SECONDS ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we should document that this can also be 0
and that, in that case, transients shouldn't be used at all?
Two related thoughts:
- What is the rationale for allowing
0
? - If we keep it, we need to triple-check that this value is never simply passed through to
set_transient()
. As far as I can tell, that's currently (rightfully) not the case, but it's a bit error-prone, and in case of an error it would lead to transients without expiration, which ties into my other scalability concern with lots of transients. With expiring transients, a large number can be fine, but not if they remain forever.
I think to harden against the potential problems from my second point, it would be good to make this a private function, so that nobody can call it externally (potentially without knowing these implications).
I think the 4 functions in this file could be easily refactored into e.g. a ILO_Page_Metric_Storage_Lock
class, where the methods to get the TTL and transient key could be private. The public interface could then simply be the lock()
and is_locked()
methods.
Alternatively, we disallow returning 0
via the filter, avoiding the risk in a simple way. We could also reconsider later, combining with potential refactoring to keep use of this function "contained".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it is useful to allow zero, beyond just for development purposes. For example, consider this example:
add_filter( 'ilo_metrics_storage_lock_ttl', static function ( $ttl ) {
return is_user_logged_in() ? 0 : $ttl;
} );
This allows authenticated users in WP to store page metrics without any locking, which could even make sense to do by default. A user who is already authenticated in WP is not going to be the same level of risk to necessitate locking. I've added this as an example in the filter doc in da0b420.
Also, I'm now ensuring that the TTL is never less than zero.
As for when the TTL is zero exactly, code is in place to make sure that no transient is set:
function ilo_set_page_metric_storage_lock() /*: void (in PHP 7.1) */ { | |
$ttl = ilo_get_page_metric_storage_lock_ttl(); | |
$key = ilo_get_page_metric_storage_lock_transient_key(); | |
if ( 0 === $ttl ) { | |
delete_transient( $key ); | |
} else { | |
set_transient( $key, microtime( true ), $ttl ); | |
} | |
} |
So there is no possible way for a transient to be set with a zero expiration. Not really a need to make it private, though I intend to mark all of the functions as private in any case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, making the function private ensures nobody uses it incorrectly. Obviously we can't truly make it private as long as it's a function, but if someone still uses it, we have no obligation to keep it working for them.
The rationale for this specific function to be private is that it's error-prone to have to remember that you have to always add an extra condition for whether TTL is 0
since otherwise you'd create transients without expiration. That's why I think this function specifically really should be private, or extremely well documented on what should happen in the case it returns 0
. See #878 (comment).
Co-authored-by: Felix Arntz <[email protected]>
Co-authored-by: Felix Arntz <[email protected]>
const ILO_PAGE_METRICS_POST_TYPE = 'ilo_page_metrics'; | ||
|
||
/** | ||
* Registers post type for page metrics storage. | ||
* | ||
* This the configuration for this post type is similar to the oembed_cache in core. | ||
* | ||
* @since n.e.x.t | ||
* @access private | ||
*/ | ||
function ilo_register_page_metrics_post_type() /*: void (in PHP 7.1) */ { | ||
register_post_type( | ||
ILO_PAGE_METRICS_POST_TYPE, | ||
array( | ||
'labels' => array( | ||
'name' => __( 'Page Metrics', 'performance-lab' ), | ||
'singular_name' => __( 'Page Metrics', 'performance-lab' ), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this still needs to be updated per #878 (comment)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done in 9876aa4
|
||
const ILO_REST_API_NAMESPACE = 'image-loading-optimization/v1'; | ||
|
||
const ILO_PAGE_METRICS_ROUTE = '/page-metrics'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See above, if we changed the post type name, this name should be changed accordingly.
Additionally though, following up on #878 (comment), I don't think just this name is semantically correct. POST /page-metrics
would be intended to create a new resource of this type, which isn't really the case as this endpoint doesn't create a new post of the metrics post type, but modifies an existing one with specific data for a request. That is effectively a custom method, so it shouldn't use just the resource based identifier.
Core's endpoints indeed do not include :
characters, though it's probably not an issue introducing one. But if we want to tie more nicely into core's naming, we should simply use another forward slash to separate the custom method from the resource, i.e. /page-metrics/store
(or /url-metrics/store
). There are already a few endpoints in core that use a custom method in this way.
Maybe the store
is also not super clear, how about store-request
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
POST /page-metrics
would be intended to create a new resource of this type, which isn't really the case as this endpoint doesn't create a new post of the metrics post type, but modifies an existing one with specific data for a request.
Actually, it can create a new resource of that type. If there were no page metrics gathered for the URL, then a new ilo_page_metrics
post would be created. However, if the post does already exist, then essentially it creates a new sub-resource (by adding the page metric to the collection of page metrics for a given URL).
I don't feel strongly about the endpoint name, however. If you want to add a :
then I'm fine with it. But IMO the POST
method's broad semantics don't require it. In particular, the HTTP spec includes this use case for POST
:
Appending data to a resource's existing representation(s).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, it can create a new resource, but it doesn't always, and that's not the purpose of the endpoint.
Regarding "Appending data to a resource's existing representation(s).", that applies to a specific resource. In other words, POST /metrics/<id>
can append data to that resource. However POST /metrics
does not touch any existing resource, it's typically intended to create a new resource. This is applied in most core endpoints in that way and also the best practice per https://google.aip.dev/133.
I don't feel strongly about the :
, however I do think a custom method is the appropriate solution for what this endpoint does. It's not a distinct CRUD operation for a resource (as it can either create or update), and the parameters aren't fields on the resource.
If we consider the requests within a post of the post type a sub resource, we could go with that approach, but that would require knowing the post's identifier, which is probably too much of a hassle. So with the way things are now I think POST /{page|url}-metrics{:|/}store-request
works best.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've implemented this in 698d89f. I think store
is better than store-request
, as it is immediately following url-metrics
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking store-request
because the data that is processed by that endpoint is from a single specific request. And it's storing that within a URL metrics resource, rather than the URL metrics resource itself.
With only store
, it sounds like you're storing a resource of the URL metrics post type, which if that was the case should not be a custom method. So I think store-request
makes that more clear.
But not a blocker if you feel strongly about it.
* @param string $slug Page metrics slug. | ||
* @return WP_Post|null Post object if exists. | ||
*/ | ||
function ilo_get_page_metrics_post( string $slug ) /*: ?WP_Post (in PHP 7.1) */ { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd say this is a bit overkill for now. There are a ton of places like this in core and in plugins, and I don't think it's worth adding this to all of them. So I would say let's stick with what we can add at this point.
…_ttl() Co-authored-by: Felix Arntz <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
03f3252
into
feature/image-loading-optimization
Thanks all for the great reviews! |
Summary
See #871.
Relevant technical choices
A new
ilo_url_metrics
post type is introduced which stores the client-submitted URL metrics as JSON inpost_content
, similar to thecustomize_changeset
andoembed_cache
post types. Thepost_name
is a MD5 hash of the query vars from the$wp->query_vars
global after normalization. (Using the WP query vars is safer than using the URL due to random additional query vars added by 3rd parties, such as analytics.)Collected URL metrics are segmented into viewport breakpoints. By default it attempts to collect a sample size of three URL metrics for each breakpoint. The default breakpoint is 480, which means there will be a total of 6 URL metrics collected: 3 for mobile for viewport widths from 0 to 480, and 3 for desktop for viewport widths larger. The breakpoints can be modified with the
ilo_breakpoint_max_widths
filter, and the sample size can be filtered withilo_url_metrics_breakpoint_sample_size
. Each URL metric is stored with the timestamp at which it was captured. When a URL metric is older than a given freshness TTL (filtered byilo_url_metric_freshness_ttl
), a new URL metric will be attempted to be measured for that viewport breakpoint. New URL metrics for a given breakpoint push out the oldest URL metrics for that breakpoint.When a client sends a URL metric, that client is locked from sending another URL metric for one minute (filtered by
ilo_metrics_storage_lock_ttl
). Client-side logic prevents another request from being made viasessionStorage
, while a server-side check with the REST API is also done using the client IP address. Similarly, if no URL metrics are needed for the current viewport (given the breakpoints), no measurement will be performed and no data will be sent; likewise, the server will also reject any submission for viewport URL metrics it doesn't need.REST API requests are allowed by unauthenticated visitors, but the parameters of the request (namely the WP query vars that the URL metrics are for) are authenticated with an HMAC via a WP nonce.
Other points:
image_loading_optimization_
prefix has been shorted toilo_
.hooks.php
has been moved todetection.php
.To test facilitate testing, turn on
WP_DEBUG
and playing around with filters to disabling locking, zeroing out the freshness TTL, and changing the breakpoint sample size, for example:Checklist
[Focus]
orInfrastructure
label.[Type]
label.no milestone
label.