-
Notifications
You must be signed in to change notification settings - Fork 108
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve URL handling in Optimization Detective #1043
Improve URL handling in Optimization Detective #1043
Conversation
The following accounts have interacted with this PR and/or linked issues. I will continue to update these lists as activity occurs. You can also manually ask me to refresh this list by adding the If you're merging code through a pull request on GitHub, copy and paste the following into the bottom of the merge commit message.
To understand the WordPress project's expectations around crediting contributors, please review the Contributor Attribution page in the Core Handbook. |
02aab4d
to
56f2ebe
Compare
….com/WordPress/performance into update/url-handling
…om/WordPress/performance into update/url-handling
* | ||
* @return string Current URL. | ||
*/ | ||
function od_get_current_url(): string { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Inspired by the same function in the AMP plugin: https://github.com/ampproject/amp-wp/blob/5708e19c60cf0f88ce522349fba955266b6892f6/includes/amp-helper-functions.php#L617-L662
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@westonruter LGTM, just a few small questions.
if ( isset( $_SERVER['REQUEST_URI'] ) ) { | ||
// phpcs:ignore WordPress.Security.ValidatedSanitizedInput.InputNotSanitized | ||
$current_url .= ltrim( wp_unslash( $_SERVER['REQUEST_URI'] ), '/' ); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thinking about my favorite edge-case of sites within a subdirectory, this works as expected, right? I guess since $_SERVER['REQUEST_URI']
would include the full path, including the part that's potentially part of home_url()
, this should be fine. Just want to make sure since the function doesn't otherwise consider a path that may be part of home_url()
already.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, REQUEST_URI
should include the full path including the subdirectory root path. Otherwise, this function here only uses home_url()
to obtain the default scheme
, host
, and port
. The path
is not used.
// phpcs:ignore WordPress.Security.ValidatedSanitizedInput.InputNotSanitized | ||
$current_url .= ltrim( wp_unslash( $_SERVER['REQUEST_URI'] ), '/' ); | ||
} | ||
return esc_url_raw( $current_url ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why does this use esc_url_raw()
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, humm, since potentially malicious data is present in REQUEST_URI
. Perhaps this is overkill, but it has worked well in the AMP plugin.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would think this is already handled by $wpdb->prepare()
when storing the value. But in any case, doesn't hurt I guess.
@@ -190,7 +190,7 @@ static function ( OD_URL_Metric $url_metric ): array { | |||
}, | |||
$group_collection->get_flattened_url_metrics() | |||
), | |||
JSON_PRETTY_PRINT | JSON_UNESCAPED_SLASHES // TODO: No need for pretty-printing. | |||
JSON_UNESCAPED_SLASHES // No need for escaped slashes since not printed to frontend. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Completely unrelated question: Why is this relevant? Do frontend slashes have to be escaped? I'm asking since the Speculation Rules plugin JSON output contains backslashes which Domenic suggested to remove in my Speculation Rules post draft code example. I'm confused about whether those should or shouldn't be there. 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good question. The slashes should always be present when the JSON is inline in HTML, as this prevents malicious strings from breaking out of the containing script
. For example, if someone somehow got </script>
into the data being serialized to JSON, this would prematurely terminate the script
element, whereas if slashes are escaped, then <\/script>
would not.
In the context of post_content
here where the JSON is never being printed to HTML, there is no need for this escaping.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Er, so that should probably be restored to Speculation Rules 😊
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, thanks for explaining. I believe Speculation Rules already does this correctly then, feel free to check. I was referring to the Make Core post draft I've been working on.
18d36fe
into
feature/image-loading-optimization
post_title
of theod-url-metrics
post so there is apparently duplication with the URLs stored in thepost_content
. However, this may not be the case since the URL metrics being stored the post are grouped because they share the same normalized query vars. Each of the URL metrics have have a slightly different actual URL (although they should share the same canonical URL). Storing the original URL with each URL metric can assist with debugging issues with query var normalization.post_content
for URL Metrics posts.Checklist
[Focus]
orInfrastructure
label.[Type]
label.no milestone
label.