-
Notifications
You must be signed in to change notification settings - Fork 4.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add script module data implementation #61658
Conversation
This pull request has changed or added PHP files. Please confirm whether these changes need to be synced to WordPress Core, and therefore featured in the next release of WordPress. If so, it is recommended to create a new Trac ticket and submit a pull request to the WordPress Core Github repository soon after this pull request is merged. If you're unsure, you can always ask for help in the #core-editor channel in WordPress Slack. Thank you! ❤️ View changed files❔ lib/experimental/script-modules.php |
003a1d7
to
4fbc4a9
Compare
The following accounts have interacted with this PR and/or linked issues. I will continue to update these lists as activity occurs. You can also manually ask me to refresh this list by adding the If you're merging code through a pull request on GitHub, copy and paste the following into the bottom of the merge commit message.
To understand the WordPress project's expectations around crediting contributors, please review the Contributor Attribution page in the Core Handbook. |
$get_marked_for_enqueue = new ReflectionMethod( 'WP_Script_Modules', 'get_marked_for_enqueue' ); | ||
$get_import_map = new ReflectionMethod( 'WP_Script_Modules', 'get_import_map' ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems like the best way to access the necessary data right now, these methods are private.
It's easier in the Core PR (WordPress/wordpress-develop#6433) where the action can invoke a class method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://core.trac.wordpress.org/ticket/60597 - they should be exposed as public methods as explained by @johnbillion. It's exactly what you encountered when trying to enhance the current implementation.
See also: https://core.trac.wordpress.org/ticket/60597#comment:9
Alrighty let's bump this. I should be able to work around this via the magic of the reflection API for now.
So I guess, it's a way to go here as well for now.
lib/experimental/script-modules.php
Outdated
/** | ||
* @since 18.4.0 | ||
*/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Needs a proper description.
lib/compat/wordpress-6.5/interactivity-api/class-wp-interactivity-api.php
Outdated
Show resolved
Hide resolved
4268051
to
5ef7976
Compare
5543059
to
28cb271
Compare
*/ | ||
public function print_client_interactivity_data() { | ||
public function print_client_interactivity_data( $interactivity_data ) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The name would no longer fit here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can roll back all of the Interactivity API changes from this PR. It could be landed separately, or when this proposal makes it to core, or not at all.
It is useful for demonstration purposes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's fine to deprecate this name if necessary. I wouldn't worry too much about it. Whatever is necessary to get to the point where we have a single way to expose data from the server 😄
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you know the policy for these compatibility implementations? Maybe this function can just stay the same because it should be removed soon, but deal with renaming and deprecation in Core when the time comes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This code path gets executed in WP 6.4 which is highly unlikely to be still the case for most sites using the Gutenberg plugin after a few weeks from WP 6.5 release date. Well, at least in theory, that's the goal for using Gutenberg plugin to be always on the latest versions of both the plugin and WP core.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm going to revert these changes. They demonstrate that the change works but it's not ideal to change a public function like this (and it's unlikely to be used anyways).
$json_encode_flags = JSON_HEX_TAG | JSON_UNESCAPED_SLASHES; | ||
} | ||
|
||
wp_print_inline_script_tag( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the advantage of using a script tag per module? That's closer to what we existed before, but it also means that a similar logic would need to be present to every possible module that wants to consume some data. What if several modules need the same data?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a good question and it came up on the proposal. I did share some ideas there, but I'm going to do some profiling so I can share some numbers and better understand the tradeoffs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have two main thoughts about this, one is related to usage and the interface, the other is related to performance. I'll break them into comments, here's the usage/interface bit:
What if several modules need the same data?
If we look at how inline scripts are used to provide data to Core Scripts, it doesn't seem like there's a lot of overlap. If things are well encapsulated and provide the functionality we need, we don't need to send the same data to many scripts. apiFetch
takes care of the API, other scripts use apiFetch
, and internally they don't need to worry about the REST URL because apiFetch
deals with that.
If a lot of different things need to know about some piece of data, maybe they should be getting it from the REST API.
Finally, if we reach a point where a lot of different modules really do need to access the same data, and that data needs to be immediately available, is essential, or it is very specific to the current page view, then we could do that within this model pretty easily by a "provider module". Consider:
add_filter( 'scriptmoduledata_@package/data-provider', function ( $data ) { $data[ 'veryInteresting' ] = 'Indeed!'; return $data } );
// @package/data-provider
let data = null
try {
const text = dom.getElementById( 'wp-scriptmodule-data_@wordpress/interactivity' )?.textContent;
if ( text ) {
data = JSON.parse( text );
}
} catch {}
export { data };
// @package/my-other-module
import { data } from '@package/data-provider';
if ( data?.veryInteresting ) {
console.log( "Interesting? %s", data?.veryInteresting ); // "Interesting? Indeed!"
}
Here the data provider module can parse the data once and make it available to other scripts that depend on it. A single module is responsible for reaching into the DOM element, parsing the data, heck maybe it parses it with a schema 🤷 , but that module is responsible for dealing with the data from the server and providing it to other modules. This also provides a nice, single filter for that data that is sent into this module.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Part 2! I wanted to understand how the browser behaves differently in each of these cases. When there's very little data embedded in the page, it's unlikely to have any significant impact on the page.
I expected multiple script tags to perform better generally, but I was surprised by the results. I suspect there may be some unusual things going on with the large amount of memory in use.
In general, it seems like multiple script tags are parsed more quickly by the browser, the
individual JSONs are parsed faster, and even when parsing all of the JSON split up into multiple
script tags it outperforms a single script tag.
I'll do a part 3 with a smaller pages to see if these results are consistent.
I generated a few pages with the same data set either in a single script tag or divided in multiple script tags. The items in the data looked like this, each with 1000 properties:
{
"878cac28b636400d": 0,
"3dd5389abcb79f72": 1,
"…": 1000
}
This was either in a single script tag or divided into multiple script tags:
<!-- single -->
<script id="single" type="application/json">
{
"1982e3a7bb241055": { "878cac28b636400d": 0, "…": 999 },
"65cd25028f98f158": { "d0698444ed39c832": 0, "…": 999 }
"…10,000 items": {}
}
</script>
<!-- multiple -->
<script id="1982e3a7bb241055" type="application/json">
{ "878cac28b636400d": 0, "…": 999 }
</script>
<script id="65cd25028f98f158" type="application/json">
{ "d0698444ed39c832": 0, "…": 999 }
</script>
<script id="…10,000 tags" type="application/json">
{}
</script>
The HTML was otherwise minimal. The pages contained a script module tag that would simulate getting embedded data from a script tag.
Some stats on the pages:
Scripts | JSON total bytes | HTML total bytes | Script tags |
---|---|---|---|
Single | 229,110,001 | 229,110,626 | 1 |
Multiple | 228,897,112 | 229,550,577 | 1,000 |
difference | -212,889 | +439,951 | +999 |
These were plain HTML pages stored on my computer and served via php -S localhost:9090
.
I opened Chrome and did some page load performance profiling with a 6x CPU slowdown. I expected multiple to outperform single, but I was surprised by the results when I started testing so I added multiple-all to try to
- Single - load the page, find the DOM element, parse its JSON and find a value (2 levels deep).
- Multiple - load the page, find the DOM element, parse its JSON and find a value (1 levels deep).
- Multiple-all - load the page, find all the
script[type="application/json"]
DOM elements, parse
the JSON for each, get it's first key and add it to an array.
Scripts | Network | FCP | DCL | parse |
---|---|---|---|---|
Single | 23.98s | 27.25s | 49.36s | 21.45s |
Multiple | 5.711s | 5.77s | 5.77s | 1.27ms |
Multiple-all | 5.681s | 5.74s | 23.04s | ≈ 17.3s |
I'm very surprised about the Network time, I'm not sure why the network request takes so long for
the single page. I tried several different servers, but it was consistent.
DCL (DOMContentLoaded) first after the module has executed, so at that point all parsing is
complete. We can clearly see that parsing all the data from a single JSON is very costly. I did not
calculate the sum of all the parse time on multiple-all, but I was surprised to see that the script
finishes much faster than the single script (23.04s - 5.74s ≈ 17.3s vs 21.45s) when it's doing more
work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With a much smaller amount of total JSON, I observe similar results. This time with 10 objects of 1000 elements, ~228910 bytes of JSON. (network was negligible here)
Scripts | FCP | DCL | parse |
---|---|---|---|
Single | 91.0ms | 54.1ms | 12.38ms |
Multiple | 65.2ms | 36.0ms | 0.77ms |
Multiple-all | 89.8ms | 53.0ms | – |
Multiple, unsurprisingly, outperforms single. Multiple-all is similar to single but it still seems to perform better.
It seems clear to me that multiple script tags are the way to go.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Excellent analysis. It all makes perfect sense now that you shared the summary of your findings. In particular having multiple JSON objects optimizes the access as it only needs to process the data that is necessary at a given time. I also now remember that there were some performance issues with parsing serialized very large JSON objects. That is probably reflected in multiple-all scenario.
What was again the reasoning behind using the script tag to share data? I was wondering if we could use instead the data attribute on the script tag since it has to be parsed anyway on-demand. We have data to support the
reasoning of being in favor the approach where we expose the data scoped for every individual ES Module. So essentially, I would be curious to see how consolidation of exposed data on the connected script tag would impact the result - smaller HTML size? Easier printing by connecting with a single script tag?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I realized one thing after some more thinking. When the module script is only listed in the import map, we don't print the script tag. So we rather should discard this idea, as it would only work better in the scenario when the script tag gets printed, but it would make it all more confusing otherwise.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think script tags are a better choice than attributes. The contents of these script tags are practically ignored by the browser, whereas attributes will be subject to more parsing rules. This makes the encoding/escaping more difficult and likely implies more work for the browser.
But the big thing is that there's no obvious place that attribute should go. Modules may appear in their own script tags (if enqueued) or in the importmap (if they're dependencies) or in both places. It's not clear which of those places they should go. Given that, I think there would probably need to be some other tag for this and <script type="application/json">
seems largely ideal to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was surprised at some of the perf findings and asked @sgomes to check my findings. In summary, it looks like the performance differences I was finding were not as drastic as it seems, this was likely due to some deoptimizations from the Chrome devtools.
With some adjustments to the benchmarks and other testing methodology, results are much closer. Now it looks like multiple script tags and parsing all of the script tags is about ~6% slower than parsing a single big script tag. This is more in line with what we might expect. This is 6% slower on a very fast operation and not likely cause for concern (a is single, b is multi-all):
Test results for old version (http://127.0.0.1:38080/a.html)
Mean: 278.28ms
Standard deviation: 4.02ms
Slowest measurement: 289.7ms
Fastest measurement: 272.1ms
Test results for new version (http://127.0.0.1:38080/b.html)
Mean: 296.2ms
Standard deviation: 3.25ms
Slowest measurement: 303ms
Fastest measurement: 290.4ms
The new version appears to be SLOWER than the old version.
The new version appears to be 6.44% slower (takes 106.44% of the time of the old version).
Other reasons remain to prefer multiple script tags. For example, it's unlikely that all the data for all the modules will be used immediately on page load, in which case it makes sense to only parse the desired data and not parse data we're not interested in over and over.
Sharing some additional context to think about. Regular scripts offer a formal API to add some metadata to registered script handles with https://developer.wordpress.org/reference/functions/wp_script_add_data/: wp_script_add_data( 'twentytwenty-js', 'strategy', 'defer' );
wp_script_add_data( 'twentysixteen-html5', 'conditional', 'lt IE 9' ); However, that would go against my other comment #61658 (comment) where I'm asking whether the data should be always connected to a module id. |
28cb271
to
a1d2c2a
Compare
My only remaining feedback is that the following could get further abstracted at some point: add_filter(
'scriptmoduledata_@wordpress/interactivity',
function ( $data ) {
$data['my-added-data'] = 'it works';
return $data;
}
); Something closer to: wp_script_module_add_data( '@wordpress/interactivity', function ( $data ) {
$data['my-added-data'] = 'it works';
return $data;
} ); The other question is whether |
I'm open to discussing this further, but the filter seemed like a perfect fit for what's needed and it seemed like a WordPress way of doing things. Is this question more about adding a helper method that would add the appropriate filter, or is it about the filter itself? First, I implemented a function that would just set the data for a module. The module data would be stored by the Script Module class. Then I noticed that some places set or add data conditionally from different places, so I thought maybe we'd need to merge the data, or provide Then I noticed that we only print data for the modules that may be used on the page, so we only need to worry about the data for some modules. Why even bother collecting and storing the data we'll never use? Additionally, we can do this late in the page lifecycle which makes a filter pretty easy to use. With the per-module filter, we don't have to store the data and only process what we need, it seems like an ideal fit.
This is related to what I mentioned above, one reason I like the per-module filter is that we only deal with the data we need to process. The data is created, filtered, printed if necessary, then discarded. The data is scoped to the function (and filters) that use it for a single module. If we change this model, then we'll need to start storing the data. We may get into situations where the order of processing is important - and not filter weights but problems where module A gets processed before module B. It just seems to invite complication when this feature has opportunities to remain small and focused. |
I checked at the related functionality for scripts and styles that have existed in the WordPress Core for a long time. I don't think we can conclude that filters are the primary way to add more functionality. See in particular the following helper methods: https://developer.wordpress.org/reference/functions/wp_script_add_data/ I shared some examples in #61658 (comment) for scripts, and here are some other styles: wp_style_add_data( 'twentytwenty-style', 'rtl', 'replace' ); // Activates RTL version of the style.
wp_style_add_data( $style_handle_name, 'path', $style_path_norm ); // Stores the path to the CSS file to use to inline styles. Sometimes, it's better to use a registry to hold the additional information like it's done with scripts and styles. Hooks are convenient and could be used for nearly everything, but we shouldn't default to them when considering the core functionality. |
@gziolo Would you like me to build out a |
I offered my feedback for your and other reviewers consideration. I did some additional research based on the patterns I observed while working with styles and scripts. The final API isn't something that has a clear path forward, as it all depends on what the context will be when this new API is going to be used - from WP core, from the plugin, or from a theme. |
I agree. My preference at this time would be to keep only the filter. That is the simplest form and should be completely compatible with an approach that has another function and stores the data associated with modules. |
a1d2c2a
to
284568e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great work @sirreal 👏
I reviewed the PR and the accompanying Proposal. In terms of the technical implementation, I have no comments other than the one above. Happy to approve once that's fixed.
I did initially wonder if we have sufficient use cases to make this a general API. The comment mentions the following packages that could benefit from it:
wp-api-fetch
sets up middleware.wp-date
sets up localization, timezone, and format data.wp-i18n
sets text direction locale data.
So my doubts are largely satisfied 🙂.
Use it for Interactivity API store data passing
Co-authored-by: Bernie Reiter <[email protected]>
284568e
to
becb188
Compare
Flaky tests detected in becb188. 🔍 Workflow run URL: https://github.com/WordPress/gutenberg/actions/runs/9301601226
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approving this, to unblock it from being merged, which might be time-sensitive at this point.
While I haven't personally given the PR a thorough review, I'm basing my approval on @michalczaplinski's review.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @sirreal ! This looks good to me! 👍
These demonstrate that the proposed changes work and are adequate for Interactivity API, but change a public function signature and implementation.
PR adding the data passing to Core (analogous to this implementation): WordPress/wordpress-develop#6682 PR updating Core Interactivity API to use this data passing: |
$get_import_map = new ReflectionMethod( 'WP_Script_Modules', 'get_import_map' ); | ||
|
||
$modules = array(); | ||
foreach ( array_keys( $get_marked_for_enqueue->invoke( wp_script_modules() ) ) as $id ) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm getting an error on this line. Seems it's because I'm running PHP 7.4. We support down to 7.2 according to #60680.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I'll prepare a fix.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a `scriptmoduledata_{ MODULE_ID }` filter that allows data to be embedded in page HTML for use by Script Modules. For example: add_filter( 'scriptmoduledata_MyScriptModuleID', function ( array $data ): array { $data['doesIt'] = 'it works'; return $data; } ); See the proposal for details: https://make.wordpress.org/core/2024/05/06/proposal-server-to-client-data-sharing-for-script-modules/
Add a `scriptmoduledata_{ MODULE_ID }` filter that allows data to be embedded in page HTML for use by Script Modules. For example: add_filter( 'scriptmoduledata_MyScriptModuleID', function ( array $data ): array { $data['doesIt'] = 'it works'; return $data; } ); See the proposal for details: https://make.wordpress.org/core/2024/05/06/proposal-server-to-client-data-sharing-for-script-modules/
What?
Add a filter that allows data to be embedded in the page HTML for use by Script Modules.
Use the system to pass Interactivity API data.
Adds a new filter:
scriptmoduledata_{ MODULE_ID }
. If the filter returns non-empty data, it will be embedded in a script tag in the page HTML.If this moves into Core, we should be able to use the same filter name and disable the filter based on the presence of the core function name.
See this proposal for details: Proposal: Server to client data sharing for Script Modules
Why?
All the details are in the proposal.
Testing Instructions
This can be tested by anything using the Interactivity API with a server-provided store or configuration. The data should be embedded in an HTML tag on the page.
It should be covered by Interactivity e2e tests.