Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HTML API: Introduce safe HTML templating. #5949

Open
wants to merge 4 commits into
base: trunk
Choose a base branch
from

Conversation

dmsnell
Copy link
Member

@dmsnell dmsnell commented Jan 25, 2024

Trac ticket: Core-60229

🚧👷‍♂️🏗️ This feature is currently being proposed for the WordPress 6.6 release cycle, being pushed back from 6.5 to allow for more time to let the design work proceed.

Todo

  • embed replacement inside the Tag Processor
  • don't allow replacement of escaped funky comment syntax (currently occurs inside attribute value based on how this is built at the moment)
  • figure out what rules need to apply for nested HTML
  • figure out how to differentiate nested HTML without requiring sigils or other extra syntax

Description

Introduces a function providing context-aware auto-escaping HTML templating.

  • generate HTML markup in PHP without needing to escape anything.
  • replace placeholder content with data provided by PHP.
  • supply true to create a boolean attribute or false/null remove an attribute.

Why use these strange funky comments?

A funky comment looks like a tag closer, but the first character of what would be the tag name is a symbol. This was first mentioned in an HTML API progress report: there's a particular kind of HTML syntax error that meets a number of needs we have seen in attempting to replace dynamic content on the server. Namely:

  • Funky comments cannot be nested, by construction.
  • once inside a funky comment, all bytes are allowed until the first ASCII >, making parsing trivial and reliable.
  • The first symbol provides a very convenient sigil form to differentiate multiple kinds of bits of content: a placeholder, a shortcode, a translation, etc…
  • These are pure HTML and not a new superset syntax of HTML, meaning that when editing in an HTML editor they should stand out properly.
  • Being existing HTML syntax, they cannot break syntax boundaries and cause further parsing problems down the line
  • They are concise and easily hand-written.

What's in this patch?

This preliminary PR only renders child text and doesn't provide a mechanism for composing rendered HTML or rendering pre-processed HTML. That means everything passed as a child will be escaped to render verbatim in the browser. There are open questions about how best to represent nested HTML, and without a clear answer, this code prefers trust and safety over features.

  • does not render nested HTML (escapes everything)
  • relies on esc_attr() and esc_html() which today are the same thing in WordPress, but which could be greatly expanded to improve the overall escaping situation
echo WP_HTML_Template::render(
	<<<HTML
<a href="</%url>">
	<img src="</%url>">
	</%url>
</a>
HTML,
	array( 'url' => 'https://s.wp.com/i/atat.png?w=640&h=480&alt="atat>atst"' ),
);

outputs

<a href="https://s.wp.com/i/atat.png?w=640&amp;h=480&amp;alt=&quot;atat&gt;atst&quot;">
<img src="https://s.wp.com/i/atat.png?w=640&amp;h=480&amp;alt=&quot;atat&gt;atst&quot;">
https://s.wp.com/i/atat.png?w=640&amp;h=480&amp;alt=&quot;atat&gt;atst&quot;
</a>

This proposed templating syntax uses closing tags containing invalid tag names, so-called "funky comments," as placeholders, because they are converted to HTML comments in the DOM and because there is near universal existing support for them in all browsers, and because the syntax cannot be nested. The % at the front indicates that the value for the placeholder should come from the args array with a key named according to what follows the %.

@dmsnell dmsnell force-pushed the html-api/render-function branch 2 times, most recently from 0145fef to d275ba3 Compare January 25, 2024 23:59
@dmsnell dmsnell force-pushed the html-api/render-function branch from d275ba3 to 6d712f1 Compare January 30, 2024 23:11
Copy link

Test using WordPress Playground

The changes in this pull request can previewed and tested using a WordPress Playground instance.

WordPress Playground is an experimental project that creates a full WordPress instance entirely within the browser.

Some things to be aware of

  • The Plugin and Theme Directories cannot be accessed within Playground.
  • All changes will be lost when closing a tab with a Playground instance.
  • All changes will be lost when refreshing the page.
  • A fresh instance is created each time the link below is clicked.
  • Every time this pull request is updated, a new ZIP file containing all changes is created. If changes are not reflected in the Playground instance,
    it's possible that the most recent build failed, or has not completed. Check the list of workflow runs to be sure.

For more details about these limitations and more, check out the Limitations page in the WordPress Playground documentation.

Test this pull request with WordPress Playground.

@gziolo
Copy link
Member

gziolo commented Feb 16, 2024

I was chatting with @nerrad about createInterpolateElement implemented in @wordpress/element JavaScript package. The related dev note: https://make.wordpress.org/core/2020/07/17/introducing-createinterpolateelement/. What are your thoughts on offering a similar experience in PHP? For the context, I was wondering what it would require to reuse the same templating system in JavaScript.

The example from the dev note illustrates it better:

import { __ } from '@wordpress/i18n';
import { createInterpolateElement } from '@wordpress/element';
import { CustomComponent } from '../custom-component.js';
 
const translatedString = createInterpolateElement(
  __( 'This is a <span>string</span> with a <a>link</a> and a self-closing <custom_component/>.' ),
  {
    span: <span class="special-text" />,
    a: <a href="https://make.wordpress.org"/>,
    custom_component: <CustomComponent />,
  }
);

Copy link

github-actions bot commented Feb 16, 2024

The following accounts have interacted with this PR and/or linked issues. I will continue to update these lists as activity occurs. You can also manually ask me to refresh this list by adding the props-bot label.

Unlinked Accounts

The following contributors have not linked their GitHub and WordPress.org accounts: @deeemsad.

Contributors, please read how to link your accounts to ensure your work is properly credited in WordPress releases.

Core Committers: Use this line as a base for the props when committing in SVN:

Props dmsnell, gziolo.

To understand the WordPress project's expectations around crediting contributors, please review the Contributor Attribution page in the Core Handbook.

@dmsnell
Copy link
Member Author

dmsnell commented Feb 16, 2024

What are your thoughts on offering a similar experience in PHP?

This is an interesting idea @gziolo but my initial reaction is wondering what this provides beyond the proposed templating. It feels different than JavaScript because we're already dealing with actual HTML strings. I'd be curious to see some of the tradeoffs this would bring.

In contrast, with the ability to "spread" attributes there's a built-in opportunity to make HTML tags placeholders. <span ...spanAttributes>something</span>. Also, if we pass things in like as with JavaScript, we're parsing two strings for each placeholder instead of one. In JavaScript all we're doing is directly creating an object for the replacement, but in PHP we're still creating a string.

@gziolo
Copy link
Member

gziolo commented Feb 20, 2024

In contrast, with the ability to "spread" attributes there's a built-in opportunity to make HTML tags placeholders. <span ...spanAttributes>something.

Right, that's a good point. The same functionality is available but with a different syntax that is, in fact, a bit closer to what people familiar with JSX would expect.

*
* ### Substitution Placeholders.
*
* - `</%named_arg>` finds `named_arg` in the arguments array, escapes its value if possible,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

% was chosen as the funky comment character because it's what sprintf uses?


// Replace entire attributes if their content is exclusively a placeholder, e.g. `title="</%title>"`.
$full_match = null;
if ( preg_match( '~^</%([^>]+)>$~', $value, $full_match ) ) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

</%> is a funky comment that could set the "" (empty string) replacement. Was this a conscious decision to disallow empty string? I think in the HTML funky comment replacement it would handle this case.

I'm asking because I think the placeholder could be zero or more:

Suggested change
if ( preg_match( '~^</%([^>]+)>$~', $value, $full_match ) ) {
if ( preg_match( '~^</%([^>]*)>$~', $value, $full_match ) ) {

The tag and attribute handling should behave consistently on this point, either empty funky comments are allowed or disallowed everywhere.

(There are several more places where this change would be necessary.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants