[WIP] Try: Use JSON to encode block attribute values #1088

westonruter · 2017-06-09T03:56:25Z

Depends on #1087, among many other things, including a PHP parser generator for the PEG.js grammar to be utilized in the do_blocks PHP function.

Fixes #1086.

nb · 2017-06-12T14:33:25Z

❤️ the idea, I feel this will make things a lot simpler on so many levels:

for just a couple of attributes it’s almost as pretty;
allows more sophisticated attribute structures;
has some type support as it differentiates between numbers/strings/booleans/arrays/key-value pairs;
makes parsing trivial (and maybe faster);
makes serialization simpler and more accessible for other tools, plenty of JSON encoders there;
solves some escaping/quoting problems we already have.

dmsnell · 2017-06-12T14:42:56Z

thanks @nb for the comments.

for just a couple of attributes it’s almost as pretty;

I'm not sure I agree on this

name="Alice" age=25 vs. {"name":"Alice","age":25}

allows more sophisticated attribute structures;

for sure. of course, simply storing JSON in the existing attributes lets us do this already

has some type support as it differentiates between numbers/strings/booleans/arrays/key-value pairs;
makes serialization simpler and more accessible for other tools, plenty of JSON encoders there;

yes. 👍
(although still easily possible by just storing JSON in an existing named attribute)

makes parsing trivial (and maybe faster);

I doubt this would speed up the parse because we'd be moving from native JSON parsing in the browser (really really fast) to parsing in JavaScript.

solves some escaping/quoting problems we already have.

yeah I think so, through I don't see escaping as a big deal. there should be almost no exceptions on escaping that I can think of (single and double quote characters) so I'm not sure this is a big advantage here. more so it brings some new challenges because while JSON and POJOs (plain-old JavaScript objects) are similar, JSON has more constraints and we'd need to make sure we carefully consider which specifications we want to implement.

thanks!

nb · 2017-06-12T14:54:15Z

I'm not sure I agree on this

Almost :)

for sure. of course, simply storing JSON in the existing attributes lets us do this already

Adds another branch – you know the long-term effect on logic branching, especially at such a higher level.

types & serialization
yes. 👍
(although still easily possible by just storing JSON in an existing named attribute)

This is one of the strongest points for me – parsing consistency (in baba="1" is this a true or a 1 or the string one). Also, if I want to insert a block via the REST API from a totally different language, I don’t have a consistent and familiar way to do it. Maybe something in the vein of htmlspecialchars in PHP (or HTML escaping in any other language) will do it, but my developer confidence wouldn’t be very high. It’s so much easier to communicate to everybody – “JSON!” and it’s all 100% clear.

dmsnell · 2017-06-15T09:46:04Z

Proposed by @nb maybe online somewhere or maybe in person…

<!-- wp:core/text {
  description: "content is a valid JSON object or POJO"
  terms: {
    POJO: "Plain-old-javascript object. Like JSON but with fewer double-quotes"
  }
} -->
Just some content
<!-- /wp:core/text -->

It would be nice to know what kind of "gotcha"s we might encounter doing unserialized JSON/JS in the HTML comment. Further, should we include @westonruter's import of the JSON grammar? (at first I thought probably not, but now that we're discussing actual unescaped JSON I could easily see it being valuable)

What happens if the JSON parse fails but the block otherwise would be good?
What do we need to do to prevent breaking out of the HTML comment? (e.g. { text: 'End a comment with "-->"' } )

sirreal · 2017-06-15T10:52:01Z

Spec

The HTML spec on comments is pretty quick to read:

Comments must start with the four character sequence U+003C LESS-THAN SIGN, U+0021 EXCLAMATION MARK, U+002D HYPHEN-MINUS, U+002D HYPHEN-MINUS ().

I'll attempt to TL;DR:

start with <!--
start sequence isn't immediately followed by > or ->, ie  are not valid start sequences.
not contain --
not end with -, meaning ---> is an invalid closing sequence.
end with -->

Here's an example HTML to highlight problematic comments which can be pasted into a validator:

<!doctype html>
<html>
<head><title>Title</title></head>
<body>
<!-- VALID -->
<!--
  <Everything/> else is -\- `{ fair: "game" }`.
  Starting and ending with fixed character such as " " is a good practice.
  The remaining problem is two sequential "-" characters.
-->

<!--> INVALID -->
<!---> INVALID -->
<!-- INVALID -- -->
<!-- INVALID --->
</body>
</html>

Suggestions

The peculiarities of opening and closing sequences are easily mitigated by opening comments with something static, this appears to be either or / currently, so there shouldn't be any problems.

That means the only invalid sequence that concerns us is --. I'm not sure what the best way of escaping this is.

A few proposals to consider are:

Considering objects and arrays, I don't believe - can appear anywhere outside of a string. That would mean that -- could be escaped as -\-, identical in Javascript strings and legal comment text content
The unicode sequence is another alternative \u002D.

"--" === "-\-"          // true
"--" === "\u002D\u002D" // true

dmsnell · 2017-06-15T10:59:53Z

@sirreal thanks so much for finding and summarizing the spec. looks to me like replacing the hyphens with the Unicode literal would be best. we could run into other troubles with strings of dashes, like -------

though I would propose we use \u{2d} instead of \u002d even though they have the same number of characters. personal style is all.

nb · 2017-06-15T21:18:46Z

@dmsnell, @sirreal are there any notes from the work done today in person?

dmsnell · 2017-06-16T13:25:26Z

@nb after talking with you and fighting your very good challenges, we arrived on this transform…

const serializeAttrs = attrs => JSON.stringify( attrs )
	.replace( /--/g, '\\u002d\\u002d' ) // don't break HTML comments
	.replace( /</g, '\\u003c' ) // don't break standard-non-compliant tools
	.replace( />/g, '\\u003e' ) // ibid
	.replace( /&/g, '\\u0026' ) // ibid

nb · 2017-06-16T14:46:35Z

❤️

And unserializeAttr is still just JSON.parse?

dmsnell · 2017-06-16T14:49:44Z

@nb check out #1213 where I have implemented this idea

And unserializeAttr is still just JSON.parse?

yep, though it happens in the parser and the app need not change it. this was a choice I made because I didn't want the parser to produce an output which needed further processes sing to be a basic reasonable output

westonruter · 2017-06-16T16:41:11Z

Closing in favor of #1213. Love the outcome here guys. 👍

Use JSON to encode block attribute values

e6f2671

dmsnell mentioned this pull request Jun 11, 2017

Parser: Allow escaped quotes in attribute strings #1130

Merged

dmsnell mentioned this pull request Jun 15, 2017

Refactor serializer: rearrange code for clarity and introspection #1148

Merged

dmsnell mentioned this pull request Jun 16, 2017

Parser: Store attributes in unescaped JSON format #1213

Merged

westonruter closed this Jun 16, 2017

westonruter deleted the try/json-attribute-value-encoding branch June 16, 2017 16:41

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[WIP] Try: Use JSON to encode block attribute values #1088

[WIP] Try: Use JSON to encode block attribute values #1088

westonruter commented Jun 9, 2017

nb commented Jun 12, 2017

dmsnell commented Jun 12, 2017

nb commented Jun 12, 2017

dmsnell commented Jun 15, 2017

sirreal commented Jun 15, 2017

dmsnell commented Jun 15, 2017

nb commented Jun 15, 2017

dmsnell commented Jun 16, 2017

nb commented Jun 16, 2017

dmsnell commented Jun 16, 2017

westonruter commented Jun 16, 2017

[WIP] Try: Use JSON to encode block attribute values #1088

[WIP] Try: Use JSON to encode block attribute values #1088

Conversation

westonruter commented Jun 9, 2017

nb commented Jun 12, 2017

dmsnell commented Jun 12, 2017

nb commented Jun 12, 2017

dmsnell commented Jun 15, 2017

sirreal commented Jun 15, 2017

Spec

Suggestions

dmsnell commented Jun 15, 2017

nb commented Jun 15, 2017

dmsnell commented Jun 16, 2017

nb commented Jun 16, 2017

dmsnell commented Jun 16, 2017

westonruter commented Jun 16, 2017