Add code unit prefix, less than, and general sorting #248

Merged
merged 3 commits into from
May 10, 2019
106 changes: 104 additions & 2 deletions infra.bs
@@ -8,13 +8,14 @@ Translation: ja https://triple-underscore.github.io/infra-ja.html
</pre>

<pre class="anchors">
urlPrefix: https://tc39.github.io/ecma262/; spec: ECMA-262;
urlPrefix: https://tc39.github.io/ecma262/#; spec: ECMA-262;
type: dfn
text: %JSONParse%; url: sec-json.parse
text: %JSONStringify%; url: #sec-json.stringify
text: %JSONStringify%; url: sec-json.stringify
text: List; url: sec-list-and-record-specification-type
text: The String Type; url: sec-ecmascript-language-types-string-type
type: abstract-op; text: Call; url: sec-call
type: method; for: Array; text: sort(); url: sec-array.prototype.sort
</pre>


@@ -544,6 +545,73 @@ implementations of just <a>JavaScript strings</a> for performance and memory rea

<hr>

<p>A <a>string</a> <var>a</var> is a
<dfn export lt="code unit prefix|starts with">code unit prefix</dfn> of a <a>string</a> <var>b</var>
if the following steps return true:

<ol>
<li><p>Let <var>i</var> be 0.

<li>
<p><a>While</a> true:

<ol>
<li><p>Let <var>aCodeUnit</var> be the <var>i</var>th <a>code unit</a> of <var>a</var> if
<var>i</var> is less than <var>a</var>'s <a for=string>length</a>, otherwise null.

<li><p>Let <var>bCodeUnit</var> be the <var>i</var>th <a>code unit</a> of <var>b</var> if
<var>i</var> is less than <var>b</var>'s <a for=string>length</a>, otherwise null.

<li><p>If both <var>aCodeUnit</var> and <var>bCodeUnit</var> are null, then return true.

<li><p>If <var>aCodeUnit</var> is null and <var>bCodeUnit</var> is non-null, then return
true.

<li><p>If <var>aCodeUnit</var> is non-null and <var>bCodeUnit</var> is null, then return
false.

<li><p>If <var>aCodeUnit</var> is different from <var>bCodeUnit</var>, then return false.

<li><p>Set <var>i</var> to <var>i</var> + 1.
Member

I've used "Increment i by x" for this, but maybe this is clearer? Another thing to settle when we do numbers.

Member Author

"Increment" doesn't appear in Infra yet, so I'll leave it. But we do have an outstanding feature request, in #99.

</ol>
</li>
</ol>
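<p class=note>Non-normative sketch, not part of this change: the code unit prefix steps can be
written out in TypeScript, whose strings are indexed by 16-bit code units. The function name
<code>isCodeUnitPrefix</code> is hypothetical and chosen only for illustration.

```typescript
// Hypothetical helper (name is an assumption, not from the specification).
// Returns true if `a` is a code unit prefix of `b`.
function isCodeUnitPrefix(a: string, b: string): boolean {
  let i = 0;
  while (true) {
    // charCodeAt() yields NaN past the end, so test the length explicitly
    // and use null for "no code unit at this index".
    const aCodeUnit: number | null = i < a.length ? a.charCodeAt(i) : null;
    const bCodeUnit: number | null = i < b.length ? b.charCodeAt(i) : null;

    // Both strings ended at the same time: they are equal.
    if (aCodeUnit === null && bCodeUnit === null) return true;
    // `a` ended first: every code unit of `a` matched the start of `b`.
    if (aCodeUnit === null && bCodeUnit !== null) return true;
    // `b` ended first: `a` is longer than `b`, so it cannot be a prefix.
    if (aCodeUnit !== null && bCodeUnit === null) return false;
    // Mismatch at index i.
    if (aCodeUnit !== bCodeUnit) return false;

    i = i + 1;
  }
}
```

<p class=note>Under this sketch, <code>isCodeUnitPrefix("!", "!cmd")</code> is true and
<code>isCodeUnitPrefix("!cmd", "!")</code> is false.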

<p>When it is clear from context that <a>code units</a> are in play, e.g., because one of the
strings is a literal containing only characters that are in the range U+0020 SPACE to U+007E (~),
"<var>a</var> starts with <var>b</var>" can be used as a synonym for "<var>b</var> is a
<a>code unit prefix</a> of <var>a</var>".

<p class=example id=code-unit-prefix-example>With unknown values, it is good to be explicit:
<var ignore>targetString</var> is a <a>code unit prefix</a> of <var>userInput</var>. But with a
literal, we can use plainer language: <var>userInput</var> starts with "<code>!</code>".

<p>A <a>string</a> <var>a</var> is <dfn export>code unit less than</dfn> a <a>string</a>
<var>b</var> if the following steps return true:

<ol>
<li><p>If <var>b</var> is a <a>code unit prefix</a> of <var>a</var>, then return false.

<li><p>If <var>a</var> is a <a>code unit prefix</a> of <var>b</var>, then return true.

<li><p>Let <var>n</var> be the smallest index such that the <var>n</var>th <a>code unit</a> of
<var>a</var> is different from the <var>n</var>th code unit of <var>b</var>. (There has to be such
an index, since neither string is a prefix of the other.)

<li><p>If the <var>n</var>th code unit of <var>a</var> is less than the <var>n</var>th code unit of
<var>b</var>, then return true.

<li><p>Return false.
</ol>
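<p class=note>Non-normative sketch, not part of this change: the code unit less than steps can be
followed one by one in TypeScript. The name <code>codeUnitLessThan</code> is hypothetical, and the
prefix checks are done inline with <code>slice()</code>, which operates on code units.

```typescript
// Hypothetical helper (name is an assumption, not from the specification).
function codeUnitLessThan(a: string, b: string): boolean {
  // Step 1: if b is a code unit prefix of a (including a === b), return false.
  if (a.length >= b.length && a.slice(0, b.length) === b) return false;

  // Step 2: if a is a code unit prefix of b, return true.
  if (b.length >= a.length && b.slice(0, a.length) === a) return true;

  // Step 3: find the smallest index where the code units differ. The loop
  // ends before either string runs out, since neither is a prefix of the other.
  let n = 0;
  while (a.charCodeAt(n) === b.charCodeAt(n)) {
    n = n + 1;
  }

  // Steps 4 and 5: compare the differing code units numerically.
  return a.charCodeAt(n) < b.charCodeAt(n);
}
```

<p class=note>For example, <code>codeUnitLessThan("Not Found", "OK")</code> is true because the
first code units differ and 0x4E (N) is less than 0x4F (O).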

<p class="note">This matches the ordering used by JavaScript's <code>&lt;</code> operator, and its
{{Array/sort()}} method on an array of strings. This ordering compares the 16-bit code units in each
string, producing a highly efficient, consistent, and deterministic sort order. The resulting
ordering will not match any particular alphabet or lexicographic order, particularly for
<a>code points</a> represented by a surrogate pair. [[!ECMA-262]]
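<p class=note>Non-normative sketch, not part of this change: the surrogate pair caveat can be seen
with two strings chosen here purely for illustration, U+FF61 (one code unit) and U+1F600 (a
surrogate pair). Code unit order and code point order disagree for this pair.

```typescript
// Illustrative values only; these strings do not appear in the specification.
const halfwidthStop = "\uFF61";      // U+FF61, a single code unit 0xFF61
const grinningFace = "\uD83D\uDE00"; // U+1F600, code units 0xD83D 0xDE00

// JavaScript's < compares code units: 0xD83D is less than 0xFF61.
const lessByCodeUnit = grinningFace < halfwidthStop;

// Comparing the code point values gives the opposite answer.
const lessByCodePoint = 0x1F600 < 0xFF61;

// The default sort() on strings uses the same code unit ordering as <.
const sortedStrings = [halfwidthStop, grinningFace].sort();

console.log(lessByCodeUnit, lessByCodePoint, sortedStrings[0] === grinningFace);
```

<p class=note>Here the surrogate pair sorts first even though its code point is larger.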

<hr>

<p>To <dfn export>isomorphic encode</dfn> a <a>string</a> <var>input</var>, run these steps:</p>

<ol>
@@ -874,6 +942,26 @@ a new <a>ordered set</a> |clone|, so that <a for=set>replacing</a> "<code>a</cod
"<code>foo</code>" in |clone| gives « "<code>foo</code>", "<code>b</code>", "<code>c</code>" »,
while |original|[0] is still the <a>string</a> "<code>a</code>".

<p>To <dfn export for=list,stack,queue,set lt="sort in ascending order|sorting in ascending order|sort|sorting">sort in ascending order</dfn>
a <a>list</a> |list|, with a less than algorithm |lessThanAlgo|, is to create a new <a>list</a>
|sorted|, containing the same <a for=list>items</a> as |list| but sorted so that according to
|lessThanAlgo|, each item is less than the one following it, if any. For items that sort the same
(i.e., for which |lessThanAlgo| returns false for both comparisons), their relative order in
|sorted| must be the same as it was in |list|.

<p>To <dfn export for=list,stack,queue,set lt="sort in descending order|sorting in descending order">sort in descending order</dfn>
a <a>list</a> |list|, with a less than algorithm |lessThanAlgo|, is to create a new <a>list</a>
|sorted|, containing the same <a for=list>items</a> as |list| but sorted so that according to
|lessThanAlgo|, each item is less than the one preceding it, if any. For items that sort the same
(i.e., for which |lessThanAlgo| returns false for both comparisons), their relative order in
|sorted| must be the same as it was in |list|.

<p class=example id=example-list-sort>Let |original| be the <a>list</a> « (200, "<code>OK</code>"),
(404, "<code>Not Found</code>"), (null, "<code>OK</code>") ». <a for=list>Sorting</a> |original| in
ascending order, with |a| being less than |b| if |a|'s second <a for=struct>item</a> is
<a>code unit less than</a> |b|'s second <a for=struct>item</a>, gives the result « (404,
"<code>Not Found</code>"), (200, "<code>OK</code>"), (null, "<code>OK</code>") ».</p>
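<p class=note>Non-normative sketch, not part of this change: one way to satisfy both sorting
definitions is a stable insertion sort driven only by the less than algorithm. The names
<code>sortAscending</code>, <code>sortDescending</code>, and <code>StatusPair</code> are
hypothetical, and JavaScript's <code>&lt;</code> on strings stands in for code unit less than.

```typescript
type LessThan<T> = (a: T, b: T) => boolean;

// Returns a new array; the input is left untouched. Insertion sort is stable:
// an item only moves past items it is strictly less than.
function sortAscending<T>(list: readonly T[], lessThan: LessThan<T>): T[] {
  const sorted = list.slice();
  for (let i = 1; i < sorted.length; i++) {
    const item = sorted[i];
    let j = i - 1;
    while (j >= 0 && lessThan(item, sorted[j])) {
      sorted[j + 1] = sorted[j];
      j--;
    }
    sorted[j + 1] = item;
  }
  return sorted;
}

// Descending order flips the comparison; ties still keep their original order.
function sortDescending<T>(list: readonly T[], lessThan: LessThan<T>): T[] {
  return sortAscending(list, (a, b) => lessThan(b, a));
}

// The tuples from the example above, compared by their second item.
type StatusPair = [number | null, string];
const original: StatusPair[] = [[200, "OK"], [404, "Not Found"], [null, "OK"]];
const bySecondItem: LessThan<StatusPair> = (a, b) => a[1] < b[1];

const ascending = sortAscending(original, bySecondItem);
const descending = sortDescending(original, bySecondItem);
```

<p class=note>In this sketch <code>ascending</code> is (404, "Not Found"), (200, "OK"),
(null, "OK"), and <code>descending</code> is (200, "OK"), (null, "OK"), (404, "Not Found"); the two
"OK" tuples keep their original relative order in both.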

<hr>

<p>The <a>list</a> type originates from the JavaScript specification (where it is capitalized, as
@@ -1027,6 +1115,20 @@ a set of steps on each <a for=map>entry</a> in order, use phrasing of the form
"<a for=map>For each</a> |key| → |value| of |map|", and then operate on |key| and |value| in the
subsequent prose.

<p>To <dfn export for=map lt="sort in ascending order|sorting in ascending order|sort|sorting">sort in ascending order</dfn>
a <a>map</a> |map|, with a less than algorithm |lessThanAlgo|, is to create a new <a>map</a>
|sorted|, containing the same <a for=map>entries</a> as |map| but sorted so that according to
|lessThanAlgo|, each entry is less than the one following it, if any. For entries that sort the same
(i.e., for which |lessThanAlgo| returns false for both comparisons), their relative order in
|sorted| must be the same as it was in |map|.

<p>To <dfn export for=map lt="sort in descending order|sorting in descending order">sort in descending order</dfn>
a <a>map</a> |map|, with a less than algorithm |lessThanAlgo|, is to create a new <a>map</a>
|sorted|, containing the same <a for=map>entries</a> as |map| but sorted so that according to
|lessThanAlgo|, each entry is less than the one preceding it, if any. For entries that sort the same
(i.e., for which |lessThanAlgo| returns false for both comparisons), their relative order in
|sorted| must be the same as it was in |map|.
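<p class=note>Non-normative sketch, not part of this change: a JavaScript <code>Map</code> iterates
in insertion order, so sorting a map can be modeled by sorting its entries and inserting them into
a new <code>Map</code>. The name <code>sortMapAscending</code> is hypothetical, and the sketch
assumes an engine whose <code>Array.prototype.sort</code> is stable (required since ES2019).

```typescript
type Entry<K, V> = [K, V];

// Returns a new Map whose iteration order is ascending by `lessThan`.
function sortMapAscending<K, V>(
  map: Map<K, V>,
  lessThan: (a: Entry<K, V>, b: Entry<K, V>) => boolean
): Map<K, V> {
  const entries: Entry<K, V>[] = [];
  map.forEach((value, key) => {
    entries.push([key, value]);
  });

  // Turn the less than algorithm into a three-way comparator. Entries for
  // which neither comparison holds compare as 0, so a stable sort keeps them
  // in their original relative order.
  entries.sort((a, b) => (lessThan(a, b) ? -1 : lessThan(b, a) ? 1 : 0));

  const sorted = new Map<K, V>();
  for (const [key, value] of entries) {
    sorted.set(key, value);
  }
  return sorted;
}

// Illustrative data: sort by key, using < on strings as code unit less than.
const unsortedMap = new Map<string, number>([["b", 2], ["a", 1], ["c", 3]]);
const sortedMap = sortMapAscending(unsortedMap, (x, y) => x[0] < y[0]);

const sortedKeys: string[] = [];
sortedMap.forEach((_value, key) => {
  sortedKeys.push(key);
});
```

<p class=note>Sorting in descending order can reuse the same function with the arguments to the
less than algorithm swapped.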


<h3 id=structs>Structs</h3>
