From 7151ce31e593d734e9e117fc2378612eb4a6521f Mon Sep 17 00:00:00 2001
From: Domenic Denicola
Date: Fri, 3 May 2019 16:38:07 -0400
Subject: [PATCH 1/3] Add code unit prefix, less than, and general sorting

Closes #55.
---
 infra.bs | 95 ++++++++++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 93 insertions(+), 2 deletions(-)

diff --git a/infra.bs b/infra.bs
index 91f1360..99b8f3c 100644
--- a/infra.bs
+++ b/infra.bs
@@ -8,13 +8,14 @@ Translation: ja https://triple-underscore.github.io/infra-ja.html
-urlPrefix: https://tc39.github.io/ecma262/; spec: ECMA-262;
+urlPrefix: https://tc39.github.io/ecma262/#; spec: ECMA-262;
     type: dfn
         text: %JSONParse%; url: sec-json.parse
-        text: %JSONStringify%; url: #sec-json.stringify
+        text: %JSONStringify%; url: sec-json.stringify
         text: List; url: sec-list-and-record-specification-type
         text: The String Type; url: sec-ecmascript-language-types-string-type
     type: abstract-op; text: Call; url: sec-call
+    type: method; for: Array; text: sort(); url: sec-array.prototype.sort
 
@@ -542,6 +543,62 @@
 actually ends up representing JavaScript and scalar value strings. It is even fairly typical for
 implementations to have multiple implementations of just JavaScript strings for performance and
 memory reasons.

+A string a is a code unit prefix of a string
+b if the following steps return true:
+
+  1. Let i be 0.
+
+  2. While true:
+
+    1. Let aCodeUnit be the ith code unit of a, or null
+    if i is greater than or equal to a's length.
+
+    2. Let bCodeUnit be the ith code unit of b, or null
+    if i is greater than or equal to b's length.
+
+    3. If both aCodeUnit and bCodeUnit are null, then return true.
+
+    4. If aCodeUnit is null and bCodeUnit is non-null, then return
+    true.
+
+    5. If aCodeUnit is non-null and bCodeUnit is null, then return
+    false.
+
+    6. Return false if aCodeUnit is different from bCodeUnit.
+
+    7. Set i to i + 1.
+
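Review note: the code unit prefix steps above can be sketched in TypeScript, where string indexing is already by 16-bit code unit. This is only an illustrative sketch, not part of the patch; the function name `isCodeUnitPrefix` is hypothetical.

```typescript
// Hypothetical helper mirroring the "code unit prefix" steps one by one.
// Returns true when `a` is a code unit prefix of `b`.
function isCodeUnitPrefix(a: string, b: string): boolean {
  let i = 0;
  while (true) {
    // charCodeAt yields NaN past the end, so the length check is explicit.
    const aCodeUnit: number | null = i < a.length ? a.charCodeAt(i) : null;
    const bCodeUnit: number | null = i < b.length ? b.charCodeAt(i) : null;

    // Both strings ran out together: they are equal, so a is a prefix of b.
    if (aCodeUnit === null && bCodeUnit === null) return true;
    // a ran out first: everything in a matched the start of b.
    if (aCodeUnit === null) return true;
    // b ran out first: a is longer than b, so it cannot be a prefix.
    if (bCodeUnit === null) return false;
    // Mismatch at position i.
    if (aCodeUnit !== bCodeUnit) return false;

    i += 1;
  }
}

const prefixResults = {
  emptyOfEmpty: isCodeUnitPrefix("", ""),
  shorter: isCodeUnitPrefix("ab", "abc"),
  longer: isCodeUnitPrefix("abc", "ab"),
  mismatch: isCodeUnitPrefix("ab", "ac"),
};
```

Every string is a code unit prefix of itself under these steps, which is why the less-than definition has to check the `b`-prefix-of-`a` case first.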

+A string a is code unit less than a string
+b if the following steps return true:
+
+  1. If b is a code unit prefix of a, then return false.
+
+  2. If a is a code unit prefix of b, then return true.
+
+  3. Let n be the smallest index such that the nth code unit of
+  a is different from the nth code unit of b. (There must be such an
+  index, since neither string is a prefix of the other.)
+
+  4. If the nth code unit of a is less than the nth code unit of
+  b, then return true.
+
+  5. Return false.
+

+This matches the ordering used by JavaScript's < operator, and its
+{{Array/sort()}} method on an array of strings. This ordering compares the 16-bit code units in each
+string, producing a highly efficient, consistent, and deterministic sort order. The resulting
+ordering will not match any particular alphabet or lexoicographic order, particular for
+code points represented by a surrogate pair. [[!ECMA-262]]
+
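Review note: a minimal TypeScript sketch of the code unit less than steps, with an example of how surrogate pairs make the result diverge from code point order. The helper names are hypothetical, and the prefix check is written as a compact loop rather than the step-by-step form.

```typescript
// Hypothetical compact form of the "code unit prefix" check.
function isCodeUnitPrefix(a: string, b: string): boolean {
  if (a.length > b.length) return false;
  for (let i = 0; i < a.length; i++) {
    if (a.charCodeAt(i) !== b.charCodeAt(i)) return false;
  }
  return true;
}

// Hypothetical helper mirroring the "code unit less than" steps.
function isCodeUnitLessThan(a: string, b: string): boolean {
  if (isCodeUnitPrefix(b, a)) return false; // also covers a equal to b
  if (isCodeUnitPrefix(a, b)) return true;
  // Neither string is a prefix of the other, so a differing index exists
  // within both strings and the loop terminates in range.
  let n = 0;
  while (a.charCodeAt(n) === b.charCodeAt(n)) {
    n += 1;
  }
  return a.charCodeAt(n) < b.charCodeAt(n);
}

// U+1F600 is stored as the surrogate pair D83D DE00; U+FF5E is one code unit.
// By code point, U+FF5E comes first; by code unit, 0xD83D sorts before 0xFF5E.
const astral = "\uD83D\uDE00";
const fullwidthTilde = "\uFF5E";
const astralFirst = isCodeUnitLessThan(astral, fullwidthTilde);
const agreesWithOperator = astralFirst === (astral < fullwidthTilde);
```

Here `astralFirst` is true even though U+1F600 is the larger code point, which is the surrogate pair caveat the note describes.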


To isomorphic encode a string input, run these steps:

@@ -874,6 +931,26 @@
 a new ordered set |clone|, so that replacing "a" with "foo" in |clone| gives
 « "foo", "b", "c" », while |original|[0] is still the string "a".

+To sort in ascending order
+a list |list|, with a less than algorithm |lessThanAlgo|, is to create a new list
+|sorted|, containing the same items as |list| but sorted so that according to
+|lessThanAlgo|, each item is less than the one following it, if any. For items that sort the same
+(i.e., for which |lessThanAlgo| returns false for both comparisons), their relative order in
+|sorted| must be the same as it was in |list|.
+
+To sort in descending order
+a list |list|, with a less than algorithm |lessThanAlgo|, is to create a new list
+|sorted|, containing the same items as |list| but sorted so that according to
+|lessThanAlgo|, each item is less than the one preceding it, if any. For items that sort the same
+(i.e., for which |lessThanAlgo| returns false for both comparisons), their relative order in
+|sorted| must be the same as it was in |list|.
+
+Let |original| be the list « (200, "OK"),
+(404, "Not Found"), (null, "OK") ». Sorting |original| in
+ascending order, with |a| being less than |b| if |a|'s second item is
+code unit less than |b|'s second item, gives the result « (404,
+"Not Found"), (200, "OK"), (null, "OK") ».
+
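Review note: the list sorting definitions and the example above can be sketched in TypeScript as follows. The function and type names are hypothetical. The sketch assumes a runtime where `Array.prototype.sort` is stable (ES2019 and later), and relies on the built-in `<` operator for strings, which the patch's note says matches code unit less than.

```typescript
type LessThan<T> = (a: T, b: T) => boolean;

// Builds a three-way comparator from the less-than algorithm; items for which
// both comparisons return false compare as equal and keep their relative order.
function sortAscending<T>(list: readonly T[], lessThanAlgo: LessThan<T>): T[] {
  // slice() first: the definition creates a new list and leaves the input alone.
  return list.slice().sort((a, b) =>
    lessThanAlgo(a, b) ? -1 : lessThanAlgo(b, a) ? 1 : 0
  );
}

// Same idea with the comparison direction swapped; ties still keep input order.
function sortDescending<T>(list: readonly T[], lessThanAlgo: LessThan<T>): T[] {
  return list.slice().sort((a, b) =>
    lessThanAlgo(b, a) ? -1 : lessThanAlgo(a, b) ? 1 : 0
  );
}

// The example from the patch: tuples compared by their second item only.
type StatusEntry = [number | null, string];
const original: StatusEntry[] = [[200, "OK"], [404, "Not Found"], [null, "OK"]];
const bySecondItem: LessThan<StatusEntry> = (a, b) => a[1] < b[1];

const ascending = sortAscending(original, bySecondItem);
const descending = sortDescending(original, bySecondItem);
```

In both directions (200, "OK") stays ahead of (null, "OK"), because the two items sort the same and the definitions require their original relative order to be preserved.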

 The list type originates from the JavaScript specification (where it is capitalized, as
@@ -1027,6 +1104,20 @@
 a set of steps on each entry in order, use phrasing of the form "For each |key| → |value| of
 |map|", and then operate on |key| and |value| in the subsequent prose.

+To sort in ascending order
+a map |map|, with a less than algorithm |lessThanAlgo|, is to create a new map
+|sorted|, containing the same entries as |map| but sorted so that according to
+|lessThanAlgo|, each entry is less than the one following it, if any. For entries that sort the same
+(i.e., for which |lessThanAlgo| returns false for both comparisons), their relative order in
+|sorted| must be the same as it was in |map|.
+
+To sort in descending order
+a map |map|, with a less than algorithm |lessThanAlgo|, is to create a new map
+|sorted|, containing the same entries as |map| but sorted so that according to
+|lessThanAlgo|, each entry is less than the one preceding it, if any. For entries that sort the same
+(i.e., for which |lessThanAlgo| returns false for both comparisons), their relative order in
+|sorted| must be the same as it was in |map|.
+
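Review note: for maps, the less than algorithm receives whole entries. A rough TypeScript sketch, assuming a JavaScript `Map` (which iterates in insertion order) stands in for an ordered map and that `Array.prototype.sort` is stable; the function names and the sample data are hypothetical.

```typescript
type Entry<K, V> = [K, V];
type EntryLessThan<K, V> = (a: Entry<K, V>, b: Entry<K, V>) => boolean;

function sortMapAscending<K, V>(
  map: ReadonlyMap<K, V>,
  lessThanAlgo: EntryLessThan<K, V>
): Map<K, V> {
  // Copy the entries out so the input map is left untouched.
  const entries: Entry<K, V>[] = Array.from(map.entries());
  entries.sort((a, b) => (lessThanAlgo(a, b) ? -1 : lessThanAlgo(b, a) ? 1 : 0));
  // A new Map preserves the order in which entries are inserted.
  return new Map(entries);
}

function sortMapDescending<K, V>(
  map: ReadonlyMap<K, V>,
  lessThanAlgo: EntryLessThan<K, V>
): Map<K, V> {
  const entries: Entry<K, V>[] = Array.from(map.entries());
  entries.sort((a, b) => (lessThanAlgo(b, a) ? -1 : lessThanAlgo(a, b) ? 1 : 0));
  return new Map(entries);
}

// Illustrative data only: sort entries by key using the built-in string <.
const sampleMap = new Map<string, string>([
  ["content-type", "text/html"],
  ["accept", "*/*"],
  ["cache-control", "no-store"],
]);
const byKey: EntryLessThan<string, string> = (a, b) => a[0] < b[0];

const ascendingKeys = Array.from(sortMapAscending(sampleMap, byKey).keys());
const descendingKeys = Array.from(sortMapDescending(sampleMap, byKey).keys());
```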

Structs

From a2c46f195af234ab3beb021a2e117024d1561bfe Mon Sep 17 00:00:00 2001
From: Domenic Denicola
Date: Fri, 3 May 2019 16:58:30 -0400
Subject: [PATCH 2/3] Add "starts with" synonym for "code unit prefix"

---
 infra.bs | 34 +++++++++++++++++++++++-----------
 1 file changed, 23 insertions(+), 11 deletions(-)

diff --git a/infra.bs b/infra.bs
index 99b8f3c..f5960ef 100644
--- a/infra.bs
+++ b/infra.bs
@@ -543,37 +543,49 @@
 actually ends up representing JavaScript and scalar value strings. It is even fairly typical for
 implementations to have multiple implementations of just JavaScript strings for performance and
 memory reasons.

-A string a is a code unit prefix of a string
-b if the following steps return true:
+
+A string a is a
+code unit prefix of a string b
+if the following steps return true:

-  1. Let i be 0.
+  1. Let i be 0.

-  2. While true:
+  2. While true:

     1. Let aCodeUnit be the ith code unit of a, or null
-    if i is greater than or equal to a's length.
+    if i is greater than or equal to a's length.

     2. Let bCodeUnit be the ith code unit of b, or null
-    if i is greater than or equal to b's length.
+    if i is greater than or equal to b's length.

-    3. If both aCodeUnit and bCodeUnit are null, then return true.
+    3. If both aCodeUnit and bCodeUnit are null, then return true.

     4. If aCodeUnit is null and bCodeUnit is non-null, then return
-    true.
+    true.

     5. If aCodeUnit is non-null and bCodeUnit is null, then return
-    false.
+    false.

-    6. Return false if aCodeUnit is different from bCodeUnit.
+    6. Return false if aCodeUnit is different from bCodeUnit.

-    7. Set i to i + 1.
+    7. Set i to i + 1.
+

+When it is clear from context that code units are in play, e.g., because one of the
+strings is a literal containing only characters that are in the range U+0020 SPACE to U+007E (~),
+"a starts with b" can be used as a synonym for "b is a
+code unit prefix of a".
+
+With unknown values, it is good to be explicit:
+targetString is a code unit prefix of userInput. But with a
+literal, we can use plainer language: userInput starts with "!".
+
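Review note: the argument order flips between the two phrasings, which is easy to get wrong. A small TypeScript sketch of the synonym, using hypothetical helper names and hypothetical sample values:

```typescript
// Hypothetical compact form of the "code unit prefix" check:
// true when potentialPrefix is a code unit prefix of input.
function isCodeUnitPrefix(potentialPrefix: string, input: string): boolean {
  if (potentialPrefix.length > input.length) return false;
  for (let i = 0; i < potentialPrefix.length; i++) {
    if (potentialPrefix.charCodeAt(i) !== input.charCodeAt(i)) return false;
  }
  return true;
}

// "a starts with b" means "b is a code unit prefix of a": note the swap.
function startsWithCodeUnits(a: string, b: string): boolean {
  return isCodeUnitPrefix(b, a);
}

const userInput = "!important";
const explicitForm = isCodeUnitPrefix("!", userInput); // "!" is a code unit prefix of userInput
const plainForm = startsWithCodeUnits(userInput, "!"); // userInput starts with "!"

// Outside the ASCII range the code unit framing matters: a string holding a
// surrogate pair "starts with" its lone lead surrogate.
const leadSurrogateMatches = startsWithCodeUnits("\uD83D\uDE00", "\uD83D");
```

The last case is the kind of result that motivates keeping the plainer "starts with" wording for literals in the U+0020 to U+007E range, where code units and characters coincide.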

 A string a is code unit less than a string
 b if the following steps return true:

From 66540e102a338ad889c00c7510cafdc53d22f959 Mon Sep 17 00:00:00 2001
From: Domenic Denicola
Date: Thu, 9 May 2019 11:52:36 -0400
Subject: [PATCH 3/3] Fix nits

---
 infra.bs | 15 +++++++--------
 1 file changed, 7 insertions(+), 8 deletions(-)

diff --git a/infra.bs b/infra.bs
index f5960ef..366f6de 100644
--- a/infra.bs
+++ b/infra.bs
@@ -556,11 +556,11 @@ if the following steps return true:

   2. While true:

-    1. Let aCodeUnit be the ith code unit of a, or null
-    if i is greater than or equal to a's length.
+    1. Let aCodeUnit be the ith code unit of a if
+    i is less than a's length, otherwise null.

-    2. Let bCodeUnit be the ith code unit of b, or null
-    if i is greater than or equal to b's length.
+    2. Let bCodeUnit be the ith code unit of b if
+    i is less than b's length, otherwise null.

     3. If both aCodeUnit and bCodeUnit are null, then return true.
@@ -595,8 +595,8 @@ literal, we can use plainer language: userInput starts with "!<

   2. If a is a code unit prefix of b, then return true.

   3. Let n be the smallest index such that the nth code unit of
-  a is different from the nth code unit of b. (There must be such an
-  index, since neither string is a prefix of the other.)
+  a is different from the nth code unit of b. (There has to be such
+  an index, since neither string is a prefix of the other.)

   4. If the nth code unit of a is less than the nth code unit of
   b, then return true.
@@ -607,10 +607,9 @@ literal, we can use plainer language: userInput starts with "!<

 This matches the ordering used by JavaScript's < operator, and its
 {{Array/sort()}} method on an array of strings. This ordering compares the 16-bit code units in each
 string, producing a highly efficient, consistent, and deterministic sort order. The resulting
-ordering will not match any particular alphabet or lexoicographic order, particular for
+ordering will not match any particular alphabet or lexicographic order, particularly for
 code points represented by a surrogate pair. [[!ECMA-262]]
-

 To isomorphic encode a string input, run these steps: