Add bit-operations to NatN and IntN, and wrapping operations (#2326)

* The bit-operations are added to `Nat8`, `Int8` etc. On these, we only have `>>` (the signedness of the shift is derived from the type), no `+>>`. That operator is only available on `WordN`, and will be removed with it.
* Wrapping operators `+%`, `-%`, `*%` and `**%` are added. `**%` traps on negative exponent for `IntN`. The assignment variants (`+%=`, `-%=`, `*%=`, `**%=`) are added as well.
* We now have two conversions `Nat→Nat8`, a trapping and a wrapping one, and likewise for `Int→Int8`. This builds on the bifurcation of prims in #2324.
* Conversions between equal bit-width numbers are always wrapping, and exist for all of them, including directly from `NatN` to `IntN` and back.
* Little testing in this PR, but #2327 (which is based on this one) does some thorough randomized checking. The test suite is ported to not use `WordN` in #2309 and also passes.
* A PR against motoko-base to expose the new functionality is in dfinity/motoko-base#217. (But this PR can go in before, it is compatible with the previous base.)
* A changelog entry is added.
* The user’s guide is updated (picking changes from #2309 as appropriate, but not removing mention of `Word` entirely).
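
As a minimal sketch of the wrapping operators listed above, the following Motoko snippet shows a few results under wrap-around semantics. The variable names are illustrative and not taken from the PR. Each result carries an explicit type annotation so that the literals are checked at the bounded type.

```motoko
// Illustrative names only; not part of the PR.
let a : Nat8 = 250;
let b : Nat8 = a +% 10;    // 260 mod 256 = 4 (plain `a + 10` would trap)

let zero : Nat8 = 0;
let c : Nat8 = zero -% 1;  // wraps below zero to 255

let d : Int8 = 127 +% 1;   // wraps past the Int8 maximum to -128

var e : Nat8 = 200;
e *%= 2;                   // in-place variant: 400 mod 256 = 144
```

The conventional operators (`+`, `-`, `*`, `**`) keep their trapping behaviour, so the wrapping forms are an explicit opt-in at each use site.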
dfinity · Feb 12, 2021 · 833386c
1 parent abf1d9a
Showing 21 changed files with 424 additions and 149 deletions.
diff --git a/Changelog.md b/Changelog.md
@@ -1,14 +1,28 @@
 = Motoko compiler changelog
 
-* `mo-doc` now generates cross-references for types in signatures in
-  both the Html as well as the Asciidoc output. So a signature like
-  `fromIter : I.Iter<Nat> -> List.List<Nat>` will now let you click on
-  `I.Iter` or `List.List` and take you to their definitions.
+* Wrapping arithmetic and bit-wise operations on `NatN` and `IntN`
+
+  The conventional arithmetic operators on `NatN` and `IntN` trap on overflow.
+  If wrap-around semantics is desired, the operators `+%`, `-%`, `*%` and `**%`
+  can be used. The corresponding assignment operators (`+%=` etc.) are also available.
+
+  Likewise, the bit fiddling operators (`&`, `|`, `^`, `<<`, `>>`, `<<>`,
+  `<>>` etc.) are now also available on `NatN` and `IntN`. The right shift
+  operator (`>>`) is an unsigned right shift on `NatN` and a signed right shift
+  on `IntN`; the `+>>` operator is _not_ available on these types.
+
+  The motivation for this change is to eventually deprecate and remove the
+  `WordN` types.
 
 * For values `x` of type `Blob`, an iterator over the elements of the blob
   `x.vals()` is introduced. It works like `x.bytes()`, but returns the elements
   as type `Nat8`.
 
+* `mo-doc` now generates cross-references for types in signatures in
+  both the Html as well as the Asciidoc output. So a signature like
+  `fromIter : I.Iter<Nat> -> List.List<Nat>` will now let you click on
+  `I.Iter` or `List.List` and take you to their definitions.
+
 * Bugfix: Certain ill-typed object literals are now prevented by the type
   checker.
 

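The changelog entry above says that `>>` is an unsigned shift on `NatN` and a signed shift on `IntN`. A small sketch of that difference, using hypothetical variable names and the same 8-bit pattern in both types, might look as follows:

```motoko
// Illustrative names only; not part of the PR.
let n : Nat8 = 0xF0;          // bit pattern 1111_0000 (240)
let logical : Nat8 = n >> 4;  // unsigned shift fills with zeros: 0x0F = 15

let i : Int8 = -16;           // same bit pattern 1111_0000 in two's complement
let arith : Int8 = i >> 4;    // signed shift replicates the sign bit: -1

let r : Nat8 = n <<> 2;       // rotate left by 2: 1100_0011 = 0xC3
let m : Nat8 = n & 0x3C;      // bitwise and: 0011_0000 = 0x30
```

Because the signedness comes from the operand type, no separate `+>>` operator is needed on these types.
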
diff --git a/doc/modules/language-guide/examples/grammar.txt b/doc/modules/language-guide/examples/grammar.txt
@@ -87,6 +87,10 @@
     '/'
     '%'
     '**'
+    '+%'
+    '-%'
+    '*%'
+    '**%'
     '&'
     '|'
     '^'
@@ -117,6 +121,10 @@
     '/='
     '%='
     '**-'
+    '+%='
+    '-%='
+    '*%='
+    '**%='
     '&='
     '|='
     '^='

diff --git a/doc/modules/language-guide/pages/basic-concepts.adoc b/doc/modules/language-guide/pages/basic-concepts.adoc
@@ -226,15 +226,13 @@ As a broader language overview, however, we briefly summarize the other value fo
  - Text values --- strings of unicode characters.
  - Words --- fixed-width numbers, _without_ overflow checks, and _with_ explicit wrap-around semantics.
 
-*Numbers.* By default, integers and natural numbers are _unbounded_ and do not overflow.  
+*Numbers.* By default, integers and natural numbers are _unbounded_ and do not overflow.
 Instead, they use representations that grow to accommodate any finite number.
 
 For practical reasons, {proglang} also includes _bounded_ types for integers and natural numbers, distinct from the default versions.
 Each bounded variant has a fixed width (one of `8`, `16`, `32`, `64`) and each carries the potential for "`overflow`". If and when this event occurs, it is an error and causes the
 <<overview-traps,program to trap>>.
-There are no unchecked, uncaught overflows in {proglang}, except in well-defined situations, for specific (`Word`-based) types.
-
-Word types permit bitwise operations that are unsupported by the other number types.
+There are no unchecked, uncaught overflows in {proglang}, except in well-defined situations, for explicitly _wrapping_ operations (indicated by a `%`  character in the operator) and specific (`Word`-based) types.
 The language provides primitive built-ins to convert between these various number representations.
 
 The link:language-manual{outfilesuffix}[language quick reference] contains a complete list of link:language-manual{outfilesuffix}#primitive-types[primitive types].

diff --git a/doc/modules/language-guide/pages/language-manual.adoc b/doc/modules/language-guide/pages/language-manual.adoc
@@ -212,7 +212,7 @@ To simplify the presentation of available operators, operators and primitive typ
 
 | A            | Arithmetic | arithmetic operations
 | L            | Logical    | logical/Boolean operations
-| B            | Bitwise    | bitwise operations
+| B            | Bitwise    | bitwise and wrapping operations
 | O            | Ordered    | comparison
 | T            | Text       | concatenation
 |===
@@ -269,7 +269,7 @@ Equality and inequality are structural and based on the observable content of th
 |===
 
 [[syntax-ops-bitwise]]
-=== Bitwise binary operators
+=== Bitwise and wrapping binary operators
 
 |===
 | `<binop>` | Category |
@@ -279,9 +279,14 @@ Equality and inequality are structural and based on the observable content of th
 | `^`   | B | exclusive or
 | `<<`  | B | shift left
 | `␣>>` | B | shift right *(must be preceded by whitespace)*
-| `+>>` | B | signed shift right
+| `+>>` | B | signed shift right (only on `Word`-types)
 | `<<>` | B | rotate left
 | `<>>` | B | rotate right
+
+|  `+%` | A | addition (wrap-on-overflow)
+|  `-%` | A | subtraction (wrap-on-overflow)
+|  `*%` | A | multiplication (wrap-on-overflow)
+|  `**%`| A | exponentiation (wrap-on-overflow)
 |===
 
 [[syntax-ops-string]]
@@ -311,9 +316,13 @@ Equality and inequality are structural and based on the observable content of th
 | `^=`   | B | in place exclusive or
 | `<\<=`  | B | in place shift left
 | `>>=`  | B | in place shift right
-| `+>>=` | B | in place signed shift right
+| `+>>=` | B | in place signed shift right (only on `Word`-types)
 | `<<>=` | B | in place rotate left
 | `<>>=` | B | in place rotate right
+| `+%=`   | B | in place add (wrap-on-overflow)
+| `-%=`   | B | in place subtract (wrap-on-overflow)
+| `*%=`   | B | in place multiply (wrap-on-overflow)
+| `**%=`  | B | in place exponentiation (wrap-on-overflow)
 | `#=`   | T | in place concatenation
 |===
 
@@ -330,18 +339,18 @@ Tokens on the same line have equal precedence with the indicated associativity.
 
 | LOWEST  | none | `if _ _` (no `else`), `loop _` (no `while`)
 |(higher)| none | `else`, `while`
-|(higher)| right | `:=`, `+=`, `-=`, `*=`, `/=`, `%=`, `**=`, `#=`, `&=`, `\|=`, `^=`, `<\<=`, `>>-`, `<<>=`, `<>>=`
+|(higher)| right | `:=`, `+=`, `-=`, `*=`, `/=`, `%=`, `**=`, `#=`, `&=`, `\|=`, `^=`, `<\<=`, `>>=`, `<<>=`, `<>>=`, `+%=`, `-%=`, `*%=`, `**%=`
 |(higher)| left | `:`
 |(higher)| left | `or`
 |(higher)| left | `and`
 |(higher)| none | `==`, `!=`, `<`, `>`, `\<=`, `>`, `>=`
-|(higher)| left | `+`, `-`, `#`
-|(higher)| left | `*`, `/`, `%`
+|(higher)| left | `+`, `-`, `#`, `+%`, `-%`
+|(higher)| left | `*`, `/`, `%`, `*%`
 |(higher)| left | `\|`
 |(higher)| left | `+&+`
 |(higher)| left | `+^+`
 |(higher)| none | `<<`, `>>`, `<<>`, `<>>`
-| HIGHEST | left | `+**+`
+| HIGHEST | left | `+**+`, `+**%+`
 |===
 
 
@@ -684,19 +693,48 @@ In particular, every value of type `Nat` is also a value of type `Int`, without
 
 The types `Int8`, `Int16`, `Int32` and `Int64` represent
 signed integers with respectively 8, 16, 32 and 64 bit precision.
-All have categories A (Arithmetic) and O (Ordered).
+All have categories A (Arithmetic), B (Bitwise) and O (Ordered).
 
 Operations that may under- or overflow the representation are checked and trap on error.
 
+The operations `+%`, `-%`, `*%` and `**%` provide access to wrap-around, modular arithmetic.
+
+As bitwise types, these types support bitwise operations *and* `(&)`,
+*or* `(|)` and *exclusive-or* `(^)`. Further, they can be rotated
+left `(<<>)`, right `(<>>)`, and shifted left `(<<)`, right `(>>)`.
+The right-shift preserves the two's-complement sign.
+All shift and rotate amounts are considered modulo the word's width *n*.
+
+Bounded integer types are not in subtype relationship with each other or with
+other arithmetic types, and their literals need type annotation if
+the type cannot be inferred from context, e.g. `(-42 : Int16)`.
+
+The corresponding module in the base library provides conversion functions:
+Conversion to `Int`, checked and wrapping conversions from `Int` and wrapping
+conversion to the bounded natural type of the same size.
+
+
 [[bounded-naturals]]
 === Bounded naturals `Nat8`, `Nat16`, `Nat32` and `Nat64`
 
 The types `Nat8`, `Nat16`, `Nat32` and `Nat64` represent
 unsigned integers with respectively 8, 16, 32 and 64 bit precision.
-All have categories A (Arithmetic) and O (Ordered).
+All have categories A (Arithmetic), B (Bitwise) and O (Ordered).
 
 Operations that may under- or overflow the representation are checked and trap on error.
 
+The operations `+%`, `-%`, `*%` and `**%` provide access to the modular, wrap-on-overflow operations.
+
+As bitwise types, these types support bitwise operations *and* `(&)`,
+*or* `(|)` and *exclusive-or* `(^)`. Further, they can be rotated
+left `(<<>)`, right `(<>>)`, and shifted left `(<<)`, right `(>>)`.
+The right-shift is logical.
+All shift and rotate amounts are considered modulo the word's width *n*.
+
+The corresponding module in the base library provides conversion functions:
+Conversion to `Nat`, checked and wrapping conversions from `Nat` and wrapping
+conversion to the bounded integer type of the same size.
+
 [[word-types]]
 === Word types
 

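The precedence table and the shift rules in the manual changes above can be illustrated with a short sketch. The names are hypothetical, and the shift result assumes the documented rule that shift amounts are taken modulo the bit width:

```motoko
// Illustrative names only; not part of the PR.
let p : Nat8 = 2 +% 3 *% 4;   // `*%` binds tighter than `+%`: 2 +% (3 *% 4) = 14
let q : Nat8 = 2 *% 3 **% 2;  // `**%` binds tightest: 2 *% (3 **% 2) = 18
let t : Nat8 = 16 **% 2;      // 256 mod 256 = 0

let one : Nat8 = 1;
let s : Nat8 = one << 9;      // amount 9 is taken modulo 8, so this is one << 1 = 2
```

The wrapping operators share precedence levels with their trapping counterparts, so switching `+` to `+%` in an expression should not change how it parses.
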
diff --git a/doc/modules/language-guide/pages/modules-and-imports.adoc b/doc/modules/language-guide/pages/modules-and-imports.adoc
@@ -116,10 +116,10 @@ This distinction can affect how some data structures are typed.
 
 For the imported canister actor, types are derived from the Candid file—the _project-name_.did file—for the canister rather than from {proglang} itself.
 
-The translation from {proglang} actor type to Candid service type is mostly, but not entirely, one-to-one, and there are some distinct {proglang} types that map to the same Candid type. For example, the {proglang} `Nat8` and `Word8` types both exported as Candid type `nat8`, but `nat8` is canonically  imported as {proglang} `Nat8`, not `Word8`.
+The translation from {proglang} actor type to Candid service type is mostly, but not entirely, one-to-one, and there are some distinct {proglang} types that map to the same Candid type. For example, the {proglang} `Nat32` and `Char` types are both exported as Candid type `nat32`, but `nat32` is canonically imported as {proglang} `Nat32`, not `Char`.
 
 The type of an imported canister function, therefore, might differ from the type of the original {proglang} code that implements it.
-For example, if the {proglang} function had type `+shared Word8 -> async Word64+` in the implementation, its exported Candid type would be `+(Nat8) -> Nat64+` but the {proglang} type imported from this Candid type will actually be the correct—but perhaps unexpected—type  `+shared Nat8 -> async Nat64+`.
+For example, if the {proglang} function had type `+shared Nat32 -> async Char+` in the implementation, its exported Candid type would be `+(nat32) -> (nat32)+` but the {proglang} type imported from this Candid type will actually be the correct—but perhaps unexpected—type  `+shared Nat32 -> async Nat32+`.
 
 These type differences are the result of the Candid-to-{proglang} composition layer inherent to the canister abstraction.
 

diff --git a/doc/overview-slides.md b/doc/overview-slides.md
@@ -118,20 +118,10 @@ Literals: `13`, `0xf4`, `1_000_000`
 
 ## Bounded numbers (trapping)
 
-`Nat8`, `Nat16`, `Nat32`, `Nat64`,  
+`Nat8`, `Nat16`, `Nat32`, `Nat64`,
 `Int8`, `Int16`, `Int32`, `Int64`
 
-Trap on over- and underflow.
-
-Needs type annotations (somewhere)
-
-Literals: `13`, `0xf4`, `-20`, `1_000_000`
-
-## Bounded numbers (wrapping)
-
-`Word8`, `Word16`, `Word32`, `Word64`
-
-Wrap-around on over/under-flow. Use for bit-fiddling.
+Trap on over- and underflow. Wrap-on-overflow and bit-manipulating operations available.
 
 Needs type annotations (somewhere)
 
@@ -531,10 +521,10 @@ let ? name = d.find(1);
 ### Language prelude
 
 * connects internal primitives with surface syntax (types, operations)
-* conversions like `intToWord32`
+* conversions like `intToNat32`
 * side-effecting operations `debugPrintInt`
   (tie into execution environment)
-* utilities like `hashInt`, `clzWord32`
+* utilities like `hashInt`, `clzNat32`
 
 
 # Sample App
@@ -711,4 +701,3 @@ We focus on abstractions for implementing the database for the produce exchange:
 - [Hash trie](https://github.com/dfinity-lab/motoko/blob/stdlib-examples/design/stdlib/trie.md): Immutable finite map representation based on hashing each key.
 
 - [Association list](https://github.com/dfinity-lab/motoko/blob/stdlib-examples/design/stdlib/assocList.md): Immutable finite map representation based on a list of key-value pairs.
-