Add bit-operations to NatN and IntN, and wrapping operations (#2326)
* The bit-operations are added to Nat8, Int8 etc.
   On these, we only have `>>` (the signedness of the shift is derived from the type), no `+>>`.
   That operator is only available on `WordN`, and will be removed with it.

* Wrapping operators `+%`, `-%`, `*%` and `**%` are added.
  `**%` traps on negative exponent for `IntN`.

  The assignment variants (`+%=`, `-%=`, `*%=`, `**%=`) are added as well.
  
* We now have two conversions `Nat→Nat8`, a trapping and a wrapping one, and likewise for `Int→Int8`.
  This builds on the bifurcation of prims in #2324

* Conversions between equal bit-width numbers are always wrapping, and exist for all of them, including
   directly from `NatN` to `IntN` and back.

* Little testing in this PR, but #2327 (which is based on this one) does some thorough randomized checking.
  The test suite is ported to not use `WordN` in #2309 and also passes.

* A PR against motoko-base to expose the new functionality is in dfinity/motoko-base#217.
  (But this PR can go in before, it is compatible with the previous base.)

* A changelog entry is added

* The user’s guide is updated (picking changes from #2309 as appropriate, but not removing mention of `Word` entirely)
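
As an illustration of the operator changes listed above, here is a minimal Motoko sketch of the wrapping operators and the type-directed `>>`. The variable names are made up for this example, and the expected values follow from 8-bit modular arithmetic rather than from a test in this PR. Type annotations on the literals are spelled out so the snippet does not rely on literal inference.

```motoko
// Wrapping addition on Nat8: 250 + 10 = 260, reduced modulo 2^8 gives 4.
// The plain `a + 10` would trap on overflow instead.
let a : Nat8 = 250;
let b : Nat8 = a +% 10;
assert (b == (4 : Nat8));

// Wrapping addition on Int8: 100 + 100 = 200, which wraps to 200 - 256 = -56.
let c : Int8 = 100;
let d : Int8 = c +% 100;
assert (d == (-56 : Int8));

// In-place variant: 200 * 2 = 400, reduced modulo 256 gives 144.
var acc : Nat8 = 200;
acc *%= 2;
assert (acc == (144 : Nat8));

// `>>` is a logical (unsigned) shift on Nat8 ...
let n : Nat8 = 0xF0;
let r : Nat8 = n >> 4;
assert (r == (0x0F : Nat8));

// ... and an arithmetic (sign-preserving) shift on Int8.
let i : Int8 = -16;
let s : Int8 = i >> 2;
assert (s == (-4 : Int8));
```

Because the shift behaviour is selected by the operand type, no `+>>` is needed on `NatN` and `IntN`.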
nomeata authored Feb 12, 2021
1 parent abf1d9a commit 833386c
Showing 21 changed files with 424 additions and 149 deletions.
22 changes: 18 additions & 4 deletions Changelog.md
@@ -1,14 +1,28 @@
= Motoko compiler changelog

* `mo-doc` now generates cross-references for types in signatures in
both the Html as well as the Asciidoc output. So a signature like
`fromIter : I.Iter<Nat> -> List.List<Nat>` will now let you click on
`I.Iter` or `List.List` and take you to their definitions.
* Wrapping arithmetic and bit-wise operations on `NatN` and `IntN`

The conventional arithmetic operators on `NatN` and `IntN` trap on overflow.
If wrap-around semantics is desired, the operators `+%`, `-%`, `*%` and `**%`
can be used. The corresponding assignment operators (`+%=` etc.) are also available.

Likewise, the bit fiddling operators (`&`, `|`, `^`, `<<`, `>>`, `<<>`,
`<>>` etc.) are now also available on `NatN` and `IntN`. The right shift
operator (`>>`) is an unsigned right shift on `NatN` and a signed right shift
on `IntN`; the `+>>` operator is _not_ available on these types.

The motivation for this change is to eventually deprecate and remove the
`WordN` types.

* For values `x` of type `Blob`, an iterator over the elements of the blob
`x.vals()` is introduced. It works like `x.bytes()`, but returns the elements
as type `Nat8`.

* `mo-doc` now generates cross-references for types in signatures in
both the Html as well as the Asciidoc output. So a signature like
`fromIter : I.Iter<Nat> -> List.List<Nat>` will now let you click on
`I.Iter` or `List.List` and take you to their definitions.

* Bugfix: Certain ill-typed object literals are now prevented by the type
checker.

8 changes: 8 additions & 0 deletions doc/modules/language-guide/examples/grammar.txt
@@ -87,6 +87,10 @@
'/'
'%'
'**'
'+%'
'-%'
'*%'
'**%'
'&'
'|'
'^'
@@ -117,6 +121,10 @@
'/='
'%='
'**-'
'+%='
'-%='
'*%='
'**%='
'&='
'|='
'^='
6 changes: 2 additions & 4 deletions doc/modules/language-guide/pages/basic-concepts.adoc
@@ -226,15 +226,13 @@ As a broader language overview, however, we briefly summarize the other value fo
- Text values --- strings of unicode characters.
- Words --- fixed-width numbers, _without_ overflow checks, and _with_ explicit wrap-around semantics.

*Numbers.* By default, integers and natural numbers are _unbounded_ and do not overflow.
*Numbers.* By default, integers and natural numbers are _unbounded_ and do not overflow.
Instead, they use representations that grow to accommodate any finite number.

For practical reasons, {proglang} also includes _bounded_ types for integers and natural numbers, distinct from the default versions.
Each bounded variant has a fixed width (one of `8`, `16`, `32`, `64`) and each carries the potential for "`overflow`". If and when this event occurs, it is an error and causes the
<<overview-traps,program to trap>>.
There are no unchecked, uncaught overflows in {proglang}, except in well-defined situations, for specific (`Word`-based) types.

Word types permit bitwise operations that are unsupported by the other number types.
There are no unchecked, uncaught overflows in {proglang}, except in well-defined situations, for explicitly _wrapping_ operations (indicated by a `%` character in the operator) and specific (`Word`-based) types.
The language provides primitive built-ins to convert between these various number representations.

The link:language-manual{outfilesuffix}[language quick reference] contains a complete list of link:language-manual{outfilesuffix}#primitive-types[primitive types].
58 changes: 48 additions & 10 deletions doc/modules/language-guide/pages/language-manual.adoc
@@ -212,7 +212,7 @@ To simplify the presentation of available operators, operators and primitive typ

| A | Arithmetic | arithmetic operations
| L | Logical | logical/Boolean operations
| B | Bitwise | bitwise operations
| B | Bitwise | bitwise and wrapping operations
| O | Ordered | comparison
| T | Text | concatenation
|===
@@ -269,7 +269,7 @@ Equality and inequality are structural and based on the observable content of th
|===

[[syntax-ops-bitwise]]
=== Bitwise binary operators
=== Bitwise and wrapping binary operators

|===
| `<binop>` | Category |
@@ -279,9 +279,14 @@ Equality and inequality are structural and based on the observable content of th
| `^` | B | exclusive or
| `<<` | B | shift left
| `␣>>` | B | shift right *(must be preceded by whitespace)*
| `+>>` | B | signed shift right
| `+>>` | B | signed shift right (only on `Word`-types)
| `<<>` | B | rotate left
| `<>>` | B | rotate right

| `+%` | A | addition (wrap-on-overflow)
| `-%` | A | subtraction (wrap-on-overflow)
| `*%` | A | multiplication (wrap-on-overflow)
| `**%`| A | exponentiation (wrap-on-overflow)
|===

[[syntax-ops-string]]
@@ -311,9 +316,13 @@ Equality and inequality are structural and based on the observable content of th
| `^=` | B | in place exclusive or
| `<\<=` | B | in place shift left
| `>>=` | B | in place shift right
| `+>>=` | B | in place signed shift right
| `+>>=` | B | in place signed shift right (only on `Word`-types)
| `<<>=` | B | in place rotate left
| `<>>=` | B | in place rotate right
| `+%=` | B | in place add (wrap-on-overflow)
| `-%=` | B | in place subtract (wrap-on-overflow)
| `*%=` | B | in place multiply (wrap-on-overflow)
| `**%=` | B | in place exponentiation (wrap-on-overflow)
| `#=` | T | in place concatenation
|===

@@ -330,18 +339,18 @@ Tokens on the same line have equal precedence with the indicated associativity.

| LOWEST | none | `if _ _` (no `else`), `loop _` (no `while`)
|(higher)| none | `else`, `while`
|(higher)| right | `:=`, `+=`, `-=`, `*=`, `/=`, `%=`, `**=`, `#=`, `&=`, `\|=`, `^=`, `<\<=`, `>>-`, `<<>=`, `<>>=`
|(higher)| right | `:=`, `+=`, `-=`, `*=`, `/=`, `%=`, `**=`, `#=`, `&=`, `\|=`, `^=`, `<\<=`, `>>=`, `<<>=`, `<>>=`, `+%=`, `-%=`, `*%=`, `**%=`
|(higher)| left | `:`
|(higher)| left | `or`
|(higher)| left | `and`
|(higher)| none | `==`, `!=`, `<`, `>`, `\<=`, `>`, `>=`
|(higher)| left | `+`, `-`, `#`
|(higher)| left | `*`, `/`, `%`
|(higher)| left | `+`, `-`, `#`, `+%`, `-%`
|(higher)| left | `*`, `/`, `%`, `*%`
|(higher)| left | `\|`
|(higher)| left | `+&+`
|(higher)| left | `+^+`
|(higher)| none | `<<`, `>>`, `<<>`, `<>>`
| HIGHEST | left | `+**+`
| HIGHEST | left | `+**+`, `+**%+`
|===


@@ -684,19 +693,48 @@ In particular, every value of type `Nat` is also a value of type `Int`, without

The types `Int8`, `Int16`, `Int32` and `Int64` represent
signed integers with respectively 8, 16, 32 and 64 bit precision.
All have categories A (Arithmetic) and O (Ordered).
All have categories A (Arithmetic), B (Bitwise) and O (Ordered).

Operations that may under- or overflow the representation are checked and trap on error.

The operations `+%`, `-%`, `*%` and `**%` provide access to wrap-around, modular arithmetic.

As bitwise types, these types support bitwise operations *and* `(&)`,
*or* `(|)` and *exclusive-or* `(^)`. Further, they can be rotated
left `(<<>)`, right `(<>>)`, and shifted left `(<<)`, right `(>>)`.
The right-shift preserves the two's-complement sign.
All shift and rotate amounts are considered modulo the word's width *n*.

Bounded integer types are not in subtype relationship with each other or with
other arithmetic types, and their literals need type annotation if
the type cannot be inferred from context, e.g. `(-42 : Int16)`.

The corresponding module in the base library provides conversion functions:
Conversion to `Int`, checked and wrapping conversions from `Int` and wrapping
conversion to the bounded natural type of the same size.


[[bounded-naturals]]
=== Bounded naturals `Nat8`, `Nat16`, `Nat32` and `Nat64`

The types `Nat8`, `Nat16`, `Nat32` and `Nat64` represent
unsigned integers with respectively 8, 16, 32 and 64 bit precision.
All have categories A (Arithmetic) and O (Ordered).
All have categories A (Arithmetic), B (Bitwise) and O (Ordered).

Operations that may under- or overflow the representation are checked and trap on error.

The operations `+%`, `-%`, `*%` and `**%` provide access to the modular, wrap-on-overflow operations.

As bitwise types, these types support bitwise operations *and* `(&)`,
*or* `(|)` and *exclusive-or* `(^)`. Further, they can be rotated
left `(<<>)`, right `(<>>)`, and shifted left `(<<)`, right `(>>)`.
The right-shift is logical.
All shift and rotate amounts are considered modulo the word's width *n*.

The corresponding module in the base library provides conversion functions:
Conversion to `Nat`, checked and wrapping conversions from `Nat` and wrapping
conversion to the bounded integer type of the same size.

[[word-types]]
=== Word types

4 changes: 2 additions & 2 deletions doc/modules/language-guide/pages/modules-and-imports.adoc
@@ -116,10 +116,10 @@ This distinction can affect how some data structures are typed.

For the imported canister actor, types are derived from the Candid file—the _project-name_.did file—for the canister rather than from {proglang} itself.

The translation from {proglang} actor type to Candid service type is mostly, but not entirely, one-to-one, and there are some distinct {proglang} types that map to the same Candid type. For example, the {proglang} `Nat8` and `Word8` types both exported as Candid type `nat8`, but `nat8` is canonically imported as {proglang} `Nat8`, not `Word8`.
The translation from {proglang} actor type to Candid service type is mostly, but not entirely, one-to-one, and there are some distinct {proglang} types that map to the same Candid type. For example, the {proglang} `Nat32` and `Char` types are both exported as Candid type `nat32`, but `nat32` is canonically imported as {proglang} `Nat32`, not `Char`.

The type of an imported canister function, therefore, might differ from the type of the original {proglang} code that implements it.
For example, if the {proglang} function had type `+shared Word8 -> async Word64+` in the implementation, its exported Candid type would be `+(Nat8) -> Nat64+` but the {proglang} type imported from this Candid type will actually be the correct—but perhaps unexpected—type `+shared Nat8 -> async Nat64+`.
For example, if the {proglang} function had type `+shared Nat32 -> async Char+` in the implementation, its exported Candid type would be `+(nat32) -> (nat32)+` but the {proglang} type imported from this Candid type will actually be the correct—but perhaps unexpected—type `+shared Nat32 -> async Nat32+`.

These type differences are the result of the Candid-to-{proglang} composition layer inherent to the canister abstraction.

19 changes: 4 additions & 15 deletions doc/overview-slides.md
@@ -118,20 +118,10 @@ Literals: `13`, `0xf4`, `1_000_000`

## Bounded numbers (trapping)

`Nat8`, `Nat16`, `Nat32`, `Nat64`,
`Nat8`, `Nat16`, `Nat32`, `Nat64`,
`Int8`, `Int16`, `Int32`, `Int64`

Trap on over- and underflow.

Needs type annotations (somewhere)

Literals: `13`, `0xf4`, `-20`, `1_000_000`

## Bounded numbers (wrapping)

`Word8`, `Word16`, `Word32`, `Word64`

Wrap-around on over/under-flow. Use for bit-fiddling.
Trap on over- and underflow. Wrap-on-overflow and bit-manipulating operations available.

Needs type annotations (somewhere)

@@ -531,10 +521,10 @@ let ? name = d.find(1);
### Language prelude

* connects internal primitives with surface syntax (types, operations)
* conversions like `intToWord32`
* conversions like `intToNat32`
* side-effecting operations `debugPrintInt`
(tie into execution environment)
* utilities like `hashInt`, `clzWord32`
* utilities like `hashInt`, `clzNat32`


# Sample App
@@ -711,4 +701,3 @@ We focus on abstractions for implementing the database for the produce exchange:
- [Hash trie](https://github.com/dfinity-lab/motoko/blob/stdlib-examples/design/stdlib/trie.md): Immutable finite map representation based on hashing each key.

- [Association list](https://github.com/dfinity-lab/motoko/blob/stdlib-examples/design/stdlib/assocList.md): Immutable finite map representation based on a list of key-value pairs.
