💠 Cero language design notes

Built-in types

All built-in types are distinct types, not aliases of other built-in types. They are not keywords, but they are always accessible as global names unless noted otherwise.

Integer types

Unsigned integer types

Types	Notes
`uint8`	unsigned 8-bit integer
`uint16`	unsigned 16-bit integer
`uint32`	unsigned 32-bit integer
`uint64`	unsigned 64-bit integer
`uint128`	unsigned 128-bit integer
`uintptr`	unsigned pointer-sized integer
`usize`	unsigned integer for memory amounts, array indexing and object sizes; analogue of `size_t` in C and C++

Signed integer types

All of these use two's complement.

Types	Notes
`int8`	signed 8-bit integer
`int16`	signed 16-bit integer
`int32`	signed 32-bit integer
`int64`	signed 64-bit integer
`int128`	signed 128-bit integer
`intptr`	signed pointer-sized integer
`isize`	signed integer for memory offsets, pointer differences and array indexing with negative values; analogue of `ptrdiff_t` in C and C++ and of POSIX `ssize_t`
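
As an illustration of what two's complement implies for these types, the following Python sketch reinterprets a raw bit pattern of a given width as a signed value. Python is used only because its integers are unbounded, which makes the wrap-around explicit; `signed_range` and `to_signed` are hypothetical helper names and not part of Cero.

```python
def signed_range(width):
    """Return (min, max) representable by a two's complement integer of `width` bits."""
    return -(1 << (width - 1)), (1 << (width - 1)) - 1


def to_signed(bits, width):
    """Interpret the low `width` bits of `bits` as a two's complement value."""
    bits &= (1 << width) - 1      # keep only the low `width` bits
    if bits >> (width - 1):       # sign bit set: the value is negative
        return bits - (1 << width)
    return bits


print(signed_range(8))       # (-128, 127)
print(to_signed(0xFF, 8))    # -1
print(to_signed(0x80, 8))    # -128
print(to_signed(0x7F, 8))    # 127
```

Note that the range is asymmetric: there is one more negative value than there are positive values.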

Floating-point types

Types	Notes
`float16`	IEEE 754 binary16 format (5 exponent bits, 10 fraction bits)
`float32`	IEEE 754 binary32 format (8 exponent bits, 23 fraction bits)
`float64`	IEEE 754 binary64 format (11 exponent bits, 52 fraction bits)
`float128`	IEEE 754 binary128 format (15 exponent bits, 112 fraction bits)
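
To make the exponent and fraction widths in the table concrete, here is a small Python sketch that splits a value into its sign, biased exponent and fraction fields. It relies on the `struct` module, which supports binary16, binary32 and binary64 but not binary128, so `float128` is left out; `FORMATS` and `decode` are hypothetical names chosen for this illustration.

```python
import struct

# type name -> (struct float code, same-sized unsigned code, exponent bits, fraction bits)
FORMATS = {
    "float16": ("e", "H", 5, 10),
    "float32": ("f", "I", 8, 23),
    "float64": ("d", "Q", 11, 52),
}


def decode(value, type_name):
    """Split `value`, encoded in the given format, into (sign, biased exponent, fraction)."""
    float_code, int_code, exp_bits, frac_bits = FORMATS[type_name]
    (bits,) = struct.unpack(">" + int_code, struct.pack(">" + float_code, value))
    sign = bits >> (exp_bits + frac_bits)
    exponent = (bits >> frac_bits) & ((1 << exp_bits) - 1)
    fraction = bits & ((1 << frac_bits) - 1)
    return sign, exponent, fraction


print(decode(1.0, "float16"))   # (0, 15, 0)
print(decode(1.0, "float32"))   # (0, 127, 0)
print(decode(1.0, "float64"))   # (0, 1023, 0)
```

The biased exponent of 1.0 is the format's bias, which is 2^(exponent bits - 1) - 1 in each case.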

These types must be imported:

Types	Notes
`BFloat16`	Brain 16-bit format (hardware support with ARMv8.6-A extensions and Intel AVX-512 BF16 extensions)
`XFloat80`	x86 80-bit extended format (hardware support on x87)
`DFloat128`	double-double arithmetic (hardware support on PowerPC)
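
Two of these formats are easy to sketch in Python, under some assumptions. The bfloat16 layout is the upper half of a binary32 encoding (1 sign bit, 8 exponent bits, 7 fraction bits), so a truncating conversion is a 16-bit shift; a production conversion would usually round instead. Double-double arithmetic represents a value as the unevaluated sum of two binary64 numbers, and its basic building block is Knuth's error-free addition. The function names below are hypothetical, and nothing here says how Cero would implement these types.

```python
import math
import struct


def to_bfloat16_bits(value):
    """Truncating conversion: keep the upper 16 bits of the binary32 encoding."""
    (bits,) = struct.unpack(">I", struct.pack(">f", value))
    return bits >> 16


def from_bfloat16_bits(bits):
    """Widen a bfloat16 bit pattern back to a Python float (always exact)."""
    (value,) = struct.unpack(">f", struct.pack(">I", bits << 16))
    return value


def two_sum(a, b):
    """Knuth's TwoSum: return (s, e) where s is the rounded sum and s + e is exact."""
    s = a + b
    b_virtual = s - a
    e = (a - (s - b_virtual)) + (b - b_virtual)
    return s, e


print(hex(to_bfloat16_bits(math.pi)))   # 0x4049
print(from_bfloat16_bits(0x4049))       # 3.140625

# The low part keeps the 2**-60 that the rounded sum alone would lose.
high, low = two_sum(1.0, 2.0 ** -60)
print(high == 1.0, low == 2.0 ** -60)   # True True
```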

Array types

Vector types

Pointer types

`bool`

Type representing values that can only be false or true. Its size and alignment are 1. When used as the type of a bit field, its minimum bit size is 1. Can be cast to any integer type, in which case false becomes 0 and true becomes 1.

Creating a bool with any bit pattern other than 0x0 or 0x1, through memcpy or other mechanisms, is undefined behavior. Whether only using such a bool is undefined (rather than creating one) still has to be investigated. This UB is justified because existing C and C++ ABIs also leave this behavior undefined.

`void`

The unit type, which is also the default return type of functions, indicating that a function doesn't return anything. It is generally compatible with void from C and C++ but has none of its restrictions, so using it in a generic context has no exceptional behavior requiring awkward workarounds. The size of void is zero. Objects of type void are useless, so it is probably reasonable to generate a warning when one is constructed outside of a generic context.

Because zero-sized types are allowed, an empty struct has a layout identical to void.

It is probably unnecessary to give any special semantics to ^void. It would be fine to keep the convention of using it to represent type-erased pointers, as in C and C++. The lack of an implicit conversion from any pointer type to ^void will probably not be sorely missed, but that remains to be seen. Loads and stores of ^void are guaranteed to be no-ops (even when volatile).

It could be useful to make pointer arithmetic on ^[]void operate in units of 1. This would prevent a potential footgun in containers using pointer arithmetic for element count computations. This has precedent and seems to work fine in practice, since pointer-to-void arithmetic using a unit of 1 is provided by a GNU extension for C, seemingly without issues.

`never`

The bottom type that indicates that an operation never completes to produce a value. It is compatible with [[noreturn]] from C and C++ for the purpose of return types in function signatures. Because it is a bottom type, no objects of type never can ever be legally formed at runtime. The type system ensures this, and tricking the compiler into breaking this guarantee would cause undefined behavior. However, it is still an object type: it is possible to declare variables and fields of type never and to use it in other contexts where object declarations of otherwise normal types are accepted. They are just effectively unreachable. The representation of the never type is that of void: its size is 0 and its alignment is 1.

Function call expressions calling a never-returning function, break, continue, throw and return expressions have this type. Because it is the bottom type, it coerces to any other type. Expressions of this type are useful because they can be used in place of expressions where any other type would be expected, for example:

killProgram(String message) -> never {
    print(message);
    exit(-1); // exit also returns never
}

describeSomeEnumValue(SomeEnum e) -> String {
    return switch e {
        Value1 => { "First value" }
        Value2 => { "Second value" }
        Value3 => { "Third value" }
        else   => { killProgram("invalid enum value") }
    };
}

Constant types

Built-in types whose values only exist at compile-time and can only be directly used as constants.

User-defined types

`struct` types

Structs are composed of named fields made from other types. An empty struct has size 0 and alignment 1. A struct can inherit the interface and data of another struct by declaring a field without a name, which is an easy way to reuse code:

struct Image {
    uint32 width,
    uint32 height,
    ByteBuffer data
}

struct UserImage {
    Image,
    String userName
}

getWidth(UserImage userImage) -> uint32 {
    return userImage.width;
}

However, this will not make that struct a subtype of the other struct. To make it a subtype in addition to inheriting from it, name the field `this`:

struct Image {
    uint32 width,
    uint32 height,
    ByteBuffer data
}

struct UserImage {
    Image this,
    String userName
}

treatAsSupertype(^UserImage userImage) -> ^Image {
    return userImage;
}

`enum` types

Operators

Unless noted otherwise, operators marked as built-in for some category of types will also be built-in for vectors of such types.

Name	Syntax	Built-in for	Notes
Assignment	`a = b`	all built-in types	Not overloadable.
Addition	`a + b` `a += b`	integers, floats, `^[]T`	May cause overflow for integers.
Subtraction	`a - b` `a -= b`	integers, floats, `^[]T`	May cause overflow for integers.
Negation	`-a`	signed integers, floats	May cause overflow for integers.
Multiplication	`a * b` `a *= b`	integers, floats	May cause overflow for integers.
Division	`a / b` `a /= b`	integers, floats	May cause overflow for signed integers. May cause division-by-zero for integers. May cause division-by-zero for floats in fast-float mode.
Remainder	`a % b` `a %= b`	integers, floats	May cause division-by-zero for integers. May cause division-by-zero for floats in fast-float mode.
Exponentiation	`a ** b` `a **= b`	integers, floats	May cause overflow for integers.
Pre-increment	`++a`	integers, floats, `^[]T`	May cause overflow for integers.
Pre-decrement	`--a`	integers, floats, `^[]T`	May cause overflow for integers.
Post-increment	`a++`	integers, floats, `^[]T`	May cause overflow for integers.
Post-decrement	`a--`	integers, floats, `^[]T`	May cause overflow for integers.
Bitwise AND	`a & b` `a &= b`	integers, `bool`
Bitwise OR	`a \| b` `a \|= b`	integers, `bool`
XOR	`a ~ b` `a ~= b`	integers, `bool`
NOT	`~a`	integers, `bool`
Left shift	`a << b` `a <<= b`	integers
Right shift	`a >> b` `a >>= b`	integers
Logical AND	`a && b` `a &&= b`	`bool`	Not overloadable. Not available for `bool` vectors. Only evaluates `b` if `a` is true.
Logical OR	`a \|\| b` `a \|\|= b`	`bool`	Not overloadable. Not available for `bool` vectors. Only evaluates `b` if `a` is false.
Equal	`a == b`	all built-in types	Always returns `bool`.
Not equal	`a != b`	all built-in types	Always returns `bool`.
Less than	`a < b`	integers, floats, `^[]T`	Always returns `bool`. Not available for vectors.
Less than or equal	`a <= b`	integers, floats, `^[]T`	Always returns `bool`. Not available for vectors.
Greater than	`a > b`	integers, floats, `^[]T`	Always returns `bool`. Not available for vectors.
Greater than or equal	`a >= b`	integers, floats, `^[]T`	Always returns `bool`. Not available for vectors.
Address of	`&a`	all non-constant types, functions	Not overloadable.
Dereference	`a^`	pointers
Function call	`a(b)`	functions
Indexing	`a[b]`	arrays, vectors, `^[]T`
Member access	`a.b`	all types	Not overloadable.
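
The overflow notes on negation and on signed division follow from the asymmetric range of two's complement: the most negative value has no positive counterpart. The Python sketch below checks only whether a mathematical result is representable in 8 signed bits; what Cero does on overflow (wrap, trap or something else) is not specified in these notes, and `fits` is a hypothetical helper.

```python
def fits(value, width):
    """True if `value` is representable as a two's complement integer of `width` bits."""
    return -(1 << (width - 1)) <= value <= (1 << (width - 1)) - 1


INT8_MIN, INT8_MAX = -128, 127

print(fits(-INT8_MIN, 8))        # False: negation gives 128
print(fits(INT8_MIN // -1, 8))   # False: division gives 128
print(fits(INT8_MAX + 1, 8))     # False: addition gives 128
print(fits(INT8_MIN // 2, 8))    # True: -64 is representable
```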

Precedence

Operator precedence should be intuitive and lead to obvious behavior. When operators have no conventionally agreed upon precedence in mathematical notation or are otherwise ambiguous, combining them will lead to syntax errors as noted in the table below with footnotes. Those cases can be resolved by adding parentheses to explicitly specify the intent.

This is intended to prevent pitfalls when combining operators that are not frequently combined with each other, and therefore a user should not feel the need to consult the precedence table while programming.

Level	Operators		Associativity
1	Postfix `a.b` `a^` `a(b)` `a[b]` `a++` `a--`		left-to-right
2	Prefix `&a` `-a` `~a` `++a` `--a`, exponentiation `a ** b`		right-to-left⁽¹⁾
3	Multiplicative `a * b` `a / b` `a % b`	Bitwise `a & b` `a \| b` `a ~ b` `a << b` `a >> b`	left-to-right⁽²⁾
4	Additive `a + b` `a - b`	Bitwise `a & b` `a \| b` `a ~ b` `a << b` `a >> b`	left-to-right⁽²⁾
5	Comparison `a == b` `a != b` `a < b` `a <= b` `a > b` `a >= b`		left-to-right⁽³⁾
6	Logical `a && b` `a \|\| b`		left-to-right⁽⁴⁾
7	Assignment `a = b` `a += b` `a -= b` `a *= b` `a /= b` `a %= b` `a **= b` `a &= b` `a \|= b` `a ~= b` `a <<= b` `a >>= b` `a &&= b` `a \|\|= b`		right-to-left

⁽¹⁾: Associating unary - with ** is a syntax error, since there is no accepted convention on whether -a**b should mean (-a)**b or -(a**b).
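
For comparison, Python accepts the unparenthesized form and silently picks one of the two readings. This is only an illustration of the ambiguity in another language, not Cero code:

```python
a, b = 2, 2

print(-a ** b)      # -4: Python parses this as -(a ** b)
print(-(a ** b))    # -4
print((-a) ** b)    # 4
```

A reader expecting the other reading gets a result with the opposite sign, which is the kind of pitfall the syntax error is meant to rule out.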

⁽²⁾: A bitwise operator cannot be combined without parentheses with another arithmetic operator that isn't itself. For example, the binary & operator associates left with itself, so a & b & c is valid to write, but does not associate with other arithmetic operators. Therefore expressions like a & b | c, a & b * c or a + b & c result in a syntax error and require parentheses.
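
As a point of comparison, Python allows these mixed expressions and resolves them with a fixed precedence that is easy to misremember. The values below are arbitrary and chosen so that the two possible groupings give different results:

```python
a, b, c = 4, 1, 1

print(a + b & c)      # 1: Python parses this as (a + b) & c
print((a + b) & c)    # 1
print(a + (b & c))    # 5

x, y, z = 6, 1, 2

print(x & y * z)      # 2: Python parses this as x & (y * z)
print((x & y) * z)    # 0
```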

⁽³⁾: When comparison operators are associated with each other, like a <op> b <op> c, they behave as a <op> b && b <op> c, except that operand b is only evaluated once. Only transitive comparison chains are allowed, meaning those where if a <op> b is true and b <op> c is true, it implies a <op> c. The transitive comparisons are:
a < b < c
a < b <= c
a <= b < c
a <= b <= c
a == b == c
a > b > c
a > b >= c
a >= b > c
a >= b >= c

Non-transitive comparison chains, such as a < b > c, a == b > c or a != b != c, will result in a syntax error, since they are less useful, less clear and likely to cause confusion or bugs due to falsely expecting some kind of transitive relation to hold. If parentheses are used in any of these cases, the special chaining behavior does not occur, so (a == b) == c is not the same as a == b == c. It will instead equality-compare the boolean result of comparing a and b, with c.
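
Python has comparison chaining with the same evaluate-once rule for the middle operand, so it can serve as a rough model of the behavior described here. One difference to keep in mind: Python also accepts non-transitive chains such as `a < b > c`, which these notes would reject. The function `b` below exists only to count evaluations.

```python
calls = []


def b():
    """Middle operand with a visible side effect, used to count evaluations."""
    calls.append("b")
    return 5


a, c = 1, 10

chained = a < b() <= c             # b() is evaluated once
expanded = a < b() and b() <= c    # b() is evaluated twice
print(chained, expanded, len(calls))   # True True 3

# Parentheses turn chaining off: the bool result is compared with the third operand.
x, y, z = 2, 2, 2
print(x == y == z)      # True
print((x == y) == z)    # False, because True == 2 is False
```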

⁽⁴⁾: Associating && with || without parentheses is a syntax error, so a && b && c is valid to write, but a && b || c requires parentheses.
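
Python spells these operators `and` and `or` and gives `and` the higher precedence, which shows how the unparenthesized mix can be misread. Again, this is a comparison with another language and not Cero code:

```python
a, b, c = True, False, False

print(a or b and c)      # True: Python parses this as a or (b and c)
print(a or (b and c))    # True
print((a or b) and c)    # False
```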

Comments

Line comments

use ce.io;

main() {
    // A line comment begins with `//` and continues until the next new-line character.
    print("Hello world!"); // Another comment.
}

Block comments

use ce.io;

main() {
    /* A block comment begins with `/*` and must be closed with `*/`.
     * It's conventional to put a star at the beginning of every new block comment line.
     */
    print("Hello world!" /* Block comments don't end at
                            the end of the line. */);

    /* Block comments can be /*/* nested */*/
     * as many times
     * /*/*/**/*/*/
     * as you want. */
}
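
Nesting means a lexer cannot stop at the first `*/`; it has to track how many comments are currently open. A minimal Python sketch of that idea follows, under the simplifying assumption that string literals and line comments are not recognized, so comment markers inside them would be treated as real markers. `strip_block_comments` is a hypothetical helper and not taken from any Cero implementation.

```python
def strip_block_comments(source):
    """Remove nestable /* ... */ comments from `source` using a depth counter."""
    out = []
    depth = 0   # number of block comments currently open
    i = 0
    while i < len(source):
        pair = source[i:i + 2]
        if pair == "/*":
            depth += 1
            i += 2
        elif pair == "*/" and depth > 0:
            depth -= 1
            i += 2
        else:
            if depth == 0:          # only keep text outside of comments
                out.append(source[i])
            i += 1
    if depth != 0:
        raise SyntaxError("unterminated block comment")
    return "".join(out)


text = "a /* one /* two */ still a comment */ b"
print(repr(strip_block_comments(text)))   # 'a  b'
```

With a non-nesting rule, the comment in `text` would end at the first `*/` and the words "still a comment" would be lexed as code.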

Documentation comments

Documentation comments might be added either as a distinct language feature or as a variant of block comments that is recognized when the comment text follows a certain format.

Syntax alternatives

It might make sense to change some parts of the syntax to these alternative designs.

Alternate built-in type names

uint8   => u8          int8   => i8
uint16  => u16         int16  => i16       float16  => f16
uint32  => u32         int32  => i32       float32  => f32
uint64  => u64         int64  => i64       float64  => f64
uint128 => u128        int128 => i128      float128 => f128
usize   => usize       isize  => isize
uintptr => uptrsize    intptr => iptrsize

Undesired features

Features that I would avoid adding to the language at all costs, in decreasing order of undesirability:

Garbage collection, because it defeats the purpose of the language, which is full control over memory and static memory management; even opt-in GC is detrimental because it splits the ecosystem (see the D language as a case study)
Preprocessor, because it's just unnecessary complexity
Undefined behavior outside of specially designated code blocks in which it can be caused
Built-in high-level data types like dynamic arrays, maps, strings, because the language should be expressive enough to implement these things in libraries efficiently and conveniently
Uniform function call syntax (UFCS), because it adds unnecessary complexity for little benefit, and introducing arbitrary choices in writing function calls adds nothing of value; a library author should be able to decide how their facilities are used syntactically by users; note that this doesn't exclude a feature like extension methods or traits, UFCS itself is just syntax sugar that complicates name lookup
User-defined operator syntax, because it makes parsing much harder and encourages write-only code
Invisible C++-style exception handling
C-style macros, because they are unhygienic and other metaprogramming features should be powerful enough to make them unnecessary
Rust-style macros, because of their complexity and being very divorced from the rest of the language, and also because metaprogramming features should be powerful enough to make them unnecessary
First-class built-in tuple types, because their primary use cases are provided by other mechanisms and usually normal structs are preferable anyway because giving a name to each tuple member and the tuple as a whole makes code more readable; multiple return values can just be provided as-is and destructuring has to be provided for convenience for struct types anyway
goto
defer
Named arguments, because changing parameter names should not lead to API breaks
