diff --git a/proposals/p2173.md b/proposals/p2173.md new file mode 100644 index 0000000000000..f8f780a8be8a3 --- /dev/null +++ b/proposals/p2173.md @@ -0,0 +1,1360 @@ +# Associated constant assignment versus equality + + + +[Pull request](https://github.com/carbon-language/carbon-lang/pull/2173) + + + +## Table of contents + +- [Abstract](#abstract) +- [Problem](#problem) + - [Equal types with different interfaces](#equal-types-with-different-interfaces) + - [Type canonicalization](#type-canonicalization) + - [Transitivity of equality](#transitivity-of-equality) + - [Coherence](#coherence) +- [Background](#background) +- [Proposal](#proposal) +- [Details](#details) + - [Rewrite constraints](#rewrite-constraints) + - [Combining constraints with `&`](#combining-constraints-with-) + - [Combining constraints with `and`](#combining-constraints-with-and) + - [Combining constraints with `extends`](#combining-constraints-with-extends) + - [Combining constraints with `impl as` and `is`](#combining-constraints-with-impl-as-and-is) + - [Constraint resolution](#constraint-resolution) + - [Precise rules and termination](#precise-rules-and-termination) + - [Qualified name lookup](#qualified-name-lookup) + - [Type substitution](#type-substitution) + - [Examples](#examples) + - [Termination](#termination) + - [Same type constraints](#same-type-constraints) + - [Implementation of same-type `ImplicitAs`](#implementation-of-same-type-implicitas) + - [Observe declarations](#observe-declarations) + - [Implementing an interface](#implementing-an-interface) + - [Ergonomics](#ergonomics) + - [Implementation experience](#implementation-experience) +- [Rationale](#rationale) +- [Alternatives considered](#alternatives-considered) + - [Status quo](#status-quo) + - [Restrict constraints to allow computable type equality](#restrict-constraints-to-allow-computable-type-equality) + - [Find a fully transitive approach to type equality](#find-a-fully-transitive-approach-to-type-equality) + - [Different syntax for rewrite constraint](#different-syntax-for-rewrite-constraint) + - [Different syntax for same-type constraint](#different-syntax-for-same-type-constraint) + - [Required ordering for rewrites](#required-ordering-for-rewrites) + - [Multi-constraint `where` clauses](#multi-constraint-where-clauses) + - [Rewrite constraints in `impl as` constraints](#rewrite-constraints-in-impl-as-constraints) + + + +## Abstract + +Split the `where A == B` constraint in two: `where .A = B` produces a new +constraint from an old one, where the value of `.A` in the new constraint is +known and eagerly rewritten to `B`, and `where A == B`, which does not cause `A` +and `B` to be considered as identical by language rules but does permit implicit +(no-op) conversion between them. + +This aims to provide an efficiently-computable and human-understandable type +equality rule, with type canonicalization and therefore transitive type +equality, without sacrificing too much in the way of ergonomics and without +sacrificing determinism, while still providing the full power of a general type +constraint system in a less ergonomic form. + +## Problem + +### Equal types with different interfaces + +When an associated type is constrained to be a concrete type, it is desirable +for the associated type to behave like that concrete type: + +``` +interface Hashable { + fn Hash[me: Self]() -> u128; +} +interface X { + let R:! Hashable; + fn Make() -> R; +} +fn F[T:! X where .R == i32](x: T) -> i32 { + return T.Make() + 1; +} +``` + +Here, we want to be able to use the result of `F.Make()` in all regards like an +`i32`. Assuming an `i32.Abs` function exists, and `i32` implements `Hashable` +externally, it is more desirable for `F.Make().Abs()` to work than it is for +`F.Make().Hash()` to work. + +We are +[currently aiming to address this](https://github.com/carbon-language/carbon-lang/pull/2070) +by saying that when two type expressions are constrained to be equal, a value +whose type is either of those type expressions provides the (disjoint) union of +their interfaces. This leads to a surprise: + +``` +fn G[T:! X where .R == i32](x: T) -> u128 { + let n: i32 = 1; + let m: i64 = 2; + return n.Hash() ^ m.Hash(); +} +``` + +Here, the call to `n.Hash()` is valid but the call to `m.Hash()` is invalid, +because the type `i32` of `n` is constrained to be equal to `T.R` which is of +type `Hashable`, but the type `i64` is not constrained in that way and does not +implement `Hashable` internally. + +This suggests that there might be two different behaviors that we want when +adding a constraint: either we are changing the interface, or not. This is +analogous to internal versus external implementation of an interface. + +### Type canonicalization + +There are a variety of contexts in which types are compared for equality. In +some of those contexts, it might be acceptable to perform a search and a +computation, and it might be acceptable to have false negatives. For example, +when analyzing an argument being passed to a function in a generic, it might be +OK to say "we can't prove that the argument has a type that is equal to (or +convertible to) the type of the parameter", so long as the developer has some +way to demonstrate that the types are the same if that is indeed the case. + +However, not all comparisons of generic types have that property. As an extreme +and hopefully avoidable example, suppose a C++-like linkage model is used, with +name mangling. We may need to determine whether two type expressions are equal +at link time, by reducing them both to strings that are guaranteed to be +identical if the types are equal and non-identical otherwise. This requires some +form of type canonicalization: reducing all equal type expressions to some +common representation where we can faithfully perform equality comparisons. + +Even if the semantics of Carbon don't require canonicalization, it is still a +very useful property for implementation purposes. For example, if type +canonicalization is possible to perform efficiently, building a data structure +mapping from a type to some value becomes much simpler, and if the +representation of a type expression carries its canonical form then type +equality comparisons can be done in constant time. + +### Transitivity of equality + +For ergonomic purposes, it is valuable for the notion of type expression +equality to be transitive: if type A is known to be the same as type B, and type +B is known to be the same as type C, then shouldn't we be able to use an A where +a C is required? See +[#1369](https://github.com/carbon-language/carbon-lang/issues/1369) for an +explanation of why this is important. + +However, if we permit arbitrary constraints and have a transitive equality rule, +then type equality is isomorphic to the +[word problem for finitely-presented monoids](https://en.wikipedia.org/wiki/Word_problem_for_groups), +which is undecidable, and certainly not computable, in general. + +Therefore we must impose a restriction somewhere: either not all constraints are +permissible, or we do not have transitivity, or we impose some other constraint +on the language to avoid the hard cases. Some languages merely impose some kind +of depth limit on their search for a proof of equality. + +Note that if we can perform type canonicalization, then type equality is +necessarily transitive: if all equal _pairs_ of types canonicalize to the same +representation, then _all_ equal types canonicalize to the same representation +transitively. + +### Coherence + +In order to avoid surprises in the language behavior, we want Carbon to have the +property that learning more about the type of a value should not cause the +behavior of an expression to change from one valid behavior to another valid +behavior. + +It's important to distinguish here between learning more about a type, such as +learning that it implements a particular interface – for example, from an added +import – and changing the type, such as by writing a different expression after +a `:` or `:!`. Changing the type of a value is expected to be able to change the +behavior of an expression using that value. + +## Background + +There has been a lot of work in this space. Some previous relevant proposals: + +- [#731: generics details 2](https://github.com/carbon-language/carbon-lang/pull/731) + introduced associated types, with the intent that a mechanism to constrain + their value would be provided in a later proposal. +- [#818: generics details 3](https://github.com/carbon-language/carbon-lang/pull/818) + introduced `where` constraints for generics, with an open question for + whether and how we might restrict constraints in order to give a decidable + type equality rule. + - #818 represents the status quo of approved proposals. It has both `=` + and `==` where constraints, which are similar to, but somewhat different + from, those in this proposal. + - `=` constraints require a concrete type on their right-hand side. The + left-hand side is not restricted. The facet type of the left-hand side + type is changed to match the right-hand side. + - `==` constraints do not affect the facet type of either operand. + - As an answer to the question of how to produce a decidable type equality + rule, both kinds of constraints are applied automatically, but only at a + depth of one constraint. Neither form is transitive but both can be + transitively extended using `observe` declarations. +- [#2070: always `==` not `=` in `where` clauses](https://github.com/carbon-language/carbon-lang/pull/2070) + describes an attempt to unify the syntax and semantics of `=` and `==` + constraints, under which we would always merge the facet types of the + operands. That proposal has not yet been accepted, and this proposal intends + to supersede it. + +## Proposal + +Replace the semantics of the existing `where A == B` and `where A = B` +constraints and restrict the syntax of `where A = B` as follows: + +- `where .A = T` specifies a _rewrite_ constraint. The left-hand side is + required to name an associated constant. If that associated constant `.A` is + named as a member of a type declared with this constraint, it is rewritten + to `T`. This means that the interface of the type changes, and the behavior + is in all respects as if `T` had been specified directly. +- `where X == Y` specifies a _same-type_ constraint. This is treated much like + `where X is SameAs(Y)`. Here, `SameAs` is a hypothetical constraint that is + reflexive (`T is SameAs(T)` for all `T`), commutative (`T is SameAs(U)` if + and only if `U is SameAs(T)`), and not implementable by the developer. + Moreover, if `F` is any pure type function, `T is SameAs(U)` implies + `F(T) is SameAs(F(U)`). `observe` statements can be used to form transitive + `SameAs` relations. Same-type constraints are not used automatically by the + language rules for any purpose, but a blanket `ImplicitAs` impl permits + conversions between types that are known to be the same: + ``` + impl forall [T:! type, U:! type where .Self == T] T as ImplicitAs(U); + ``` + +A type implements `C where .A = T` if and only if it implements +`C where .A == T`, assuming both constraints are valid – the inhabitants of +these two facet types are the same, but they are different facet types and +provide different interfaces for corresponding type values. + +## Details + +### Rewrite constraints + +In a rewrite constraint, the left-hand operand of `=` must be a `.` followed by +the name of an associated constant. `.Self` is not permitted. + +``` +interface RewriteSelf { + // ❌ Error: `.Self` is not the name of an associated constant. + let Me:! type where .Self = Self; +} +interface HasAssoc { + let Assoc:! type; +} +interface RewriteSingleLevel { + // ✅ Uses of `A.Assoc` will be rewritten to `i32`. + let A:! HasAssoc where .Assoc = i32; +} +interface RewriteMultiLevel { + // ❌ Error: Only one level of associated constant is permitted. + let B:! RewriteSingleLevel where .A.Assoc = i32; +} +``` + +This notation is permitted anywhere a constraint can be written, and results in +a new constraint with a different interface: the named member effectively no +longer names an associated constant of the constrained type, and is instead +treated as a rewrite rule that expands to the right-hand side of the constraint, +with any mentioned parameters substituted into that type. + +``` +interface Container { + let Element:! type; + let Slice:! Container where .Element = Element; + fn Add[addr me: Self*](x: Element); +} +// `T.Slice.Element` rewritten to `T.Element` +// because type of `T.Slice` says `.Element = Element`. +// `T.Element` rewritten to `i32` +// because type of `T` says `.Element = i32`. +fn Add[T:! Container where .Element = i32](p: T*, y: T.Slice.Element) { + // ✅ Argument `y` has the same type `i32` as parameter `x` of + // `T.(Container.Add)`, which is also rewritten to `i32`. + p->Add(y); +} +``` + +Rewrites aren't performed on the left-hand side of such an `=`, so +`where .A = .B and .A = C` is not rewritten to `where .A = .B and .B = C`. +Instead, such a `where` clause is invalid when the constraint is +[resolved](#constraint-resolution) unless each rule for `.A` specifies the same +rewrite. + +Note that `T:! C where .R = i32` can result in a type `T.R` whose behavior is +different from the behavior of `T.R` given `T:! C`. For example, member lookup +into `T.R` can find different results and operations can therefore have +different behavior. However, this does not violate coherence because the facet +types `C` and `C where .R = i32` don't differ by merely having more type +information; rather, they are different facet types that have an isomorphic set +of values, somewhat like `i32` and `u32`. An `=` constraint is not merely +learning a new fact about a type, it is requesting different behavior. + +### Combining constraints with `&` + +Suppose we have `X = C where .R = A` and `Y = C where .R = B`. What should +`C & X` produce? What should `X & Y` produce? + +We could perhaps say that `X & Y` results in a facet type where the type of `R` +has the union of the interface of `A` and the interface of `B`, and that `C & X` +similarly results in a facet type where the type of `R` has the union of the +interface of `A` and the interface originally specified by `C`. However, this +proposal suggests a simpler rule: + +- Combining two rewrite rules with different rewrite targets results in a + facet type where the associated constant is ambiguous. Given `T:! X & Y`, + the type expression `T.R` is ambiguous between a rewrite to `A` and a + rewrite to `B`. But given `T:! X & X`, `T.R` is unambiguously rewritten to + `A`. +- Combining a constraint with a rewrite rule with a constraint with no rewrite + rule preserves the rewrite rule. For example, supposing that + `interface Container` extends `interface Iterable`, and `Iterable` has an + associated constant `Element`, the constraint + `Container & (Iterable where .Element = i32)` is the same as the constraint + `(Container & Iterable) where .Element = i32` which is the same as the + constraint `Container where .Element = i32`. + +If the rewrite for an associated constant is ambiguous, the facet type is +rejected during [constraint resolution](#constraint-resolution). + +### Combining constraints with `and` + +It's possible for one `=` constraint in a `where` to refer to another. When this +happens, the facet type `C where A and B` is interpreted as +`(C where A) where B`, so rewrites in `A` are applied immediately to names in +`B`, but rewrites in `B` are not applied to names in `A` until the facet type is +[resolved](#constraint-resolution): + +``` +interface C { + let T:! type; + let U:! type; + let V:! type; +} +class M { + alias Me = Self; +} +// ✅ Same as `C where .T = M and .U = M.Me`, which is +// the same as `C where .T = M and .U = M`. +fn F[A:! C where .T = M and .U = .T.Me]() {} +// ❌ No member `Me` in `A.T:! type`. +fn F[A:! C where .U = .T.Me and .T = M]() {} +``` + +### Combining constraints with `extends` + +Within an interface or named constraint, `extends` can be used to extend a +constraint that has rewrites. + +``` +interface A { + let T:! type; + let U:! type; +} +interface B { + extends A where .T = .U and .U = i32; +} + +var n: i32; + +// ✅ Resolved constraint on `T` is +// `B where .(A.T) = i32 and .(A.U) = i32`. +// `T.(A.T)` is rewritten to `i32`. +fn F(T:! B) -> T.(A.T) { return n; } +``` + +### Combining constraints with `impl as` and `is` + +Within an interface or named constraint, the `impl T as C` syntax does not +permit `=` constraints to be specified directly. However, such constraints can +be specified indirectly, for example if `C` is a named constraint that contains +rewrites. Because these constraints don't change the type of `T`, rewrite +constraints in this context will never result in a rewrite being performed, and +instead are equivalent to `==` constraints. + +``` +interface A { + let T:! type; + let U:! type; +} +constraint C { + extends A where .T = .U and .U = i32; +} +constraint D { + extends A where .T == .U and .U == i32; +} +interface B { + // OK, equivalent to `impl as D;` or + // `impl as A where .T == .U and .U == i32;`. + impl as C; +} + +var n: i32; + +// ❌ No implicit conversion from `i32` to `T.(A.T)`. +// Resolved constraint on `T` is +// `B where T.(A.T) == T.(A.U) and T.(A.U) = i32`. +// `T.(A.T)` is single-step equal to `T.(A.U)`, and +// `T.(A.U)` is single-step equal to `i32`, but +// `T.(A.T)` is not single-step equal to `i32`. +fn F(T:! B) -> T.(A.T) { return n; } +``` + +A purely syntactic check is used to determine if an `=` is specified directly in +an expression: + +- An `=` constraint is specified directly in its enclosing `where` expression. +- If an `=` constraint is specified directly in an operand of an `&` or + `(`...`)`, then it is specified directly in that enclosing expression. + +In an `impl as C` or `impl T as C` declaration in an interface or named +constraint, `C` is not allowed to directly specify any `=` constraints: + +``` +// Compile-time identity function. +fn Identity[T:! type](x:! T) -> T { return x; } + +interface E { + // ❌ Rewrite constraint specified directly. + impl as A where .T = .U and .U = i32; + // ❌ Rewrite constraint specified directly. + impl as type & (A where .T = .U and .U = i32); + // ✅ Not specified directly, but does not result + // in any rewrites being performed. + impl as Identity(A where .T = .U and .U = i32); +} +``` + +The same rules apply to `is` constraints. Note that `.T == U` constraints are +also not allowed in this context, because the reference to `.T` is rewritten to +`.Self.T`, and `.Self` is ambiguous. + +``` +// ❌ Rewrite constraint specified directly in `is`. +fn F[T:! A where .U is (A where .T = i32)](); + +// ❌ Reference to `.T` in same-type constraint is ambiguous: +// does this mean the outer or inner `.Self.T`? +fn G[T:! A where .U is (A where .T == i32)](); + +// ✅ Not specified directly, but does not result +// in any rewrites being performed. Return type +// is not rewritten to `i32`. +fn H[T:! type where .Self is C]() -> T.(A.U); + +// ✅ Return type is rewritten to `i32`. +fn I[T:! C]() -> T.(A.U); +``` + +### Constraint resolution + +When a facet type is used as the declared type of a type `T`, the constraints +that were specified within that facet type are _resolved_ to determine the +constraints that apply to `T`. This happens: + +- When the constraint is used explicitly, when declaring a generic parameter + or associated constant of the form `T:! Constraint`. +- When declaring that a type implements a constraint with an `impl` + declaration, such as `impl T as Constraint`. Note that this does not include + `impl ... as` constraints appearing in `interface` or `constraint` + declarations. + +In each case, the following steps are performed to resolve the facet type's +abstract constraints into a set of constraints on `T`: + +- If multiple rewrites are specified for the same associated constant, they + are required to be identical, and duplicates are discarded. +- Rewrites are performed on other rewrites in order to find a fixed point, + where no rewrite applies within any other rewrite. If no fixed point exists, + the generic parameter declaration or `impl` declaration is invalid. +- Rewrites are performed throughout the other constraints in the facet type -- + that is, in any `==` constraints and `is` constraints -- and the type + `.Self` is replaced by `T` throughout the constraint. + +``` +// ✅ `.T` in `.U = .T` is rewritten to `i32` when initially +// forming the facet type. +// Nothing to do during constraint resolution. +fn InOrder[A:! C where .T = i32 and .U = .T]() {} +// ✅ Facet type has `.T = .U` before constraint resolution. +// That rewrite is resolved to `.T = i32`. +fn Reordered[A:! C where .T = .U and .U = i32]() {} +// ✅ Facet type has `.U = .T` before constraint resolution. +// That rewrite is resolved to `.U = i32`. +fn ReorderedIndirect[A:! (C where .T = i32) & (C where .U = .T)]() {} +// ❌ Constraint resolution fails because +// no fixed point of rewrites exists. +fn Cycle[A:! C where .T = .U and .U = .T]() {} +``` + +To find a fixed point, we can perform rewrites on other rewrites, cycling +between them until they don't change or until a rewrite would apply to itself. +In the latter case, we have found a cycle and can reject the constraint. Note +that it's not sufficient to expand only a single rewrite until we hit this +condition: + +``` +// ❌ Constraint resolution fails because +// no fixed point of rewrites exists. +// If we only expand the right-hand side of `.T`, +// we find `.U`, then `.U*`, then `.U**`, and so on, +// and never detect a cycle. +// If we alternate between them, we find +// `.T = .U*`, then `.U = .U**`, then `.V = .U***`, +// then `.T = .U**`, then detect that the `.U` rewrite +// would apply to itself. +fn IndirectCycle[A:! C where .T = .U and .U = .V* and .V = .U*](); +``` + +After constraint resolution, no references to rewritten associated constants +remain in the constraints on `T`. + +If a facet type is never used to constrain a type, it is never subject to +constraint resolution, and it is possible for a facet type to be formed for +which constraint resolution would always fail. For example: + +``` +package Broken api; + +interface X { + let T:! type; + let U:! type; +} +let Bad:! auto = (X where .T = .U) & (X where .U = .T); +// Bad is not used here. +``` + +In such cases, the facet type `Broken.Bad` is not usable: any attempt to use +that facet type to constrain a type would perform constraint resolution, which +would always fail because it would discover a cycle between the rewrites for +`.T` and for `.U`. In order to ensure that such cases are diagnosed, a trial +constraint resolution is performed for all facet types. Note that this trial +resolution can be skipped for facet types that are actually used, which is the +common case. + +### Precise rules and termination + +This section explains the detailed rules used to implement rewrites. There are +two properties we aim to satisfy: + +1. After type-checking, no symbolic references to associated constants that + have an associated rewrite rule remain. +2. Type-checking always terminates in a reasonable amount of time, ideally + linear in the size of the problem. + +Rewrites are applied in two different phases of program analysis. + +- During qualified name lookup and type checking for qualified member access, + if a rewritten member is looked up, the right-hand side's value and type are + used for subsequent checks. +- During substitution, if the symbolic name of an associated constant is + substituted into, and substitution into the left-hand side results in a + value with a rewrite for that constant, that rewrite is applied. + +In each case, we always rewrite to a value that satisfies property 1 above, and +these two steps are the only places where we might form a symbolic reference to +an associated cosntant, so property 1 is recursively satisfied. Moreover, we +apply only one rewrite in each of the above cases, satisfying property 2. + +#### Qualified name lookup + +Qualified name lookup into either a type parameter or into an expression whose +type is a symbolic type `T` -- either a type parameter or an associated type -- +considers names from the facet type `C` of `T`, that is, from `T`’s declared +type. + +``` +interface C { + let M:! i32; + let U:! C; +} +fn F[T:! C](x: T) { + // Value is C.M in all four of these + let a: i32 = x.M; + let b: i32 = T.M; + let c: i32 = x.U.M; + let d: i32 = T.U.M; +} +``` + +When looking up the name `N`, if `C` is an interface `I` and `N` is the name of +an associated constant in that interface, the result is a symbolic value +representing "the member `N` of `I`". If `C` is formed by combining interfaces +with `&`, all such results are required to find the same associated constant, +otherwise we reject for ambiguity. + +If a member of a class or interface is named in a qualified name lookup, the +type of the result is determined by performing a substitution. For an interface, +`Self` is substituted for the self type, and any parameters for that class or +interface (and enclosing classes or interfaces, if any) are substituted into the +declared type. + +``` +interface SelfIface { + fn Get[me: Self]() -> Self; +} +class UsesSelf(T:! type) { + // Equivalent to `fn Make() -> UsesSelf(T)*;` + fn Make() -> Self*; + impl as SelfIface; +} + +// ✅ `T = i32` is substituted into the type of `UsesSelf(T).Make`, +// so the type of `UsesSelf(i32).Make` is `fn () -> UsesSelf(i32)*`. +let x: UsesSelf(i32)* = UsesSelf(i32).Make(); + +// ✅ `Self = UsesSelf(i32)` is substituted into the type +// of `SelfIface.Get`, so the type of `UsesSelf(i32).(SelfIface.Get)` +// is `fn [me: UsesSelf(i32)]() -> UsesSelf(i32)`. +let y: UsesSelf(i32) = x->Get(); +``` + +None of this is new in this proposal. This proposal adds the rule: + +If a facet type `C` into which lookup is performed includes a `where` clause +saying `.N = U`, and the result of qualified name lookup is the associated +constant `N`, that result is replaced by `U`, and the type of the result is the +type of `U`. No substitution is performed in this step, not even a `Self` +substitution -- any necessary substitutions were already performed when forming +the facet type `C`, and we don’t consider either the declared type or value of +the associated constant at all for this kind of constraint. Going through an +example in detail: + +``` +interface A { + let T:! type; +} +interface B { + let U:! type; + // More explicitly, this is of type `A where .(A.T) = Self.(B.U)` + let V:! A where .T = U; +} +// Type of T is B. +fn F[T:! B](x: T) { + // The type of the expression `T` is `B`. + // `T.V` finds `B.V` with type `A where .(A.T) = Self.(B.U)`. + // We substitute `Self` = `T` giving the type of `u` as + // `A where .(A.T) = T.(B.U)`. + let u:! auto = T.V; + // The type of `u` is `A where .(A.T) = T.(B.U)`. + // Lookup for `u.T` resolves it to `u.(A.T)`. + // So the result of the qualified member access is `T.(B.U)`, + // and the type of `v` is the type of `T.(B.U)`, namely `type`. + // No substitution is performed in this step. + let v:! auto = u.T; +} +``` + +The more complex case of + +``` +fn F2[T:! B where .U = i32](x: T); +``` + +is discussed later. + +#### Type substitution + +At various points during the type-checking of a Carbon program, we need to +substitute a set of (binding, value) pairs into a symbolic value. We saw an +example above: substituting `Self = T` into the type `A where .(A.T) = Self.U` +to produce the value `A where .(A.T) = T.U`. Another important case is the +substitution of inferred parameter values into the type of a function when +type-checking a function call: + +``` +fn F[T:! C](x: T) -> T; +fn G(n: i32) -> i32 { + // Deduces T = i32, which is substituted + // into the type `fn (x: T) -> T` to produce + // `fn (x: i32) -> i32`, giving `i32` as the type + // of the call expression. + return F(n); +} +``` + +Qualified name lookup is not re-done as a result of type substitution. For a +template, we imagine there’s a completely separate step that happens before type +substitution, where qualified name lookup is redone based on the actual value of +template arguments; this proceeds as described in the previous section. +Otherwise, we performed the qualified name lookup when type-checking the +generic, and do not do it again: + +``` +interface IfaceHasX { + let X:! type; +} +class ClassHasX { + class X {} +} +interface HasAssoc { + let Assoc:! IfaceHasX; +} + +// Qualified name lookup finds `T.(HasAssoc.Assoc).(IfaceHasX.X)`. +fn F(T:! HasAssoc) -> T.Assoc.X; + +fn G(T:! HasAssoc where .Assoc = ClassHasX) { + // `T.Assoc` rewritten to `ClassHasX` by qualified name lookup. + // Names `ClassHasX.X`. + var a: T.Assoc.X = {}; + // Substitution produces `ClassHasX.(IfaceHasX.X)`, + // not `ClassHasX.X`. + var b: auto = F(T); +} +``` + +During substitution, we might find a member access that named an opaque symbolic +associated constant in the original value can now be resolved to some specific +value. It’s important that we perform this resolution: + +``` +interface A { + let T:! type; +} +class K { fn Member(); } +fn H[U:! A](x: U) -> U.T; +fn J[V:! A where .T = K](y: V) { + // We need the interface of `H(y)` to include + // `K.Member` in order for this lookup to succeed. + H(y).Member(); +} +``` + +The values being substituted into the symbolic value are themselves already +fully substituted and resolved, and in particular, satisfy property 1 given +above. + +Substitution proceeds by recursively rebuilding each symbolic value, bottom-up, +replacing each substituted binding with its value. Doing this naively will +propagate values like `i32` in the `F`/`G` case earlier in this section, but +will not propagate rewrite constants like in the `H`/`J` case. The reason is +that the `.T = K` constraint is in the _type_ of the substituted value, rather +than in the substituted value itself: deducing `T = i32` and converting `i32` to +the type `C` of `T` preserves the value `i32`, but deducing `U = V` and +converting `V` to the type `A` of `U` discards the rewrite constraint. + +In order to apply rewrites during substitution, we associate a set of rewrites +with each value produced by the recursive rebuilding procedure. This is somewhat +like having substitution track a refined facet type for the type of each value, +but we don’t need -- or want -- for the type to change during this process, only +for the rewrites to be properly applied. For a substituted binding, this set of +rewrites is the rewrites found on the type of the corresponding value prior to +conversion to the type of the binding. When rebuilding a member access +expression, if we have a rewrite corresponding to the accessed member, then the +resulting value is the target of the rewrite, and its set of rewrites is that +found in the type of the target of the rewrite, if any. Because the target of +the rewrite is fully resolved already, we can ask for its type without +triggering additional work. In other cases, the rewrite set is empty; all +necessary rewrites were performed when type-checking the value we're +substituting into. + +Continuing an example from [qualified name lookup](#qualified-name-lookup): + +``` +interface A { + let T:! type; +} +interface B { + let U:! type; + let V:! A where .T = U; +} + +// Type of the expression `T` is `B where .(B.U) = i32` +fn F2[T:! B where .U = i32](x: T) { + // The type of the expression `T` is `B where .U = i32`. + // `T.V` is looked up and finds the associated type `(B.V)`. + // The declared type is `A where .(A.T) = Self.U`. + // We substitute `Self = T` with rewrite `.U = i32`. + // The resulting type is `A where .(A.T) = i32`. + // So `u` is `T.V` with type `A where .(A.T) = i32`. + let u:! auto = T.V; + // The type of `u` is `A where .(A.T) = i32`. + // Lookup for `u.T` resolves it to `u.(A.T)`. + // So the result of the qualified member access is `i32`, + // and the type of `v` is the type of `i32`, namely `type`. + // No substitution is performed in this step. + let v:! auto = u.T; +} +``` + +#### Examples + +``` +interface Container { + let Element:! type; +} +interface SliceableContainer { + extends Container; + let Slice:! Container where .Element = Self.(Container.Element); +} +// ❌ Qualified name lookup rewrites this facet type to +// `SliceableContainer where .(Container.Element) = .Self.(Container.Element)`. +// Constraint resolution rejects this because this rewrite forms a cycle. +fn Bad[T:! SliceableContainer where .Element = .Slice.Element](x: T.Element) {} +``` + +``` +interface Helper { + let D:! type; +} +interface Example { + let B:! type; + let C:! Helper where .D = B; +} +// ✅ `where .D = ...` by itself is fine. +// `T.C.D` is rewritten to `T.B`. +fn Allowed(T:! Example, x: T.C.D); +// ❌ But combined with another rewrite, creates an infinite loop. +// `.C.D` is rewritten to `.B`, resulting in `where .B = .B`, +// which causes an error during constraint resolution. +// Using `==` instead of `=` would make this constraint redundant, +// rather than it being an error. +fn Error(T:! Example where .B = .C.D, x: T.C.D); +``` + +``` +interface Allowed; +interface AllowedBase { + let A:! Allowed; +} +interface Allowed { + extends AllowedBase where .A = .Self; +} +// ✅ The final type of `x` is `T`. It is computed as follows: +// In `((T.A).A).A`, the inner `T.A` is rewritten to `T`, +// resulting in `((T).A).A`, which is then rewritten to +// `(T).A`, which is then rewritten to `T`. +fn F(T:! Allowed, x: ((T.A).A).A); +``` + +``` +interface MoveYsRight; +constraint ForwardDeclaredConstraint(X:! MoveYsRight); +interface MoveYsRight { + let X:! MoveYsRight; + // Means `Y:! MoveYsRight where .X = X.Y` + let Y:! ForwardDeclaredConstraint(X); +} +constraint ForwardDeclaredConstraint(X:! MoveYsRight) { + extends MoveYsRight where .X = X.Y; +} +// ✅ The final type of `x` is `T.X.Y.Y`. It is computed as follows: +// The type of `T` is `MoveYsRight`. +// The type of `T.Y` is determined as follows: +// - Qualified name lookup finds `MoveYsRight.Y`. +// - The declared type is `ForwardDeclaredConstraint(Self.X)`. +// - That is a named constraint, for which we perform substitution. +// Substituting `X = Self.X` gives the type +// `MoveYsRight where .X = Self.X.Y`. +// - Substituting `Self = T` gives the type +// `MoveYsRight where .X = T.X.Y`. +// The type of `T.Y.Y` is determined as follows: +// - Qualified name lookup finds `MoveYsRight.Y`. +// - The declared type is `ForwardDeclaredConstraint(Self.X)`. +// - That is a named constraint, for which we perform substitution. +// Substituting `X = Self.X` gives the type +// `MoveYsRight where .X = Self.X.Y`. +// - Substituting `Self = T.Y` with +// rewrite `.X = T.X.Y` gives the type +// `MoveYsRight where .X = T.Y.X.Y`, but +// `T.Y.X` is replaced by `T.X.Y`, giving +// `MoveYsRight where .X = T.X.Y.Y`. +// The type of `T.Y.Y.X` is determined as follows: +// - Qualified name lookup finds `MoveYsRight.X`. +// - The type of `T.Y.Y` says to rewrite that to `T.X.Y.Y`. +// - The result is `T.X.Y.Y`, of type `MoveYsRight`. +fn F4(T:! MoveYsRight, x: T.Y.Y.X); +``` + +#### Termination + +Each of the above steps performs at most one rewrite, and doesn't introduce any +new recursive type-checking steps, so should not introduce any new forms of +non-termination. Rewrite constraints thereby give us a deterministic, +terminating type canonicalization mechanism for associated constants: in `A.B`, +if the type of `A` specifies that `.B = C`, then `A.B` is replaced by `C`. +Equality of types constrained in this way is transitive. + +However, some existing forms of non-termination may remain, such as template +instantiation triggering another template instantiation. Such cases will need to +be detected and handled in some way, such as by a depth limit, but doing so +doesn't compromise the soundness of the type system. + +### Same type constraints + +A same-type constraint describes that two type expressions are known to evaluate +to the same value. Unlike a rewrite constraint, however, the two type +expressions are treated as distinct types when type-checking a generic that +refers to them. + +Same-type constraints are brought into scope, looked up, and resolved exactly as +if there were a `SameAs(U:! type)` interface and a `T == U` impl corresponded to +`T is SameAs(U)`, except that `==` is commutative. As such, it's not possible to +ask for a list of types that are the same as a given type, nor to ask whether +there exists a type that is the same as a given type and has some property. But +it is possible to ask whether two types are (non-transitively) the same. + +In order for same-type constraints to be useful, they must allow the two types +to be treated as actually being the same in some context. This can be +accomplished by the use of `==` constraints in an `impl`, such as in the +built-in implementation of `ImplicitAs`: + +``` +final impl forall [T:! type, U:! type where .Self == T] T as ImplicitAs(U) { + fn Convert[me: Self](other: U) -> U { ... } +} +``` + +It superficially seems like it would be convenient if such implementations were +made available implicitly – for example, by writing +`impl forall [T:! type] T as ImplicitAs(T)` – but in more complex examples that +turns out to be problematic. Consider: + +``` +interface CommonTypeWith(U:! type) { + let Result:! type; +} +final impl forall [T:! type] T as CommonTypeWith(T) where .Result = T {} + +fn F[T:! Potato, U:! Hashable where .Self == T](x: T, y: U) -> auto { + // What is T.CommonTypeWith(U).Result? Is it T or U? + return (if cond then x else y).Hash(); +} +``` + +With this proposal, `impl` validation for `T as CommonTypeWith(U)` fails: we +cannot pick a common type when given two distinct type expressions, even if we +know they evaluate to the same type, because we would not know which API the +result should have. + +#### Implementation of same-type `ImplicitAs` + +It is possible to implement the above `impl` of `ImplicitAs` directly in Carbon, +without a compiler builtin, by taking advantage of the built-in conversion +between `C where .A = X` and `C where .A == X`: + +``` +interface EqualConverter { + let T:! type; + fn Convert(t: T) -> Self; +} +fn EqualConvert[T:! type](t: T, U:! EqualConverter where .T = T) -> U { + return U.Convert(t); +} +impl forall [U:! type] U as EqualConverter where .T = U { + fn Convert(u: U) -> U { return u; } +} + +impl forall [T:! type, U:! type where .Self == T] T as ImplicitAs(U) { + fn Convert[self: Self]() -> U { return EqualConvert(self, U); } +} +``` + +The transition from `(T as ImplicitAs(U)).Convert`, where we know that `U == T`, +to `EqualConverter.Convert`, where we know that `.T = U`, allows a same-type +constraint to be used to perform a rewrite. + +This implementation is in use in Carbon Explorer. + +### Observe declarations + +Same-type constraints are non-transitive, just like other constraints such as +`ImplicitAs`. The developer can use an `observe` declaration to bring a new +same-type constraint into scope: + +``` +observe A == B == C; +``` + +notionally does much the same thing as + +``` +impl A as SameAs(C) { ... } +``` + +where the `impl` makes use of `A is SameAs(B)` and `B is SameAs(C)`. + +### Implementing an interface + +When implementing an interface, the `=` syntax must be used, and each associated +constant without a default must be explicitly assigned-to: + +``` +impl IntVec as Container where .Element = i32 { ... } +``` + +not + +``` +impl IntVec as Container where .Element == i32 { ... } +``` + +The latter would be treated as + +``` +impl IntVec as Container where IntVec.Element is SameAs(i32) { ... } +``` + +... which only checks the value of `Element` rather than specifying it. + +As in other contexts, `Self.Element` is rewritten to `i32` within the context of +the `impl`. + +### Ergonomics + +This proposal can be viewed as dividing type constraints into two categories: + +- Ergonomic, but quite restricted, constraints using `=`. +- Non-ergonomic, but fully general, constraints using `==`. + +In order for this to be an effective and ergonomic strategy, we require both +that the difference between these two kinds of constraints are readily +understood by developers, and that the more-ergonomic `=` constraints cover +sufficiently many scenarios that the developer seldom needs to write code to +request a conversion between types that are known to be the same. + +Ideally, we hope that `=` constraints will cover all real situations, and `==` +constraints will never need to be written, outside of the one `ImplicitAs` +implementation described above. If this turns out to be true in practice, we +should consider removing `==` syntax from the language. However, this will not +remove `==` semantics, both because of the behavior of `ImplicitAs` and because +`impl T as C where .A = B` constraints in an interface or named constraint give +rise to `==` semantics. + +### Implementation experience + +This proposal has been mostly implemented in Carbon Explorer, and appears to +behave as expected. However, Explorer does not yet support forward declarations +of interfaces and constraints, which means that some of the more interesting +cases can't be tested. + +## Rationale + +- [Language tools and ecosystem](/docs/project/goals.md#language-tools-and-ecosystem) + - This rule is easy to implement, and even a naive implementation should + be efficient. +- [Software and language evolution](/docs/project/goals.md#software-and-language-evolution) + - The status quo causes coincidentally equal types to have the same + interface. Under this proposal, the interface of a type is not affected + by coincidental equality, only by intentional assignment to an + associated type. +- [Code that is easy to read, understand, and write](/docs/project/goals.md#code-that-is-easy-to-read-understand-and-write) + - Using `=` rather than `==` in `impl`s is easier to read and write. + - The interface available on a type is more predictable in this proposal + than in the status quo ante, making code easier to understand. + +## Alternatives considered + +### Status quo + +The [Problem](#problem) section describes some concerns with the status quo +approach. This proposal aims to address those concerns. + +### Restrict constraints to allow computable type equality + +We could somehow restrict which constraints can be specified, in order to ensure +that our type equality rule is efficiently computable, perhaps even in a way +that permits a deterministic computation of a canonical type. The exact details +of how we would do this have not been worked out, but this might lead to a +coherent rule that has transitive type equality. However, this would not solve +the "equal types with different interfaces" problem. + +### Find a fully transitive approach to type equality + +This proposal makes type equality transitive, but still has non-transitive +"same-type" constraints describing types that are known to always eventual be +the same after substitution, but which are not treated as the same type value +while type-checking a generic. This proposal also imposes a directionality on +rewrites, a restriction to only rewrite at a depth of exactly one associated +constant, and a restriction that only one value can be specified as equal to a +given associated constant. We could seek a type equality rule that would avoid +having two different kinds of equality and would avoid some or all of the +restrictions placed on rewrite constraints. + +One candidate approach is to accept that the full generality of the equality +problem is +[hard](https://link.springer.com/content/pdf/10.1007/3-540-17220-3_7.pdf), but +attempt to solve it anyway. Because of the close correspondence between Turing +machines and string rewriting systems, this results in a type system with no +upper bound on its runtime. We consider this unacceptable for Carbon, but it is +the approach taken by some other modern languages: + +- Swift has an + [undecidable type system](https://forums.swift.org/t/swift-type-checking-is-undecidable/39024) + due to this, and handles the problem by placing a limit on the complexity of + cases they will accept, but that seems unsuited to Carbon's goals, as the + rule cannot be readily understood by a developer nor effectively worked + around. +- Rust also has an + [undecidable type system](https://sdleffler.github.io/RustTypeSystemTuringComplete/) + (content warning: contains many instances of an obscene word as part of a + programming language name) for the same reason, and similarly has a + recursion limit. + +See also [Typing is Hard](https://3fx.ch/typing-is-hard.html), which lists +similar information for a variety of other languages. Note that most languages +listed have an undecidable type system. + +Another candidate approach is to find a way to reduce the inputs to the +type-checking step as a set of non-quantified equalities. The resulting system +of equalities can then be +[solved efficiently](https://dl.acm.org/doi/pdf/10.1145/322186.322198) to +determine whether two types are known to be the same. This approach appears +sufficient to cope with locally declared constraints, but when constraints +appear within interfaces, they can give rise to an infinite family of +equalities, and picking any finite subset risks the result being non-transitive +in general, if a different subset of equalities might be considered in a +different type-checking context. + +### Different syntax for rewrite constraint + +We could use a different syntax, such as `~>`, for rewrite constraints. This +would have the advantage of not using assignment for an operation that's not +always doing something that developers will think of as assignment. It also +avoids hard-to read code in cases where a binding has an initializer: + +``` +// This proposal. +let T:! Add where .Result = i32 = i32; +// Alternative syntax. +let T:! Add where .Result ~> i32 = i32; +``` + +However, it seems valuable to use only a single syntax for specifying these +constraints, and equality is the appropriate mental model most of the time. +Moreover, in an `impl` declaration, we really are assigning to the associated +constant in the sense of setting a value, rather than merely constraining a +value. + +This decision should be revisited if a superior syntax that avoids the above +visual ambiguity is suggested. + +### Different syntax for same-type constraint + +We considered various options for the spelling of a same-type constraint: + +- `where A == B` +- `where A ~ B` +- some other operator symbol +- `where A is SameType(B)`, or some other constraint name + +The advantage of `==` is that it has a lot of desirable implications: it's +familiar, symmetric, and connotes the left and right operand being the same. +However, it also might be taken as suggesting that the `==` overloaded operator +is used to determine equality. A different spelling would also indicate that +this is an unusual constraint, and would be easier to search for. Concerns were +also raised that `==` might suggest transitivity if taken to mean "same value" +rather than something like the Carbon `==` binary operator, and the transitivity +of `==` constraints is not automatic. + +The `~` operator is currently not in use as a binary operator. However, that +makes it quite valuable syntactic real estate, and it may see little use in this +context in practice. A longer operator could be used, but any choice other than +`==` will present an unfamiliarity barrier, and longer operators will be harder +to remember. + +A named constraint is appealing, but this operator does not behave like a named +constraint in that it is automatically symmetrized, not implementable, and +especially in its interactions with `observe` declarations. It is unclear how +`observe` declarations would be written with a named `SameType` constraint: + +``` +// Maybe something like this special-case and verbose syntax? +observe A is SameType(B) and B is SameType(C) => A is SameType(C); +``` + +The fundamental connection between same-type constraints and `observe` suggests +that dedicated syntax is justified. + +For now, we use `==`, despite it having a different meaning in the context of a +constraint than in expression contexts. This is analogous to how `=` and `and` +have different meanings in this context from in expressions, so the behavior +divergence is at least consistent. It's also not clear at this stage how much +usage same-type constraints will have, compared to rewrite constraints, so it's +hard to judge the other arguments in favor of and against other operators. It +would be reasonable to revisit this decision when we have more usage +information. + +### Required ordering for rewrites + +We considered a rule where a rewrite would need to be specified in a constraint +before the associated constant is used. For example: + +``` +interface C { + let T:! type; + let U:! type; +} +// Could reject, `.U` referenced before rewrite is defined. +fn G[A:! C where .T = .U and .U = i32]() {} +``` + +This would remove the need to perform constraint resolution. Instead, we could +apply rewrites as we form the constraint, and constraints formed in this way +would always satisfy the property that they don't refer to associated constants +that have a rewrite. + +It may also be less surprising to use this rule. Because Carbon uses top-down +processing for type-checking in general, it may be unexpected that a rewrite +rule would rewrite an earlier occurrence of its rewrite target, even though this +happens in practice at a later point in time, during constraint resolution. + +However, such an approach would not give a satisfactory outcome for cases such +as: + +``` +constraint X { + extends C where .T = .U; +} +constraint Y { + extends C where .U = i32; +} +// Desired behavior is that `.T` and `.U` both rewrite to `i32`. +fn G[A:! X & Y]() -> A.T { return 0; } +``` + +Performing earlier-appearing rewrites in later constraints and additionally +performing a constraint resolution step to apply rewrites throughout the +constraint seems to give better outcomes for the examples we considered. + +### Multi-constraint `where` clauses + +Instead of treating `A where B and C` as `(A where B) where C`, we could model +it as `(A where B) & (A where C)`. + +Specifically, given + +``` +interface C { + let T:! type; + let U:! type; +} +``` + +this would treat `C where .T = A and .U = B` as +`(C where .T = A) & (C where .U = B)`, where the type of `.Self` would be `C` in +both clauses, with both constraints applied "simultaneously" to `C`. This would +mean that the order of clauses doesn’t matter. + +However, if we do this, then neither right-hand side is rewritten by name +lookup, and thus we would reject cases such as +`C where .T = i32 and .U = .T.(Add.Result)`. This case seems reasonable, and we +would prefer to accept it. + +Therefore, the rule we select is that, as soon as a rewrite is declared, it is +in scope, and later references to that associated constant refer to the +rewritten value with its rewritten type. We still reject cases like +`C where .U = .T.(Add.Result) and .T = i32`, because we do not know that +`T is Add` from the declared type of `C.T`. + +The same rule applies to `is` constraints: we accept +`C where .T is Add and .U = .T.(Add.Result)`, but reject +`C where .U = .T.(Add.Result) and .T is Add`. + +### Rewrite constraints in `impl as` constraints + +A rewrite constraint can appear indirectly in an `impl as` constraint in an +interface or named constraint: + +``` +interface A { + let T:! type; +} +constraint AInt { + extends A where .T = i32; +} +interface B { + // Or, `impl as A where .T = i32;` + impl as AInt; +} +``` + +When this happens, the rewrite constraint does not result in rewrites being +performed when the associated constant is mentioned in a member access into that +interface or named constraint: + +``` +// Return type is not rewritten to `i32`, +// but there is still a same-type constraint. +fn F(T:! B) -> T.(A.T) { + // OK, `i32` implicitly converts to `T.(A.T)`. + return 0 as i32; +} +``` + +This may be surprising: `B` says that its `.(A.T)` has a rewrite to `i32`, but +that rewrite is not performed. In contrast, rewrites in an `extends` constraint +do become part of the enclosing interface or named constraint. + +``` +interface C { + extends AInt; +} +// Return type is rewritten to `i32`. +fn G(T:! C) -> T.T; +``` + +Something similar happens with `impl T as` constraints. The following two +interfaces `A1` and `A2` are not equivalent: + +``` +interface A1 { + let X:! Add where .Result = i32; +} +constraint AddToi32 { + extends Add where .Result = i32; +} +interface A2 { + let X:! Add; + impl X as AddToi32; +} +``` + +In the former case, references to `A.X.Result` are rewritten to `i32`, and in +the latter case, they are not, because the rewrite is not part of the type of +`X`. + +The general principle here is that rewrites are part of the declared type of a +name, and can't be changed after the fact by another declaration. For the +`impl T as` case, this behavior is also a necessary part of the termination +argument. If rewrites from `impl T as` constraints were used, it would be +possible to form a rewrite cycle: + +``` +interface C { let X:! type; } +interface A { + let U:! C; + let T:! C; +} +interface B1 { + extends A; + impl T as C where .X = U.X; +} +interface B2 { + extends A; + impl U as C where .X = T.X; +} +// `V.T.X` ~> `V.U.X` ~> `V.T.X` ~> ... +fn F(V:! B1 & B2) -> V.T.X; +``` + +We could allow the specific case of `impl as` (not `impl T as`) to introduce +rewrites. The advantage of this change is that this behavior may be less +surprising in some cases. However, this would mean that an `impl as` sometimes +changes the interface of the enclosing constraint, and behaves inconsistently +from an `impl T as` constraint, so this proposal does not adopt that behavior. + +We could allow `impl T as` to introduce rewrites for `T` in general, but we +would need to find some way to fix the resulting holes in the termination +argument. + +To minimize the scope of confusion, we currently disallow rewrites from +appearing _syntactically_ within an `impl as` or `impl T as` constraint. We +could allow these constructs. However, a `.T = V` constraint would be equivalent +to a `.T == V` constraint, and the latter more clearly expresses the resulting +semantics, so disallowing the misleading form seems preferable.