-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Ordered quantification (foreach loops and sequence comprehensions - take 2) #10
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very nicely written proposal. I'm still concerned about the notion of ordering. In particular, the way the proposal is designed makes it so that in some cases you can't write seq x <- s | …
and instead must write seq x | x in s && …
. But in other cases, writing seq x <- s | …
is possible and means something different from seq s | x in s && …
:
seq x <- {1,2,3} // rejected
seq x | x in {1,2,3} // ⇒ [1,2,3]
seq x <- [1,1,1] // ⇒ [1,1,1]
seq x | x in [1,1,1] // ⇒ [1]
I would much prefer to ban sequence comprehensions without a <-
clause.
0005-ordered-quantification.md
Outdated
the bindings will be ordered according to the *natural ordering* of the type `T`. Not all Dafny types will | ||
have such a natural ordering defined, but at a minimum `int` and `real`, and any subset type or newtype based on them, | ||
will use the natural mathematical ordering. | ||
`x: int | x in {2, 5, -4}`, for example, would bind `x` to `-4`, `2`, and `5` in that order. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In which case is the natural ordering needed? I'm surprised that we still need an ordering at all: for the example above we can write seq x: real <- sorted(mySet) | P(x)
, right? And for this one the libraries can provide a conversion function from sets?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, how will this work with types that don't have comparisons? I'm assuming that types that don't support ==
also don't support comparisons, and yet this feature will let me compare arbitrary values: var aLeB := (seq x | x in {a, b}) == [a, b]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, how will this work with types that don't have comparisons? I'm assuming that types that don't support
==
also don't support comparisons, and yet this feature will let me compare arbitrary values:var aLeB := (seq x | x in {a, b}) == [a, b]
This is a very good point - I've been assuming in all my examples that the types were equality-supporting to focus on defining the semantics. That means in some cases you can't even create a compiled {a, b}
value in the first place. I'll call that out more explicitly and watch out for under-specified examples.
But yes, not all types will have natural orderings (any reference type for example) so some types will require <-
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Your proposal makes sense, but it has caveats.
x: nat
says that x is in [0, 1, 2, 3, 4, ....] and then the range says how to filter this infinite range, so it's natural to think that the result be sorted if the underlying type "int" is sorted. In practice, we would compile to something different: We would infer what to enumerate x from (the range), and sort it. But this is a hidden performance cost based on heuristics.
We might as well not accept this "implicit sorting" for sequences, and immediately suggest that the enumeration be part of the x: nat
syntax, and enable non-deterministic set orderings.
0005-ordered-quantification.md
Outdated
The ordering of bindings is non-deterministic unless `C` is a sequence. | ||
If `C` is a `multiset`, multiple bindings with the same value of `x` will be created, but their ordering is still non-deterministic. | ||
|
||
The `<-` symbol would be read as "from", so a statement like `foreach x <- C { ... }` would be read as "for each x from C, ...". Note that `<-` is not an independent operator and is intentionally distinct from the `in` boolean operator. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
recycling in
might be fine?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My concern is only that the difference between x in C
as a quantified variable declaration and x in C
as a boolean expression will be pretty subtle, when C is a multiset
or seq
: if C is [1, 1, 1]
, for example, we'd have:
assert (seq x in C) == [1, 1, 1]
assert (seq x | x in C) == [1]
That may be less of a concern if I drop the natural ordering idea that allows the second form, so we'll see how this plays out in the next revision. :)
0005-ordered-quantification.md
Outdated
3. `<CasePatternLocal> <- C` | ||
|
||
This is a generalization of the previous case that supports pattern matching, as in variable declarations and match statements. | ||
It allows destructuring datatypes and tuples, as in `(k, v) <- myMap.Items`, and filtering, as in `Some(x) <- mySetOfOptionals`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't have filtering assignments anywhere else, maybe this is best left to a later stage.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Completely agree it doesn't have to be in the first set of changes - I've got delayed until phase 4 in the implementation plan. I strongly want it eventually though, if nothing else to support quantifying over the key-value pairs in a map succinctly and efficiently, which is a very common pattern.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Destructuring lets are already supported; I'm not worried about them. It's filtering ones that we don't have.
So (x, y) <- [(1, 2)]
is good; Some(x) <- [None]
is what I'm worried about.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see it as a natural extension of the semantics of match statements, which is like filtering as well. These two loops are equivalent, for example:
foreach Some(x) <- [None, Some(42), None] {
print x;
}
foreach o <- [None, Some(42), None] {
match o {
case Some(x) => print x;
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I love it and I want that feature; but I want it in more cases (including if var Some (x) := opt then …
), and I think it distracts from the main aspect of this proposal without adding expressivity.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well awesome then! :) I mainly wanted to enforce that there was a good option for the map iteration case, and it was a slippery slope to allow filtering as well.
I could make it yet another Future Possibility if that would make the proposal tighter?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes please, that sounds great :)
0005-ordered-quantification.md
Outdated
It allows destructuring datatypes and tuples, as in `(k, v) <- myMap.Items`, and filtering, as in `Some(x) <- mySetOfOptionals`. | ||
|
||
A single quantifier domain may include multiple such declarations separated by commas, | ||
in which case the orderings described for each clause take lexicographic precedence. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does that mean that x <- [2, 1], y <- [4, 3]
is the same as x <- [1, 2], y <- [3, 4]
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nope, seq x <- [2, 1], y <- [4, 3] :: (x, y)
would be [(2, 4), (2, 3), (1, 4), (1, 3)]
. This was a confusing use of "lexicographic" and I'll rewrite this.
0005-ordered-quantification.md
Outdated
The quantifier domain is allowed to be potentially infinite if the loop is used in a compiled context and an explicit `decreases` clause is provided. | ||
`decreases *` is permitted, in which the `foreach` loop may never terminate. Any other `decreases` clause can be provided | ||
to ensure the loop terminates even if the domain is potentially infinite. For example, the following (very slow) example collects | ||
five arbitrary primes (assuming that Dafny can figure out how to enumerate the `allPrimes` infinite set): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would also need to know that it's not empty, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Worse, it would have to know there are at least five elements. Fortunately the code is sound, but my statement about how it behaves is wrong - it should say that it collects AT MOST five arbitrary primes. I'll fix that.
0005-ordered-quantification.md
Outdated
|
||
Since sequence comprehensions are expressions rather than statements, they carry an additional restriction | ||
when used in functions: their ordering must be fully deterministic. If `s` is a `set<int>`, for example, | ||
`seq x <- s` would be rejected in specification contexts, whereas `seq x | x in s` would be allowed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why have this distinction?
0005-ordered-quantification.md
Outdated
As mentioned in the guide-level explanation, `foreach` loops and sequence comprehensions are both able to | ||
borrow concepts and implementation substantially from other features. Parsing, resolving, verifying, and compiling | ||
quantifier domains is already a mature aspect of the Dafny implementation. The most significant implementation burden | ||
is ensuring that enumeration ordering is threaded through, and deterministic in contexts where it needs to be. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In which contexts does it not need to be deterministic?
0005-ordered-quantification.md
Outdated
The proposed sequence comprehension expression allows more | ||
logic that ultimately produces a sequence of values to be expressed as a value rather than a statement. | ||
Just producing a sequence of the values in the set above, sorted by the natural ordering of `int` values, | ||
is simple using a sequence comprehension expression: `seq i: int | i in s`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure that it's worth having this, since users can simply write SetToSeq(s)
. And, once you remove this (keeping only <-
), I think you don't need the notion of quantification ordering at all? I expanded on this below.
0005-ordered-quantification.md
Outdated
and I will label the `x: nat` and `y: nat` sections as *quantifier variable declarations*. | ||
We can augment this existing syntax with a notion of ordering, and allow quantified variables to bind to duplicate values. The key points are: | ||
|
||
* Any quantifier domain defines a potentially-infinite, partially-ordered set of *quantifier bindings*. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure you can order all types, see below
Co-authored-by: Clément Pit-Claudel <[email protected]>
Shoot, I've incorrectly stated in a few places that the ordering has to be deterministic in specification contexts, where I should have said when used in a compiled expression. It's the same reason that let-such-that expressions can only be compiled if they have unique solutions, so that they behave deterministically at runtime. I'll fix that and also call this out more explicitly where relevant. Independently of that, though, I don't think this kind of restriction is unreasonable, since there's already plenty of precedent in Dafny.
The main reason I want to allow omitting Would you be comfortable with the incremental implementation plan I'm proposing, where the first few iterations would not support plain quantifier variables without |
I don't think we would have to be very clever. If we had syntax for that sequence (say
But step 1 claims that there is a need for ordered quantification, and I don't understand that part. |
…fcs into ordered-quantification
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great job figuring out that the core of everything you are suggesting is the "ordered" enumeration part. A few comments there and there.
// After: | ||
method AllPairs(s: seq<nat>) returns (result: seq<(nat, nat)>) { | ||
result := []; | ||
foreach left <- s, right <- s { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm going to look stubborn with pointing back at your original idea 😁 , but anyway, what about the syntax:
foreach left <- s && right <- s {
}
? Other booleans operators like || or ! would not be permitted except possibly in ghost contexts.
With this generalization, you could easily add conditions, such as:
left <- s && right <- s && left < right
or
left <- s && left % 2 == 0 && right <- s && right % 2 == 0
or
left <- s && left !in other && right <- s
I know sometimes we use the word "heuristic" to describe how to "infer" some knowledge, but here it would be plainly deterministic, straightforward and well documented.
it would even work great for integers, as soon as their range is next to their declaration:
left: int && 0 <= left < max && ...
or you could reuse the current for syntax:
left <- 0 to max
Alternatively, if you really prefer the comma, you could enable both enumeration and variable declaration at the same time.
left <- s, right <- s, left < right
or
left <- s, left % 2 == 0, right <- s, right % 2 == 0
or
left <- s, left !in other, right <- s
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As a quick TL;DR as I revise the proposal, your last version is closest to what I'm thinking now. Those examples would be as follows instead:
left <- s, right <- s | left < right
or
left <- s | left % 2 == 0, right <- s | right % 2 == 0
or
left <- s | left !in other, right <- s
i.e. I'm keeping the hard distinction of x <- c
not being a boolean operator, but a syntax attached directly to declaring the quantified variable. But each quantified variable can have an optional range attached with |
. I realized that's still backwards compatible, because existing x1, x2, ..., xN | <Range>
quantifier domains will just be reinterpreted as having a range attached to the final xN
variable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very interesting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A note: I like the left <- 0 to max
syntax, if we would ever have a use for it. In fact, I like it better than left <- 0 .. max
. If we were to introduce this at some point, then left <- max downto 0
is also a natural possibility, since for for
loop supports similar syntax.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, it's nice that it wouldn't clash with the existing s[i..j]
syntax, and that there is already precedent for how to handle the reverse order.
If we do that, is it really worth continuing to support both for x := 0 to 10
and foreach x <- 0 to 10
?
0005-ordered-quantification.md
Outdated
the bindings will be ordered according to the *natural ordering* of the type `T`. Not all Dafny types will | ||
have such a natural ordering defined, but at a minimum `int` and `real`, and any subset type or newtype based on them, | ||
will use the natural mathematical ordering. | ||
`x: int | x in {2, 5, -4}`, for example, would bind `x` to `-4`, `2`, and `5` in that order. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Your proposal makes sense, but it has caveats.
x: nat
says that x is in [0, 1, 2, 3, 4, ....] and then the range says how to filter this infinite range, so it's natural to think that the result be sorted if the underlying type "int" is sorted. In practice, we would compile to something different: We would infer what to enumerate x from (the range), and sort it. But this is a hidden performance cost based on heuristics.
We might as well not accept this "implicit sorting" for sequences, and immediately suggest that the enumeration be part of the x: nat
syntax, and enable non-deterministic set orderings.
0005-ordered-quantification.md
Outdated
|
||
```dafny | ||
var myDoubleDeckerSeq: seq<seq<int>> := ...; | ||
foreach x <- myDoubleDeckerSeq, y <- x | y != 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if you want to filter on x
before you extract y
from x
? What would be the syntax like? Can you write down a proposal?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See latest commit
```dafny | ||
// Filtering to optional values | ||
var s: seq<Option<T>> := ...; | ||
var filtered: seq<T> := seq o <- s | o.Some? :: o.value; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This reminds me that I would love to be able to create a comprehension that would create an Option<T>
by "enumerating other options", like Scala enables. But for now, we have the elephant operator, which It think is strictly better.
Yet, I love your proposal of collect
at the end of this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup, considering failure-compatible types to be collection values compatible with <-
is another future possibility. But I realized that supporting pattern matching makes that unnecessary because you can just write Some(x) <- optionals
instead.
0005-ordered-quantification.md
Outdated
Note that the existing `seq(size, i => i + 1)` syntax is inconsistently-named in the Dafny reference manual, | ||
but we refer to it here as a "sequence construction" expression, to disambiguate it from the proposed sequence comprehension feature. | ||
Sequence comprehensions are strictly more expressive, as any construction `seq(size, f)` can be rewritten as | ||
`seq i | 0 <= i < size :: f(i)`. They also avoid the common trap of forgetting to explicitly attach the `requires 0 <= i < size` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I love this remark about the requires.
programmers how their code will behave at runtime, and hence these new features come with | ||
the same downside. Dafny is free to implement an expression like `seq x <- a | x in b` by either enumerating the values | ||
of `a` and skipping those not in `b`, or by sorting the elements of `b` according to the ordering in `a`. This flexibility | ||
is excellent for future optimizations, but makes it harder to reason about runtime performance. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 about reasoning about runtime performance.
I really read x <- a | x in b
as "enumerate values in a, and skip those not in b". Sorting the elements of b
according to the ordering in a
and skipping the elements of a
should be something explicit, such as x <- sorted(b, a) | x in a
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I agree.
no additional filtering, and with `C` directly providing the enumeration and not just the ordering of values. | ||
This is certainly a well-formed and sound construct, but far less powerful than the proposed version, especially | ||
when sequence comprehensions are excluded. Offering `foreach` loops along with sequence comprehensions means we can | ||
recommend the latter where possible, with the former as a fallback that is inevitably more complicated to verify. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please discuss my alternative at least here?
The one that says "add '<-' as a synonym of 'in' in ranges, but such that they would be used for specifying the enumeration (i.e. no more conflicting heuristics needed)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Definitely! I might as well summarize the best version of the first RFC here. :) I'm also adding Clement's version where <-
is required everywhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACTUALLY @cpitclaudel has given me the kernel of an idea that allows us to chain filtering as you are asking for, while preserving the purity of boolean expressions, and still turns out to be backwards compatible with existing quantifier domains! I am tweaking the proposal now and will push a revision to see if you like it as much as I do. :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You got me very curious now.
0005-ordered-quantification.md
Outdated
|
||
Along similar lines, will users frequently want to customize the ordering of their quantifications without | ||
creating explicit collection values? To produce the sequence `[4, 3, 2, 1, 0]`, for example, | ||
we could consider supporting something like `seq i by > | 0 <= i < 5`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think sequences are both increasing and decreasing in everyday's usage, so having a way to support both would be better.
Alternatively to your suggestion:
seq i: int | 0 <= i < 5 :: 4 - i
would not require any changeseq i: int | 5 > i >= 4 :: i
might seem a bit WTF.seq i: int | i <- 4 downto 0 :: i
to constrain the notion of ordering not in the sequence comprehension, but in the original iterable sequence.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
seq i: int | 0 <= i < 5 :: 4 - i
would not require any change
Yup, that's a nice application of the generic comprehension syntax!
seq i: int | 5 > i >= 4 :: i
might seem a bit WTF.
More than just a bit WTF methinks :D That's another case where I don't think we should mess with deeply established semantics of boolean operators. Plus I know @RustanLeino prefers the idiom of always using <
and <=
so values are ordered naturally left-to-right in expressions, and I tend to like that too.
seq i: int | i <- 4 downto 0 :: i
THAT'S not a bad idea, since there's precedent in for i := 4 downto 0 { .. }
loops...
0005-ordered-quantification.md
Outdated
|
||
Since sequence comprehensions are expressions rather than statements, they carry an additional restriction | ||
when compiled: their ordering must be fully deterministic. If `s` is a `set<int>`, for example, | ||
`seq x <- s` would be rejected in compiled code, whereas `seq x | x in s` would be allowed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I would buy into that. Deterministic means that the result will always be the same if the same set is given. The compilers should allow traversing a set in a deterministic order if they are equal anyway, even if it just means comparing references of non-comparable types.
So seq x <- s
could be accepted in compiled code.
I suspect that, most of the time, however, users will not want to convert sets to sequences, but always use sequences, and use set in ghost contexts.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I would buy into that. Deterministic means that the result will always be the same if the same set is given. The compilers should allow traversing a set in a deterministic order if they are equal anyway, even if it just means comparing references of non-comparable types. So
seq x <- s
could be accepted in compiled code.
This can be done in some cases but definitely makes the runtime more complicated: most HashSet implementations might give you a different iteration order for {1, 2, 3}
vs. {3, 2, 1}
. I tried to explain how that could be addressed in many cases with a more specialized implementation in this section in the original RFC, but didn't go back to clean that up since I've dropped the requirement in this version. I can take another stab at it here if it would help though.
I suspect that, most of the time, however, users will not want to convert sets to sequences, but always use sequences, and use set in ghost contexts.
Well now that it's public (🎉!), allow me to point to the natively-implemented ComputeSetToOrderedSequence
method that the ESDK uses in several places to deterministically serialize the keys of a map!
0005-ordered-quantification.md
Outdated
collect(*) x: T | P(x) :: Q(x) == (product) | ||
collect(<) x: T | P(x) :: Q(x) == (minimum) | ||
collect(Averager) x: T | P(x) :: Q(x) == (average) | ||
collect(First(n)) x: T | P(x) :: Q(x) == Take(seq x: T | P(x) :: Q(x), n) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wow this blows my mind I love it !
0005-ordered-quantification.md
Outdated
1. `x [: T]` | ||
|
||
In the existing syntax for quantifier variables (where the type `T` may be explicitly provided or omitted and inferred), | ||
the bindings will be ordered according to the *natural ordering* of the type `T`. Not all Dafny types will |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My immediate reaction to the difference between in
and <-
is to think of in
as more "mathematical" and <-
as more "implementation-focused" (given how those symbols have been used in other contexts before). Because of that, it seems more natural to me to have <-
impose more ordering, and in
impose less. What if in
produced elements in a non-deterministic order, with no duplicates, and <-
always used a collection-defined ordering with the potential for duplicates? That would eliminate the need for "natural ordering" altogether.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have a look at the latest commit, I went with almost exactly that option. :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🎉
|
||
This mirrors the `Collector` concept from the Java streaming APIs, and ideally the shape of the | ||
parameter to `collect` expressions would define similar operations for merging intermediate results | ||
to leave the door open for parallel or even distributed computation. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would collect
require a "zero" or "identity" element? Most such formalisms do, I think (such as Monoid
in Haskell). In Dafny, we could avoid it by requiring T
to be non-empty, though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had that thought too. Formalizing the parameter to collect
definitely needs a lot more thought and I didn't attempt to make this too concrete yet - it might work better to parameterize with a type parameter that implements a particular "type characteristic".
This is a fair point (and I'd considered the option of some kind of interval sequence syntax in earlier versions too). I decided I'm fine with replacing sequence constructions being out of scope for now, and requiring
In the latest version I've dropped the idea of a natural ordering for types, but still kept the idea of ordered quantification. Let me know if it resolves your concerns! |
0005-ordered-quantification.md
Outdated
|
||
2. `x [: T] <- C` | ||
Assuming that we have `posNat == iset n: nat | 0 <= n`, the set comprehension above could also then be expressed as: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Aren't subset types just iset
? :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True! In general the quantification x: T
is equivalent to x <- (iset x: T)
. Since Dafny supports potentially infinite collections, types and collections are very similar. But collections are strictly more expressive since we don't have a fully dependent type system; you can't express "sequences of length n" as a subset type if n is a dynamic value, but you can declare a dynamic, local iset
collection for that. Part of the reason I was fine with dropping the notion of default orderings of types is that we could still support infinite ordered collections in the future, and hence achieve the same thing with x <- (collection of all T's ordered by >)
.
This may not be the strongest possible example to motivate quantifier variable domains, since it could be done with a subset type instead, and in fact I'd prefer that style in this case. A more complicated example where the domain depended on dynamic values and couldn't be a subset type would be better.
@@ -360,16 +383,10 @@ assert (seq x <- b | x in a) == [3, 1]; | |||
|
|||
Since sequence comprehensions are expressions rather than statements, they carry an additional restriction | |||
when compiled: their ordering must be fully deterministic. If `s` is a `set<int>`, for example, | |||
`seq x <- s` would be rejected in compiled code, whereas `seq x | x in s` would be allowed. | |||
`seq x <- s` would be rejected in compiled code. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
`seq x <- s` would be rejected in compiled code. | |
`seq x <- s` would be rejected in compiled code if the compiler does not have a way to enumerate `s` in a deterministic way. |
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very good point, I had a lot more detail on this in the first RFC that is still valid. Let me pull some of it in here.
|
||
1. Membership: `forall x :: P(x) <==> x in S` | ||
1. Membership: `forall x :: x in C && P(x) ==> Q(x) in S` | ||
2. Cardinality: `|S| <= |C|` | ||
3. Ordering: `forall i, j | 0 <= i < j < |S| :: | ||
exists i', j' | 0 <= i' < j' < |C| :: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you strengthen this by adding
exists i', j' | 0 <= i' < j' < |C| :: | |
exists i', j' | 0 <= i' < j' < |C| && i' <= i && j' <= j :: |
2. Cardinality: `|S| <= |C|` | ||
3. Ordering: `forall i, j | 0 <= i < j < |S| :: | ||
exists i', j' | 0 <= i' < j' < |C| :: | ||
C[i'] == S[i] && C[j'] == S[j]` | ||
Q(C[i']) == S[i] && Q(C[j']) == S[j]` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you have a new construct that says:
countIf(x <- C, P(x))
that is always between |0| and |C|, that increases with C,
that counts the number of P(x) that hold for all the x
in C
?
That way, you wouldn't need an existential inside the forall:
Cardinality: `|S| == countIf(x <- C, Q(x))`
Ordering:
forall i <
Q(C[i']) == S[i]
forall i | 0 <= i < |S| ::
var i' := countIf(x <- C[0..i], P(x));
Q(C[i']) == S[i]
Your axioms could be deduced from that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Possibly! I only wanted to propose an encoding that seemed feasible enough, I don't claim to have found the optimal encoding yet nor am I trying to nail that before starting on implementation.
## Range expressions | ||
|
||
This would interpret the existing `i..j` syntax, | ||
currently only usable in sequence or array indexing expressions such as `s[i..j]`, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you envision it would be possible to say something like this:
var x := i..j;
s[x]
and obtain a slice of s
?
I personally was thinking that i..j
would be available only in [i..j]
or after an àrrow <- i..j
.
I do love this syntax, but what about reverse iteration?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes I was thinking it would be possible, just so we're not introducing a new type of value or special-casing a lot. I also like the syntax but have no good suggestion for the reverse order, and that's one reason this idea is safely quarantined in Future Possibilities for now. :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like it!
The one thing I'm not clear about is what happens with sequence comprehensions when the s
in x <- s
is a set. Is such a sequence comprehension allowed only in ghost contexts? Or is there something that, at run time, will figure out a deterministic ordering?
More generally, what types of e
are allowed in x <- e
?
I think the current proposal does not give a way to deterministically step through the elements of a set. Is that right? (I'm not saying this is necessary. Just checking my understanding.)
```dafny | ||
// Before: | ||
method AllPairs(s: seq<nat>) returns (result: seq<(nat, nat)>) { | ||
for i := 0 to |s| { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another whoops: missing initialization of result
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To which I say "whoops" :)
I legitimately toyed with figuring out a way for this repo to test code snippets as the main dafny
repo is.
var ss := s; | ||
while ss != {} | ||
{ | ||
var i: int :| i in ss; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The : int
can be omitted, since it will be inferred.
// After: | ||
method AllPairs(s: seq<nat>) returns (result: seq<(nat, nat)>) { | ||
result := []; | ||
foreach left <- s, right <- s { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A note: I like the left <- 0 to max
syntax, if we would ever have a use for it. In fact, I like it better than left <- 0 .. max
. If we were to introduce this at some point, then left <- max downto 0
is also a natural possibility, since for for
loop supports similar syntax.
If `C` is a `map`, the bound values will be the keys of the `map`, in order to be consistent with the meaning of `x in C`; | ||
`map.Items` can be used instead to bind key-value pairs. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good to keep the two consistent. But I have also thought that x in m
ought to mean x in m.Items
, rather than x in m.Keys
. Maybe we'll change both in m
and <- m
in the future.
0005-ordered-quantification.md
Outdated
If `C` is a `map`, the bound values will be the keys of the `map`, in order to be consistent with the meaning of `x in C`; | ||
`map.Items` can be used instead to bind key-value pairs. | ||
|
||
Assuming that we have `posNat == iset n: nat | 0 <= n`, the set comprehension above could also then be expressed as: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the mathematics involved in this example, I think you intended 0 < n
.
This would allow destructuring datatypes and tuples, as in `(k, v) <- myMap.Items`. | ||
It could also support filtering, as in `Some(x) <- mySetOfOptionals`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like this and I think we should do it (after the more basic functionality has been implemented and loved). Just a note: it will complicate the lookahead disambiguation that will need to be written for Coco (see my comments regarding the ","
above).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I very much like this, too.
the necessary interface. This is also likely the best way to provide a better user experience | ||
for looping or quantifying over the values produced by an `iterator` type. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
## Range expressions | ||
|
||
This would interpret the existing `i..j` syntax, | ||
currently only usable in sequence or array indexing expressions such as `s[i..j]`, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unless range expressions would be allowed everywhere and have their own type (which I resist at the moment), then ranges expressions would be allowed only to the right of <-
. If so, I like the syntax @MikaelMayer suggested above, namely x <- i to j
, which looks like in a for
loop.
There would have similar semantics to sequence comprehensions, | ||
where the multiplicity of results is significant but not the ordering. | ||
This feature would be very cheap to implement once sequence comprehensions are implemented. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
:)
The translation of `foreach` loops should be reducible to sequence comprehensions | ||
and other existing features for loops in general, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sequence comprehensions are more difficult to formalize for the verifier than foreach
loops, so I don't suggest formalizing foreach
in terms of sequence comprehensions. I'm thinking that the verifier would handle foreach
loops like this:
- Check the loop invariant initially (as usual for a loop).
- Inside the loop, play havoc with the loop assignment targets (as usual for a loop) and the bound variables (as is also done in the
for
loop). Then, assume the loop invariant and assume the condition that you get by replacing every<-
with anin
in theforeach
loop header. Then, check the body and check that the body maintains the loop invariant. - After the loop, assume the loop invariant (as usual for a loop). Here's what I don't know what to do with: one also needs to assume some condition that's analogous to "the negation of the loop guard". I don't know what that condition is here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I thought translating to sequence comprehensions was appealing because it at least solves the last point: the invariants reduce to those of for
loops, so the loop guard is just an expression of completing the enumeration. I may still try this encoding first, especially since I plan to have sequence comprehensions committed and stable before moving to foreach
loops.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Excellent writing and exciting proposal!
} | ||
``` | ||
|
||
The feature supports much more sophisticated uses as well, however, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm wary of adding syntax for features whose use-cases are also covered by multiple foreach loops and higher-order method calls.
I think it's important that if we add support for the 'sophisticated uses' syntax, that it's exactly the same syntax as used in comprehensions so it doesn't require any additional documentation or implementation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is one of the big motivating styles: https://github.com/dafny-lang/rfcs/pull/10/files#diff-e3e01b715e6d0ac6dce19064a36b244c495633e9760a43233eaf22cdb18849f8R376 Supporting multiple quantified variables in one expression is necessary to support that flattening pattern, as you get a different value if you use nested comprehensions instead.
Totally agree that keeping the syntax and semantics consistent is very important. Hence the strategy of extending the existing quantification syntax slightly in a way that makes sense everywhere, but then also enables the new features.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was commenting only on providing sophisticated syntax for the foreach
statement. For comprehensions I think it opens up new use-cases so that syntax is warranted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For foreach
I think the stronger argument is consistency (given all other uses of quantification allow multiple quantified variables, and it won't be very hard to implement).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it would be so inconsistent not to do it, that doing it reduces the documentation and implementation we need, then of course let's do it!!!
``` | ||
|
||
Since sequence comprehensions are expressions rather than statements, they carry an additional restriction | ||
when compiled: their ordering must be fully deterministic. If `s` is a `set<int>`, for example, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you sure the ordering must only be deterministic when compiled? Wouldn't it be difficult to ensure (seq x <- someSet) == (seq x <- someSet)
at the verification level as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is easy to ensure at the verification level. In fact, it difficult NOT to ensure it, because expressions in math are deterministic. This is the reason that the compiler also must make sure to do this consistently.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do think this is SUCH a common point of confusion, it needs to be taught and documented more prominently. We already have this problem with :|
and it surprises even advanced Dafny users.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still, I wonder whether it wouldn't be better to disallow seq x <- someSet
in all contexts. What would be the use-case of doing this in a ghost context?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You definitely need some efficient way of accessing the values in a set in sequence at runtime, but it might be enough to only support foreach
. Supporting collections other than sequences as quantified variable domains is not until step 4 of the implementation plan, so I'm definitely happy to hold off until we feel there's a need for it.
|
||
This RFC proposes adding two closely-related new features to Dafny for working with collections and iteration: | ||
|
||
1. `foreach` loops over any enumerable domain, initially only supporting the builtin collection types but leaving the door open for user-defined collections. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have we considered introducing the concept of an iterator/enumerator type prior to introducing a foreach
? The iterator/enumerator would be similar to that in Java/C#, and you could use it to do something similar to a foreach
:
method ForEach<T>(hasEnumerator: HasEnumerator<T>)
{
var enum: BoundedEnumerator<T> := hasEnumerator.GetEnumerator();
while(enum.HasNext())
decreases enum.MaxLeft() {
var iterationVariable := enum.MoveNext();
// do something useful
}
}
trait HasEnumerator<T> {
method GetEnumerator() returns (result: BoundedEnumerator<T>)
}
trait BoundedEnumerator<T> {
function MaxLeft(): int
function method HasNext(): bool
method MoveNext() returns (value: T)
requires HasNext()
ensures MaxLeft() < old(MaxLeft())
}
Building the foreach
concept on top of such an iterator would mean we immediately enable user-defined collections to work with foreach
. More importantly, starting out with simple fundamentals and building higher-level language constructs on top of those might mean we avoid having to backtrack in our implementation in the future.
Can you provide a sketch for how you plan to support foreach
for user-defined collections?
Similarly, for supporting user-defined collections to be used in the quantified variable domain of a sequence comprehension, I expect we would need to define a trait Ordered
that can be used like this:
function SequenceComprehension<T>(domain: Ordered<T>): seq<T> {
domain.Fold([], (value: T, result: seq<T>) => [value] + result)
}
trait Ordered<T> {
ghost var values: seq<T>
function method Fold<Result>(initial: Result, combine: (T, Result) -> Result): Result
ensures Fold(initial, combine) == FoldSpec(values, initial, combine)
function FoldSpec<Result>(values: seq<T>, initial: Result, combine: (T, Result) -> Result): Result {
if |values| == 0 then initial else FoldSpec(values[1..], combine(values[0], initial), combine)
}
}
And for set comprehensions we'd need a trait Unordered
:
function SetComprehension<T>(domain: Unordered<T>): set<T>
reads *
{
var sets := domain.Select<set<T>>(x => {x});
var combine := (a,b) => a + b;
assert forall x, y | x in sets.values && y in sets.values :: combine(x,y) == combine(y,x);
sets.Aggregate({}, combine)
}
trait Unordered<T(==)> {
ghost var values: multiset<T>;
function method Select<U>(f: T -> U): Unordered<U>
function method Aggregate(zero: T, aggregate: (T, T) -> T): T
reads this
requires forall x, y | x in values && y in values :: aggregate(x,y) == aggregate(y,x)
ensures Aggregate(zero, aggregate) == AggregateSpec(values, zero, aggregate)
function AggregateSpec(values: multiset<T>, zero: T, aggregate: (T, T) -> T): T
requires forall x, y | x in values && y in values :: aggregate(x,y) == aggregate(y,x) {
if |values| == 0 then
zero
else
(var x: T :| x in values; AggregateSpec(values - multiset{x}, aggregate(x, zero), aggregate))
}
}
function FailedSeqComprehension<T>(domain: Unordered<T>): seq<T>
reads *
{
var seqs := domain.Select<seq<T>>(x => [x]);
var combine := (a,b) => a + b;
//. Error: assertion might not hold
assert forall x, y | x in seqs.values && y in seqs.values :: combine(x,y) == combine(y,x);
seqs.Aggregate([], combine)
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have we considered introducing the concept of an iterator/enumerator type prior to introducing a foreach?
Oh absolutely, and dafny-lang/libraries#37 has a snapshot of that work to date. It looks similar to what you sketched out with an Enumerator<T>
trait, and I'd expect an Enumerable<T>
trait with a GetEnumerator()
method. That would be the bare minimum to allow more values to be used as quantified variable domains.
There are a number of factors that make that approach much harder to implement and less beneficial to users though, at least in the current state of Dafny:
- Your
// do something useful
comment inForEach
skips over a big gap. :) Lambda expressions in Dafny can't have any side-effects, so you can't make a call likeForEach(myCollection, x => { print x; }
work. That means you have to declare a general-purpose "Action" trait and an explicit class for every use case, e.g. https://github.com/aws/aws-encryption-sdk-dafny/blob/c8be143088490301d486958ad126d53c46ec1311/src/AwsCryptographicMaterialProviders/Keyrings/AwsKms/AwsKmsKeyring.dfy#L599-L600. It's a massive pain and an order of magnitude harder to use than a built inforeach
construct. - I found that attaching accurate enough termination metrics and invariants to the various
Enumerator<T>
implementations required creating a combinatorial explosion of concrete classes. I'm not against doing that in the standard library if necessary (and possibly leaning on code generation to make it easier/more maintainable), but I'd like to explore solving the more fundamental limitation first. - Consuming such enumerator types using a particular idiom of
while
loops (e.g. https://github.com/dafny-lang/libraries/pull/37/files#diff-1927bc3853704ac5f502a37509d998e3ac29f7fbe8c4e8c479810b6a02ca4c3bR15-R28) still takes a lot of boilerplate, and requires you to then prove your enumerator'sRepr
is disjoint from any other objects in your code. - Dafny also doesn't support having "default" implementations of methods in a trait that can be overridden in classes. That means we can't decide the
Enumerator<T>
interface should really have aContains(x)
method (for example) in the future without breaking existing code. Supporting default implementations was necessary in Java 8 so they could add the streams API to existing interfaces likeCollection
without breaking existing implementations. - Modelling the enumeration constraints as a trait is also limiting, since only reference types can currently implement them. The alternative is introducing yet more type characteristics similar to "failure-compatible" (and more complicated ones at that), and I'd rather avoid that and wait until Non-reference traits dafny#1226 is addressed.
The appealing thing about the features in this RFC is that they fully abstract away from the details of enumerating at runtime. I absolutely plan for us to support user-defined collections in the future, but supporting ordered quantification over just the built-in collection types provides a ton of value while keeping the door open for the future. I've tried very hard to keep the semantics as general and flexible as possible, to reduce the risk you point out of finding that they interfere with how we'd like to define lower-level features that can implement them, but please let me know if you're concerned about anything specific.
Note that the aggregation/Fold
functionality you also sketch out is out of scope for now. I suspect it will also work out better as a more built in concept as the "collect" future possibility sketches out, for a lot of the same reasons - aggregation is essentially the dual of enumeration!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, but how do you plan to support foreach
for user-defined collections?
If that means introducing Enumerable<T>
and rebuilding foreach
on top of it as syntax sugar for while based approach, wouldn't that mean we lose some of the effort that we spend now on a first-class foreach
implementation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The initial foreach
implementation will translate pretty directly to the existing BoundedPool
/ SinglePassCompiler.Create[Guarded]ForeachLoop/CompileGuardedLoops
implementations already used to compile features such as set comprehensions and assign-such-that statements, so the implementation cost beyond the language changes is actually quite cheap.
If we end up replacing that code with a more pure-Dafny implementation of builtin collections in the future, very little effort will be wasted. And it's likely sequence comprehensions will be useful in in the specification of such code!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, sounds good :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like it!
The one thing I'm not clear about is what happens with sequence comprehensions when the
s
inx <- s
is a set. Is such a sequence comprehension allowed only in ghost contexts? Or is there something that, at run time, will figure out a deterministic ordering?
Only allowed in ghost contexts, since the current runtimes can't provide a deterministic ordering. I added a bit of content here about that point: https://github.com/dafny-lang/rfcs/pull/10/files#diff-e3e01b715e6d0ac6dce19064a36b244c495633e9760a43233eaf22cdb18849f8R391
More generally, what types of
e
are allowed inx <- e
?
Any "collection" type, or alternatively any value for which the expression x in e
is valid.
I think the current proposal does not give a way to deterministically step through the elements of a set. Is that right? (I'm not saying this is necessary. Just checking my understanding.)
That's correct. There are a number of options for providing that feature outside the scope of this RFC:
- A standard library
function by method
can useseq x <- s
in the method implementation and sort the result. - The runtime representation of sets could be changed to guarantee a deterministic ordering (not necessarily sorted, but purely determined by the contents).
- As described in one of the future possibilities, we could add support for an infinite ordered collection type, so that an expression like
seq x <- intsInIncreasingOrder | x in s
would then express "the contents of s sorted", and would be implemented at runtime by building a sorted data structure with the contents ofs
and then enumerating that.
```dafny | ||
// Before: | ||
method AllPairs(s: seq<nat>) returns (result: seq<(nat, nat)>) { | ||
for i := 0 to |s| { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To which I say "whoops" :)
I legitimately toyed with figuring out a way for this repo to test code snippets as the main dafny
repo is.
// After: | ||
method AllPairs(s: seq<nat>) returns (result: seq<(nat, nat)>) { | ||
result := []; | ||
foreach left <- s, right <- s { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, it's nice that it wouldn't clash with the existing s[i..j]
syntax, and that there is already precedent for how to handle the reverse order.
If we do that, is it really worth continuing to support both for x := 0 to 10
and foreach x <- 0 to 10
?
0005-ordered-quantification.md
Outdated
|
||
2. `x [: T] <- C` | ||
Assuming that we have `posNat == iset n: nat | 0 <= n`, the set comprehension above could also then be expressed as: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True! In general the quantification x: T
is equivalent to x <- (iset x: T)
. Since Dafny supports potentially infinite collections, types and collections are very similar. But collections are strictly more expressive since we don't have a fully dependent type system; you can't express "sequences of length n" as a subset type if n is a dynamic value, but you can declare a dynamic, local iset
collection for that. Part of the reason I was fine with dropping the notion of default orderings of types is that we could still support infinite ordered collections in the future, and hence achieve the same thing with x <- (collection of all T's ordered by >)
.
This may not be the strongest possible example to motivate quantifier variable domains, since it could be done with a subset type instead, and in fact I'd prefer that style in this case. A more complicated example where the domain depended on dynamic values and couldn't be a subset type would be better.
0005-ordered-quantification.md
Outdated
quantified variable domains also specify the ordering of quantification bindings, and allow variables to be bound to duplicate values. | ||
This ordering of bindings is non-deterministic unless `C` is a sequence. | ||
If `C` is a `multiset`, multiple bindings with the same value of `x` will be created, but their ordering is still non-deterministic. | ||
Note that ordering and multiplicity is irrelevant for all current uses: | ||
the semantics of `[i]set` and `[i]map` comprehensions, `forall` and `exists` expressions, | ||
and `forall` statements all do not depend on the order of quantification and semantically ignore duplicates. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I like that version better - I've massaged it slightly and used it.
This means that the parser will accept attributes in-between variables, | ||
but these can be easily rejected by the resolver instead. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Three technical points:
- If the resolver is what rejects the in-between attributes, then the AST still needs a place to hang them. Better seems to be for the parser to reject them. More precisely, the parser would first parse in-between attributes, but before when it has gathered all the variables and "from" expressions and is about to create the AST node, then the parser would generate errors for any in-between attributes it may have parsed along the way.
I came to the same conclusion. I may still add a new QuantifiedVar
type for the AST in the first PR so that we can have better resolution errors without losing source tokens, but I'm going to try to avoid it.
- Don't worry about the specific attribute
{:heapQuantifier}
, since I don't think it's supported any more. We should remove it.
Sure, I was only looking for an example of a quantification attribute. As long as we still support at least one I just want to make sure we can still parse them.
- Note that the
","
causes an ambiguity in the grammar. To resolve it, the lookahead mechanism needs to determine if what follows the","
is anotherQuantifierVarDecl
or not.
Ah you spotted what I only figured out after making the parsing changes. :) I did figure out that it was helpful to look ahead for ", <Identifier>" at least. But note that this is still not fully backwards compatible! The statement:
print set x: int | 0 <= x < 10, y;
Is currently parsed as:
print (set x: int | 0 <= x < 10), (y);
But will then be parsed as:
print (set x: int | 0 <= x < 10, y);
And produce an error, since it is then a comprehension with more than one variable but no term. Our plan is to accept this minor breaking change with a helpful error message like "if y
is not part of the comprehension, try adding parentheses or a term expression before it". We could add an option similar to /functionSyntax
to avoid breaking people before the next major version.
2. Cardinality: `|S| <= |C|` | ||
3. Ordering: `forall i, j | 0 <= i < j < |S| :: | ||
exists i', j' | 0 <= i' < j' < |C| :: | ||
C[i'] == S[i] && C[j'] == S[j]` | ||
Q(C[i']) == S[i] && Q(C[j']) == S[j]` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Possibly! I only wanted to propose an encoding that seemed feasible enough, I don't claim to have found the optimal encoding yet nor am I trying to nail that before starting on implementation.
including Scala, Cryptol, and Haskell. Using something other than `in` is recommended to avoid confusion with the `in` boolean operator, | ||
but I don't have a strong opinion about which symbol or keyword we use. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
from
reads quite nicely too, but <-
seems to fit better with the mathematical notation leanings of Dafny IMO. This is something we can bikeshed more effectively on in the first PR when looking at lots of code examples, I think.
0005-ordered-quantification.md
Outdated
match o { | ||
case Some(x) => print x; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another whoops because my markdown renderer is not verifying my Dafny code. :P Thanks.
The translation of `foreach` loops should be reducible to sequence comprehensions | ||
and other existing features for loops in general, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I thought translating to sequence comprehensions was appealing because it at least solves the last point: the invariants reduce to those of for
loops, so the loop guard is just an expression of completing the enumeration. I may still try this encoding first, especially since I plan to have sequence comprehensions committed and stable before moving to foreach
loops.
|
||
```dafny | ||
ghost var xs := []; | ||
foreach x <- s { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See top level response
Resolves #2187. First phase of implementing dafny-lang/rfcs#10. This change adds two new features on several uses of quantified variables in the language: 1. Quantified variable *domains*, specified with the syntax `x <- C`, where C is an expression with a collection type (i.e. set, multiset, sequence, or map). Besides offering a more compact syntax for a common pattern (i.e. replacing `myLovelyButLongVariableName | myLovelyButLongVariableName in someSet` with `myLovelyButLongVariableName <- someSet`), this is forwards-compatible with the notion of *ordered quantification*, which the two features designed in the linked RFC depend on, and potentially others in the future. 2. Quantified variable *ranges*, specified with the syntax `x | E`, where E is a boolean expression. These replace the existing concept of a single `| E` range after all quantified variables (which is now bound to the last declared variable, with equivalent semantics). These are helpful for restricting the values of one quantified variable so that it can be used in the domain expression of another variable. These features apply to the quantifier domains used in `forall` and `exists` expressions, `set` and `map` comprehensions, and `forall` statements, and the parsing for quantifier domains is now shared between these features. As a small consequence, set comprehensions now no longer require range expressions, and will default to `| true`. This new syntax is not fully backwards-compatible: the statement `print set x | 0 <= x < 10, y`, for example, was previously parsed as `print (set x | 0 <= x < 10), (y)` but will now be parsed as `print (set x | 0 <= x < 10, y)` and rejected. The `/quantifierSyntax` option is used to help migrate this otherwise breaking change: * `/quantifierSyntax:3` keeps the old behaviour where a `| <Range>` always terminates the list of quantified variables. * `/quantifierSyntax:4` instead attempts to parse additional quantified variables. Like `/functionSyntax`, this option will default to `4` instead in Dafny 4.0. This is implemented with pure desugaring for now: the `QuantifiedDomain` production parses a list of `QuantifiedVar` objects, but then immediately rewrites a snippet such as `x <- C | P(x), y <- D(x) | Q(x, y)` into the equivalent (for now) `x | x in C && P(x) && y in D(x) && Q(x, y)`. Token wrappers are used to ensure error messages can still refer to the original syntax. This will not be equivalent once features that depend on the ordering of quantification are added, though, so a subsequent change will add the new fields to `BoundVar` instead and plumb these extra components through the whole Dafny pipeline. On testing: I duplicated several existing test files and modified them to use the new syntax features, to get good white-box test coverage even though the implementation is very simple for now. Also resolves #2038 (and the same issue for Java) since I hit it when converting test cases, and it was dead easy to fix.
Coming to this late, and I understand the useful generalizations, but now the very intuitive simple case |
Since we decided to require a quantification domain for sequence comprehensions (i.e. We have a strong inclination to support |
After @robin-aws 's improvement, we could always define the previously built-in constructor as a simple library helper: function seq<T>(n: nat, f: nat ~> T): seq<T>
requires forall i | 0 <= i < n :: f.requires(i)
{
seq i | 0 <= i < n :: f(i)
} |
Unfortunately not, since we decided to require a domain to define the ordering in sequence comprehensions. It would have to be this (adding a necessary reads clause, and changing function Seq<T>(n: nat, f: nat ~> T): seq<T>
requires forall i | 0 <= i < n :: f.requires(i)
reads set o, i | 0 <= i < n && o in f.reads(i) :: o
{
seq(n, f)
} But if we allowed function Seq<T>(n: nat, f: nat ~> T): seq<T>
requires forall i | 0 <= i < n :: f.requires(i)
reads set o, i | 0 <= i < n && o in f.reads(i) :: o
{
seq i <- 0 to n :: f(i)
} |
I would vote for keeping the seq(n,F) syntax, and defining it as a sugar of what you write in the previous comment. I expect it will be the commonest use. And in this context |
(Second revision of the) design of the first step of dafny-lang/dafny#1753, and follow-up to #4. Replaces the first revision at #9
The key difference between this version and the old one is explicitly defining ordered quantification up front, and restricting the semantics to the first clauses of quantifier domains instead of inferring it from the range.
cc all the folks that provided such awesome feedback on the first version (turns out you can copy-paste this from the "participants" section of a PR's sidebar): @samuelgruetter @atomb @MikaelMayer @cpitclaudel @keyboardDrummer @RustanLeino. And while I'm at it why not @backtracking, @mschlaipfer, @parno, @amaurremi, @emina, and @seebees.
Edit: Revised to make
<-
required in ordered contexts (and drop the concept of natural ordering of types), and allow per quantified variable ranges. Also added motivation of idiomatic compilation.