Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

"incomplete value" regression in evalv3 which appears to be order dependent #3638

Closed
mvdan opened this issue Dec 30, 2024 · 16 comments
Closed
Labels
evaluator evalv3-win Issues resolved by switching from evalv2 to evalv3 evalv3 issues affecting only the evaluator version 3

Comments

@mvdan
Copy link
Member

mvdan commented Dec 30, 2024

# With the old evaluator.
env CUE_EXPERIMENT=evalv3=0
exec cue vet -c

# With the new evaluator.
env CUE_EXPERIMENT=evalv3=1
exec cue vet -c

-- input.cue --
package p

routes: default: {}
routes: [string]: spec: [{
	refs: [{"default"}]
}]
#Fields: {
	input: _
	output: [for x in input { x }]
	...
}
#Kubernetes: {
	#Fields & {
		Routes: route: [_]: spec!: HTTPROUTESPEC.#x
	}
	#Fields // Commenting out this line also fixes evalv3.
	Routes: _
	input: res1: Routes
}
let HTTPROUTESPEC = {
	#x: [...{
		refs: [...{string}]
	}] | *[{
		other: []
	}]
}

// Order dependent! Breaks if the lines below are moved to the top of the file.
entrypoint: used.output
used: #Kubernetes & {
	Routes: route: routes
}

As of 46fc54aa9caf95393dcd3fe0ac066c8112da986d:

# With the old evaluator. (0.010s)
> env CUE_EXPERIMENT=evalv3=0
> exec cue vet -c
# With the new evaluator. (0.030s)
> env CUE_EXPERIMENT=evalv3=1
> exec cue vet -c
[stderr]
used.output.0.route.default.spec: incomplete value [{refs:["default"]}] | [{refs:["default"],other:[]}]
[exit status 1]

See the comments; removing one repetitive definition embedding, or moving the last two fields to the top of the file, both make the error go away. So it seems like there's some rather nasty bug in evalv3 that causes order-dependent or inconsistent evaluation.

@mvdan mvdan added evaluator evalv3 issues affecting only the evaluator version 3 labels Dec 30, 2024
@mvdan
Copy link
Member Author

mvdan commented Dec 30, 2024

Reduced from a failure in Holos as reported by @jeffmccune. Note that #3637 was a spin-off from the example above, but a different bug entirely.

@mpvl
Copy link
Member

mpvl commented Jan 16, 2025

There appear to be two issues going on: somewhere a closedness value is erased, allowing the structs in a disjunction to be unified that otherwise fails. However, for the location where the error occurs, V2 actually suffers from the same problem. One could argue that the embedding, in that case, should actually allow it and that it is not an issue. Either way, in one other location V3 does suffer from this issue.

The other issue is that the default marker is incorrectly erased. A reduction that focusses on that issue is:

input: k1: [string]: a: 1
input: k1: foo: #x
out: {
	for x in input { x }
	for x in input { x }
}
#x: { a: int } | *{ b: 2 }

In this case, out.foo will have a default marker for V2, but not V3, causing the value to be incomplete.

The closedness issue is exposed by:

input: k1: [string]: a: 1
input: k1: foo: #x
#x: { a: int } | *{ b: 2 }

This issue only occurs when DebugDebs is disabled!, which is a good hint as to where to start debugging.

@mpvl
Copy link
Member

mpvl commented Jan 16, 2025

The default issue can be analyzed as follows. Comprehensions are treated as embeddings. Each x that each comprehension yields inserts foo: #x and [string]: a: 1.

Let's focus on the former. Embedding foo: #x by itself twice yields a cross product, as embedding (so as an open struct), that yields foo: {a: int} | {a: int, b: 2} | *{b: 2}, where each of these structs should be treated as closed.

So far so good. This is the same for both V2 and V3. Where they differ is where [string]: a: 1 is applied. As the latter is not embedded as a sibling to the comprehensions, but rather within the term of each embedding, I find the result of V2 hard to justify. Basically the result should either be

(close({a: int}) | close({a: int, b: 2}) | *close({b: 2})) & {a: 1}

which is close({a: int}) | close({a: 1, b: 2}
or

((close({a: int) | close({ b: 2}) & {a: 1}) & ((close({a: int) | close(b: 2}) & {a: 1})

which is close({a: 1}).

V2 returns close({a: int}) | *close({a: 1, b: 2}, which seems unjustifiable.

V3 currently returns close({a: int}) | close({a: 1, b: 2}.

I think the correct answer should be close({a: 1}), however.

The motivation for this is that unification is idempotent, but embedding is not. So to get the correct result, one should distribute the unifications over disjunctions before embedding them.

Note that if we replace [string]: a: 1 with foo: a: 1, the result is indeed {a: 1} in V3. So that would be consistent.

Taking this stance would mean that V3 will differ from V2, although it would no longer result in an incomplete value. Judging from the particular meaning here, I reckon that will be acceptable.

Note that V2 is inconsistent with itself: if we unify a pattern constraint with a single disjunction it is most eager.

@myitcv
Copy link
Member

myitcv commented Jan 17, 2025

The main question that arises for me here looking at this example:

input: k1: [string]: a: 1
input: k1: foo: #x
out: {
	for x in input { x }
	for x in input { x }
}
#x: { a: int } | *{ b: 2 }

is "what was the intent of the author?"

I ask that in light of my observations in #3296 (comment). i.e. the CUE written by the author does not necessarily capture their intent, for whatever reason:

  • poor explanation, on our part, of the way something works
  • lack of functionality within CUE to achieve a specific result, i.e. needed finalize() or similar
  • etc...

Through one lens, the expected result might be, in rather more colloquial terms: "out should be a copy of the data structure of input". I say that because when I have seen comprehensions of this sort people are often doing data transformation. But I can't honestly say for sure.

I ask this question about the original intent because without properly understanding that intent I can't be sure that the user intended the following implications:

  1. That the values "yielded" by the comprehensions are embedded, and hence the closedness constraints of #x are not enforced in the declaration of out shown. (Noting that a separate declaration does enforce the closedness, modulo the next point)
  2. That the "double embedding" of foo: #x results in the cross product described, allowing the value {a: 1, b: 2} in out (this is not possible in input):
input: k1: [string]: a: 1
input: k1: foo: #x
out: {
	for x in input {x}
	for x in input {x}
}
#x: {a: int} | *{b: 2}

out: foo: b: 2
  1. That it's possible to see a situation like that observed in evaluator: incorrect behaviour when choosing defaults #3296, with a different concrete value in out compared to input (see my question regarding the original intent), because of points 1 and 2.
  2. That embedding is not idempotent:
out1: {
	for x in input {x}
}

out2: {
	for x in input {x}
	for x in input {x}
}

out3: {
	for x in input {x}
}
out3: {
	for x in input {x}
}

These implications are not obvious to my mind, as demonstrated by the level of analysis required in this issue.

I ask this question and raise these points because I wonder whether we can actually perform a neat side-step here. Because if the user actually intended something that doesn't require that we answer what the result of the presented code should be, then is there a simpler solution? Indeed can we leave this behaviour unspecified (or an error in case we can detect certain edge cases)?

@mpvl
Copy link
Member

mpvl commented Jan 17, 2025

Through one lens, the expected result might be, in rather more colloquial terms: "out should be a copy of the data structure of input". I say that because when I have seen comprehensions of this sort people are often doing data transformation. But I can't honestly say for sure.

In that case, my suggested correct semantics would indeed get closest to that.
BTW, I had a very different argument that my suggested fix is probably okay: in most cases, and here this seems true as well, it is most important that one get a concrete value, where the exact value often does not matter that much. If a disjunction is ambiguous in such cases, it is usually because of some additional "empty" or meaningless fields, like what seems to be the case here.

and hence the closedness constraints of #x are not enforced in the declaration of out shown

That is not how closedness works. The closedness is temporarily not enforced, a small but important distinction.

These implications are not obvious to my mind, as demonstrated by the level of analysis required in this issue.

Well, there is a factor of correctness.

I've identified the problem that causes V3 to treat foo: a: 1 different from [string]: a: 1. It was a clear and obvious bug. After fixing that, the result is indeed {a: 1}.

After more analysis it is quite clear that this is the correct result.

@mpvl
Copy link
Member

mpvl commented Jan 17, 2025

For more info see https://cuelang.org/cl/1207410/2.

cueckoo pushed a commit that referenced this issue Jan 17, 2025
Note that the added explanation explains the diff
for after the upcoming fix.

Issue #3638

Signed-off-by: Marcel van Lohuizen <[email protected]>
Change-Id: Iccc5d04f2fb59560a1d715d254a21dfde1c87f2c
Reviewed-on: https://gerrithub.io/c/cue-lang/cue/+/1207409
Unity-Result: CUE porcuepine <[email protected]>
Reviewed-by: Daniel Martí <[email protected]>
TryBot-Result: CUEcueckoo <[email protected]>
@mpvl
Copy link
Member

mpvl commented Jan 17, 2025

For posterity, here is a runtime visualization of just before the disjunction is computed of

// use regular field
input: k1: foo: a: 1
input: k1: foo: #x
out: {
	for x in input { x }
	for x in input { x }
}
#x: { a: int } | *{ b: 2 }

and

// use pattern constraint
input: k1: [string]: a: 1
input: k1: foo: #x
out: {
	for x in input { x }
	for x in input { x }
}
#x: { a: int } | *{ b: 2 }

The graph with the pattern constraint case is clearly broken (if you know what to look for). Most notably, the ARC link is missing. This causes a whole path (red) to be ignored in the disjunction. The ARC was missing because it erroneously got deduped. Once that was fixed it behaved exactly as the regular field case, as one would expect.

Image

Image

@jeffmccune
Copy link

@myitcv

The main question that arises for me here looking at this example is "what was the intent of the author?"

input: k1: [string]: a: 1
input: k1: foo: #x
out: {
	for x in input { x }
	for x in input { x }
}
#x: { a: int } | *{ b: 2 }

Let me see if I can clarify. At the highest level, I want to manage a Kubernetes HTTPRoute by composing in multiple backend references. For example here is a more explicit form that I'd like to generalize somehow to allow backendRefs mixed in from a struct rather than explicitly listed for each value of the HTTPRoutes struct.

The schema was obtained from timoni mod vendor crds -f https://github.com/kubernetes-sigs/gateway-api/releases/download/v1.0.0/standard-install.yaml so the intent was to get a "good enough" representation of the HTTPRoute schema into CUE. Good enough meaning it reliably treats unknown fields as errors, validates basic type checking and doesn't result in holos render platform taking longer than ~10 seconds.

The schema timoni imported is here: https://github.com/holos-run/holos/blob/v0.103.0/internal/generate/platforms/cue.mod/gen/gateway.networking.k8s.io/httproute/v1/types_gen.cue#L261

Note in my own httproutes.cue I don't use defaults, so I'm unclear where those are coming from in the example you provided.

There are quite a few places in the imported schema where they're used, searching for | * so I think that's where the #x: { a: int } | *{ b: 2 } portion of the above example is attributable to. My intent is the ability to compose backendRefs using a struct which will ultimately be transformed into the backendRefs list. The composition of these backendRefs into a struct for transformation into a list is my own intent, but I'm also open to other ways to accomplish the same.

All this to say: If the issue can be attributed to the imported schema and not my own use of that schema, I'd love to understand because I'm able to replace the schema with a different one that avoids the problem.

Let me know if this helps clarify the intent enough. If not I can try again and break it down further.

Versions:

❯ timoni version
api: timoni.sh/v1alpha1
client: 0.22.1
cue: 0.9.2

@jeffmccune
Copy link

Also, the double embed isn't really intentional, it was more simply for readability:

  1. Platform team constrains the resource type here: https://github.com/holos-run/kargo-demo/blob/v0.0.12/stacks/resources.cue#L34
  2. Network team also constrains it in their code here: https://github.com/holos-run/kargo-demo/blob/v0.0.12/stacks/network/components/httproutes/httproutes.cue#L11

The two get combined together here: https://github.com/holos-run/kargo-demo/blob/v0.0.12/stacks/network/components/httproutes/httproutes.cue#L181

This is perhaps more accidental than intentional, but I do want the act of composing into HTTPRoutes to be type-checked, and it'll ultimately be composed again into Component.Resources.HTTPRoute which has the same schema, so maybe it's best to simply remove the v1.#HTTPRoute from the HTTPRoutes struct on line 11.

@jeffmccune
Copy link

Hrm, let me try again point by point:

That the values "yielded" by the comprehensions are embedded, and hence the closedness constraints of #x are not enforced in the declaration of out shown. (Noting that a separate declaration does enforce the closedness, modulo the next point) That the "double embedding" of foo: #x results in the cross product described, allowing the value {a: 1, b: 2} in out (this is not possible in input):

Any double embedding is accidental, not my intent. Assuming out is my Component.Resources then my intent is to error on unknown fields, specifically Component.Resources.HTTPRoute.foo: _ should be validated against v1.#HTTPRoute and error on unknown fields.

That the "double embedding" of foo: #x results in the cross product described, allowing the value {a: 1, b: 2} in out (this is not possible in input):

Did not intend this as per above, intent is to vet against v1.#HTTPRoute

That it's possible to see a situation like that observed in #3296 with a different concrete value in out compared to input (see my question regarding the original intent), because of points 1 and 2.

I wasn't aware.

That embedding is not idempotent:

I've been assuming embedding is idempotent.

@myitcv
Copy link
Member

myitcv commented Jan 17, 2025

@jeffmccune

Let me see if I can clarify.

@mvdan was going to follow up with you, but these hugely detailed replies give us plenty to work through. Thank you very much!

@jeffmccune
Copy link

@mvdan was going to follow up with you, but these hugely detailed replies give us plenty to work through.

He did, thanks... I'm concerned they're counter-productive so please let me know if that's the case are and I'll edit them down or delete them entirely.

@myitcv
Copy link
Member

myitcv commented Jan 17, 2025

I'm concerned they're counter-productive so please let me know if that's the case are and I'll edit them down or delete them entirely.

They are absolutely not counter productive. Your responses are exactly what we need to hear (above and beyond fixing the core evaluator, and that's the parallel conversation with Marcel's points in this issue). So to reiterate I'm genuinely very grateful for you spending the time to add this detail. In case jumping onto a call is quicker/easier from your perspective please say. Whilst writing is often a good way to order thoughts etc, for my part I'm conscious that I don't unduly impinge on your time!

@mpvl
Copy link
Member

mpvl commented Jan 17, 2025

Note that the current fix for V3 makes the double embedding essentially idempotent in this case. In general, with this change if a field has a disambiguated disjunction, it will not get "disambiguated" by embedding.

We are thinking of a new way to do defaults uncoupled from disjunctions to simply things a bit. But that is a bit down the line. These cases help shape our thoughts there.

@jeffmccune
Copy link

Thanks @myitcv for the reassurance. I grabbed half an hour to chat with @mvdan on Tuesday to catch up with what you've all learned and ask what I should do different, if anything.

@myitcv
Copy link
Member

myitcv commented Jan 18, 2025

@mpvl

and hence the closedness constraints of #x are not enforced in the declaration of out shown

That is not how closedness works. The closedness is temporarily not enforced, a small but important distinction.

Indeed, and that is what I was trying to convey in "in the declaration of out shown":

input: k1: [string]: a: 1
input: k1: foo: #x
out: {
	for x in input {x}
	for x in input {x}
	foo: b: 2
}
#x: {a: int} | *{b: 2}

with 053f47b gives:

input: k1: foo: a: 1
out: foo: {
        a: 1
        b: 2
}

These implications are not obvious to my mind, as demonstrated by the level of analysis required in this issue.

Well, there is a factor of correctness.

Absolutely. My observation here was certainly not to question or stop that analysis. Because it's critical. My observation was that, like #3296, it's not always clear (to me at least) that the original intent is necessarily captured by the written CUE. So in parallel to us fixing/etc the evaluator, I was looking to dig a bit deeper on the goal/intent front. @jeffmccune has kindly provided more details that I'm going to work through.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
evaluator evalv3-win Issues resolved by switching from evalv2 to evalv3 evalv3 issues affecting only the evaluator version 3
Projects
None yet
Development

No branches or pull requests

4 participants