Support a must() builtin #575

Open
cueckoo opened this issue Jul 3, 2021 · 10 comments
Comments

@cueckoo
Collaborator

cueckoo commented Jul 3, 2021

Originally opened by @myitcv in cuelang/cue#575

This has come up several times on Slack; noting it here for posterity (and so it can be referenced).

The name must() was also thrown into the 🚲 shed.

Is your feature request related to a problem? Please describe.

Sometimes it is necessary to declare arbitrary constraints on a field.

Taking one such example from Slack, where we try to define #Foo as a string that must contain a numeric value greater than 5:

#Foo: constrain(strconv.Atoi(#Foo) > 5)

Describe the solution you'd like

As above.

Describe alternatives you've considered

The current alternative is to declare additional (hidden) definitions that express the constraint:

#Foo: string
_#checkFoo: strconv.Atoi(#Foo) & >5

Contrast this with the proposed constrain() builtin, which allows the constraint to be declared on the field itself. That is much clearer for the author, the reader and the user.
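A slightly fuller sketch of the hidden-check workaround in current CUE, assuming the string is wrapped in a struct so that the check is re-evaluated for every use of the definition. The field names `value` and `_check` and the `ok` example are illustrative and do not come from the original report:

```cue
import "strconv"

// #Foo wraps the string so the hidden check travels with each use.
#Foo: {
	value: string
	// Atoi converts the string; unifying with >5 enforces the bound.
	_check: strconv.Atoi(value) & >5
}

// Accepted: "7" converts to 7, which satisfies >5.
ok: #Foo & {value: "7"}

// Uncommenting the next line should make evaluation fail,
// because 3 does not satisfy >5.
// bad: #Foo & {value: "3"}
```

The cost of this approach is the extra struct layer and the separate hidden field, which is the repetition the proposed builtin is meant to remove.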

Additional context

n/a

@cueckoo added the builtin and FeatureRequest labels Jul 3, 2021
@cueckoo added this to the v0.4.x milestone Jul 3, 2021
@cueckoo
Collaborator Author

cueckoo commented Jul 3, 2021

Original reply by @mpvl in cuelang/cue#575 (comment)

I vote for must.

@cueckoo
Collaborator Author

cueckoo commented Jul 3, 2021

Original reply by @jlongtine in cuelang/cue#575 (comment)

I like must, too.

@cueckoo
Collaborator Author

cueckoo commented Jul 3, 2021

Original reply by @extemporalgenome in cuelang/cue#575 (comment)

I have been thinking about this issue as well, particularly around how strings.MinRunes takes the value as an implicit parameter (which has benefits and drawbacks).

That said, I wonder if the self-referential #Foo is a design liability. Are there any cases in which the right-hand-side #Foo could be ambiguous?

Pretend that strings.Contains, strings.Count, etc, accepted only explicit parameters (so we could express "receiver is contained by" using the same function in which we currently only express "receiver contains").

// if #Foo isn't one of these fruit, it must be two or more words
#Foo: "apples" | "oranges" | "pears" | =~ #"(\w+)(\s+\w+)+"#
#Foo: constrain(strings.Count(#Foo, #Foo) < 7)

Pretend the author of the above intends the first strings.Count parameter to bind to the definition (i.e. bind to the receiver, though I'm not sure what terminology we use for that), while the second parameter is intended to mean "any value matching the definition #Foo". As a whole, the constraint would then mean: any valid #Foo must contain fewer than 7 fruit (and fewer than 14 words, presumably, though that'd be out of scope for strings.Count to intelligently handle).

However, since #Foo binds to the receiver, this really means that any matching concrete string must contain fewer than 7 occurrences of itself, which is certainly true, but not very useful to express.

If we had a special symbol to refer to the receiver, we'd avoid this hypothetical issue, and potentially reduce reader confusion ("is this an unresolvable circular reference? I thought those were disallowed?").

Using the original example, perhaps we could have a notation for referring to the receiver, such as @ or $ (neither of which is presently a valid identifier):

#Foo: constrain(strconv.Atoi(@) > 5)

Perhaps this is also worth a special symbol, for example -> to mean "such that the following is true"?

#Foo: -> strconv.Atoi(@) > 5

or an explicit form without a magic identifier:

#Foo: x -> strconv.Atoi(x) > 5

(where x is a local binding for the receiver)

@cueckoo
Collaborator Author

cueckoo commented Jul 3, 2021

Original reply by @bttk in cuelang/cue#575 (comment)

I was surprised to learn that this is not what alias was used for:

d=#Date: {
	=~#"^\d{4}-\d{2}-\d{2}$"#
	#valid: time.Parse(d, time.RFC3339Date) & true
}

cue eval -e '(#Date & "2021-01-32")' -c
"2021-01-32"

@cueckoo
Collaborator Author

cueckoo commented Jul 3, 2021

Original reply by @myitcv in cuelang/cue#575 (comment)

@bttk you can't use an alias in the following example:

package x

import "list"

x: [...int] & list.MinItems(3)
x: [1, 2, 3]

The use of implicit parameters is missing from the spec, which I've raised as cuelang/cue#863

The intention of must is, in most cases, to remove the repetition that comes with aliases and the verbosity of embedded scalars.
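For reference, the implicit-parameter style mentioned above looks roughly like this in current CUE: a validator builtin is called without the value it checks, and that value is supplied through unification. This is a small sketch, and the `name` field is made up for illustration:

```cue
import (
	"list"
	"strings"
)

// list.MinItems(3) omits the list argument; the check applies to
// whatever value the field is unified with.
x: [...int] & list.MinItems(3)
x: [1, 2, 3]

// strings.MinRunes follows the same pattern for strings.
name: strings.MinRunes(3)
name: "cue"
```

Because the checked value never appears in the call, no alias or self-reference is needed, but this only works for builtins designed as validators.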

@cueckoo
Collaborator Author

cueckoo commented Jul 3, 2021

Original reply by @verdverm in cuelang/cue#575 (comment)

@bttk There is an excellent write up on several related constructs (alias, let, hidden fields) here: cuelang/cue#699 (comment)

@cueckoo
Collaborator Author

cueckoo commented Jul 3, 2021

Original reply by @myitcv in cuelang/cue#575 (comment)

Good memory, @verdverm 😄

@myitcv
Member

myitcv commented Jun 14, 2023

Tentatively marking this as v0.7.0 (so as not to delay v0.6.0). This has been a high priority feature request for some time, but has never fully made the cut. However, if we add up all the requirements that generally point this direction, there is significant value in addressing this sooner rather than later.

@myitcv
Member

myitcv commented Jun 14, 2023

Edited previous comment with change in milestone.

@extemporalgenome

Is it right for this to be a function? Could constrain/must be a keyword or pseudo-keyword?

Making it a function means that we can end up with some nonsense:

field_i_care_about: must(strconv.Atoi(something_unrelated) > 5)
// or
something_unrelated: must(strconv.Atoi(field_i_care_about) > 5)

How would this get reported? field_i_care_about failed because something_unrelated didn't have an expected value? That could be powerful (help direct the user to the "real problem"), but it could also be misused (intentionally or otherwise), and be susceptible to copy-paste errors, where only one of the identifiers is updated following the paste.

Avoiding extraneous information is a good mitigation against this issue.


As such, I propose one of the following modifications to the design:

constrain has its own syntax with a keyword

constrain strconv.Atoi(field_i_care_about) > 5

sigil-based syntax

constrain/must would merely be the term used to describe an indirect property validation syntax

field_i_care_about: strconv.Atoi(@) > 5

The sigil simply refers back to the left-hand side. The presence of a complete expression (no implicit operands) enables boolean-mode (or bool+error) evaluation. Functions which return bools can be used as validators by using == true, e.g.:

field_i_care_about: list.UniqueItems(@) == true

or perhaps, for clarity:

field_i_care_about: check list.UniqueItems(@)
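For comparison, here is a sketch of how the same uniqueness check can be written in current CUE, without the proposed @ sigil or check keyword. The hidden field name `_unique` is an illustrative choice, and the two forms are alternatives rather than something that must be combined:

```cue
import "list"

// Implicit form: the validator receives the field's own value.
field_i_care_about: list.UniqueItems()
field_i_care_about: [1, 2, 3]

// Explicit form: a hidden field names the value and requires the
// boolean result to be true.
_unique: list.UniqueItems(field_i_care_about) & true
```

The explicit form is where the repetition comes from: the field name is spelled out a second time, and nothing stops it from pointing at an unrelated field.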

4 participants