Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

We should not require * if the variant keys exhaust all possibilities #603

Open
eemeli opened this issue Jan 15, 2024 · 14 comments
Open

We should not require * if the variant keys exhaust all possibilities #603

eemeli opened this issue Jan 15, 2024 · 14 comments
Assignees
Labels
formatting LDML47 LDML 47 Release (Stable) registry Issue pertains to the function registry Seek-Feedback-in-Preview Issue should be something we seek feedback on in the tech preview period syntax Issues related with MF Syntax

Comments

@eemeli
Copy link
Collaborator

eemeli commented Jan 15, 2024

As currently specified, we require at least one variant of each .match message to have only * keys. Otherwise, we consider the message to have a Missing Fallback Variant data model error.

This requirement ensures that no matter the case, selection on a valid message will always produce some explicitly defined pattern.

It is however possible to define selectors such as :plural or :boolean for which the presence of some set of keys guarantees that selection will succeed, such as ['other'] for the former and ['true', 'false'] for the latter.

If we were to make the current error a runtime Selection Error like "No Variant Selected" instead of a Data Model Error, it would allow for more natural messages, such as:

.match {$exists :boolean}
true {{You have the thing.}}
false {{You don't have it.}}

instead of our current

.match {$exists :boolean}
true {{You have the thing.}}
* {{You don't have it.}}

With access to a registry definition, it would still be possible to validate messages for completeness.

An outline of the changes required for this are available in #560 (comment).

I'm assigning this to @stasm, as he mentioned this topic near the end of today's call, and I just wanted to be sure we had an issue filed on this. Please feel free to take over here.

@eemeli eemeli added syntax Issues related with MF Syntax registry Issue pertains to the function registry formatting LDML45 LDML45 Release (Tech Preview) labels Jan 15, 2024
@mihnita
Copy link
Collaborator

mihnita commented Jan 16, 2024

We argued on this one to the death before.
And we ended up with it being mandatory.

There are good reasons to have a default, and we had some kind of agreement on this for months (years?).

With access to a registry definition

We know that not all tools will have access to the registry.

And even if we do, there are the cases where that does not scale (because there are too many values).
Or the values change too often.

It also means that for everything that might have some kind of "catch all" option we will have to define that in the registry (think yes / no / unknown, or masculine / feminine / unknown)
And now we need a way to specify (in the registry) what value is "catch all"
Some tools will not be able to access the registry.
And there is a risk that various selectors will use different values for catch all (unknown, -1, n/a, *)

Then the selection algorithm needs access to the registry, at runtime.
Which is not the case now.
Or if not the registry, then some other way to specify "and by the way, this is the default"

@aphillips
Copy link
Member

The actual proposal here is related to this line the syntax specification:

At least one variant MUST exist whose keys are all equal to the "catch-all" key *.

This is the only place where the the catch-all is required. The errors part of the specification defines the "Missing Fallback Variant" error, but it is not referenced anywhere else, which is probably an error.

I do think that multiple selectors make this issue more complicated. The current requirement guarantees that all messages with a matcher will produce a pattern to format. Relaxing this requirement would make it possible to have syntactically valid messages that produce a runtime error (currently undefined!) in which the matcher does not match any of the variants.

An argument can be made that the syntactical requirement for * keyed variants might hide other problems with the matcher. To use @eemeli's example, the current requirement means MF might not notice anything wrong with:

.match {$exists :boolean}
ture {{ <-- typo }}
*    {{I don't know if I exist}}

Also, I note that there is nothing necessarily wrong with the following message, except it is inelegant and lacks some parsimony:

.match {$exists :boolean}
true  {{I exist}}
false {{I don't exist}}
*     {{Unreachable?}}

@mihnita
Copy link
Collaborator

mihnita commented Jan 20, 2024

I have a real example where it was not possible to cover all options.

It is about French plurals, which changed a few years ago to include many as a case (CLDR 38, end of 2020, not that long ago).

So we had thousands of strings in many products that had only "one" / "other" for French.
And the times of submission they passed validations (and would have passed them with MF2).

When CLDR added "many", things continued to work, because all of them had the mandatory "other"
Yes, maybe the selection was a bit off at times. But that was the same as before.

It took a while to fix things (and there might still be unfixed cases here and there :-)
And was not trivial (forcing the strings to be re-sent, app updated and released, translation memories updated)

This also impacted outside:


If we assume that we can have a comprehensive validation and remove the mandatory fallback,
these kind of changes can result in crashes, instead of the grammar being a bit off.
Or (at best) fallback to English? Still worse than bad grammar.

If we thing about browsers, which get updated pretty often, we would suddenly break applications that did nothing wrong, they passed all validations (at the time when they were translated), and worked for years.
But now all of a sudden break because the selectors changed "under them"
Some of these applications might be abandoned.
Or some might be maintained, but critical, so having them crash for a few days is really bad.

Not updated as often as browsers, but CLDR is also used by Windows, macos / iOS, Android, others.
These kind of changes can potentially break thousands of applications.


I agree with Addison that "might hide other problems with the matcher"
But "unhiding" it with a crash might is probably worse :-)

If people use proper lint (as this issue assumes), then the situation will not be hidden for long.
At some point someone will run a lint, and will see "case foo is missing in your bar match"
But it is not an exception in the face of the end user.

@eemeli
Copy link
Collaborator Author

eemeli commented Jan 21, 2024

So let's not fail selection then for plurals, if an other key is present. To repeat myself from above:

It is however possible to define selectors such as :plural or :boolean for which the presence of some set of keys guarantees that selection will succeed, such as ['other'] for the former and ['true', 'false'] for the latter.

In other words, it is perfectly plausible for e.g. a French plural selector to be given a list of keys ['one', 'other'] and to choose other even if many would be a better fit. In this sense, it could work exactly the same as the MF1 plural selector.

Expanding a list of required keys would need to count as a breaking change, which we probably cannot do, for the reasons @mihnita explains above.

@aphillips
Copy link
Member

One of the previous arguments about this was:

We would prefer not to have to call functions to find out if a message is valid. If we accept this proposal, the only way to know if a set of variants has sufficient closure is for the implementation to call functions or read the registry. Note that "implementation" here can mean tools or even people writing messages. Requiring the * avoids this at the potential cost of requiring the * ... * variant.

I think accepting this change ought to include removing the requirement for the "all fallback" variant from "validity" constraints (putting the onus on authors/tools to avoid "Missing Fallback Variant" errors).

We document briefly why we didn't chose making * optional in this design

I could accept making * optional, on the theory that it makes our syntax less pedantic, but I would prefer to keep this fairly mild guardrail personally: it's easier to developers what to do.


It occurs to me that this is something that could benefit from the beta period. Whichever route we choose now, we should explicitly ask for feedback about the consequences of the choice. If we allow non-* fallbacks, we should ask if people are getting bit by it. If we require * fallbacks, we should ask if people are finding it annoying to be required to write what will seem like an extra variant. Does that make sense?

@aphillips
Copy link
Member

In the 2024-01-29 call we decided:

  • we will reject this in LDML45
  • we will seek feedback on this item during tech preview (hence label)
  • we will add warning/notes to the spec (new issue created)

@aphillips aphillips added Future Deferred for future standardization Seek-Feedback-in-Preview Issue should be something we seek feedback on in the tech preview period out-of-scope? and removed Agenda+ Requested for upcoming teleconference LDML45 LDML45 Release (Tech Preview) Future Deferred for future standardization labels Jan 29, 2024
@eemeli
Copy link
Collaborator Author

eemeli commented Jun 19, 2024

Dropping the out-of-scope? label as that was about the LDML54 release.

@catamorphism
Copy link
Collaborator

This seems much harder to do now that the registry is optional. E.g. in the example, without a registry, there's no way to check that :boolean can only return true or false.

@aphillips
Copy link
Member

I'd go further: it might be impossible because the spec has the matching algorithm adding the * variant outside of the function. The * might be unreachable for a given selector, but it still has to be present.

@eemeli
Copy link
Collaborator Author

eemeli commented Oct 8, 2024

This is still just as possible as always:

If we were to make the current error a runtime Selection Error like "No Variant Selected" instead of a Data Model Error, it would allow for more natural messages [...]

Not requiring the * variant makes it possible for selection to fail, in which case we'd need to resort to {�} fallback formatting

But on the other hand it also allows for messages that have more semantically meaningful variant keys, like true & false instead of true & *, or one & other instead of one & *.

The validation problem we've left outside the scope of the MF2 spec, as we do not define a machine-readable registry, but this doesn't make it impossible, it just requires an implementation and a linter to define separately how they communicate about function interfaces.

@catamorphism
Copy link
Collaborator

I see, yes -- I think what threw me off is the issue title. Without a way (within the scope of the spec) to check if the variant keys are exhaustive, the proposal becomes just "We should not require *."

@aphillips
Copy link
Member

Do we want to re-open this for 46.1 or seek feedback (post 46.1) for 47?

@aphillips aphillips added the Agenda+ Requested for upcoming teleconference label Oct 8, 2024
@eemeli
Copy link
Collaborator Author

eemeli commented Oct 8, 2024

As with #602, this is an issue for which external input would be useful, and for which we could/should explicitly seek feedback.

@aphillips aphillips added resolve-candidate This issue appears to have been answered or resolved, and may be closed soon. and removed Agenda+ Requested for upcoming teleconference labels Oct 18, 2024
@aphillips
Copy link
Member

This was discussed in the 2024-10-14 teleconference. We rejected this proposal for the 46.1 timeframe. We are leaving intact the desire for feedback during the TechPreview period.

@aphillips aphillips added LDML47 LDML 47 Release (Stable) and removed resolve-candidate This issue appears to have been answered or resolved, and may be closed soon. labels Oct 18, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
formatting LDML47 LDML 47 Release (Stable) registry Issue pertains to the function registry Seek-Feedback-in-Preview Issue should be something we seek feedback on in the tech preview period syntax Issues related with MF Syntax
Projects
None yet
Development

No branches or pull requests

6 participants
@aphillips @stasm @catamorphism @eemeli @mihnita and others