Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ECMA-402 v1 legacy constructor semantics compromise #84

Merged
merged 2 commits into from
Feb 2, 2017

Conversation

littledan
Copy link
Member

This patch addresses #57 by allowing certain legacy constructor
patterns to coexist with the new guarantees in ECMA-402 v2, which
allows for a pattern where all internal slots exist from the
beginning of the object's lifetime. The compromise is based on
storing a "real" object inside of a symbol-named property to allow
for object initialization in cases of the
Intl..call(Object.create(Intl.) pattern.
Legacy methods have to forward their calls to this "real" object.

This patch specifies the change for Intl.NumberFormat, but an
analogous change would also be needed for Intl.DateTimeFormat and
Intl.Collator. This version is being sent out for review for feedback
from users and implementors. A sample implementation in V8 can
be found at
https://codereview.chromium.org/1828543007

@@ -346,6 +365,13 @@

<emu-alg>
1. If NewTarget is *undefined*, let _newTarget_ be the active function object, else let _newTarget_ be NewTarget.
1. Let _this_ be the receiver.
1. NOTE: The following step and nested steps implement legacy compatibility semantics
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is the first time I see an arbitrary link in an ecma spec, do we really need to link to the bug in the algo?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just trying to be useful by including meaningful non-normative notes, but maybe we can say people should find this information via git blame instead.

@caridy
Copy link
Contributor

caridy commented Mar 25, 2016

LGTM

@littledan
Copy link
Member Author

Before this is committed, we need feedback on whether this patch ensures compatibility with old Intl.js versions and other uses of ECMA 402. It also needs to be ported to the other two relevant constructors, which I'll do as soon as I hear back about the compatibility of this change.

@littledan
Copy link
Member Author

I tested this patch on a website which uses an old version of Intl.js. I observed that the v2 semantics led to a console message being printed indicating a failure to format, whereas the v1 semantics and those in this patch seemed to load successfully. Does anyone else have feedback before I port these semantics to the other cases?

@littledan
Copy link
Member Author

@tschneidereit @thetalecrafter @bterlson

@vanwagonet
Copy link

Given that the known incompatible library only operated on NumberFormat and DateTimeFormat, could we leave out applying the compatibility fix to Collator?

@allenwb
Copy link
Member

allenwb commented Mar 28, 2016

@thetalecrafter
+1 WRT Collator

@caridy @littledan
Could we add this as an "Annex B" within Ecma-402? It seems like something we can get rid of as Intl.js falls out of usage.

@tschneidereit
Copy link
Member

CC @jswalden as our expert on Intl.

@littledan
Copy link
Member Author

OK, I tested a version which just does NumberFormat and constructor (same link), with code structured so it should be easy to turn off later, and it worked on the broken-in-v2 page. I uploaded a spec version which just applies to those. What do you think?

@littledan
Copy link
Member Author

Re: Annex B. Could we have some kind of emu formatting to put the "historical web compat" parts in-line rather than out-of-line in an appendix (@bterlson @domenic)? I think this would make things easier for implementors and users if they didn't have to look way at the end, given how this is actually a little invasive. I agree that it could be removed after a number of years, but not yet.

By the way, Intl.js has been updated to work with the v2 spec, and I don't know of any serious competitors; I think we're just waiting for old versions to fall out of use.

@jswalden
Copy link
Collaborator

This in principle (modulo one aspect of the change that I'm probably misreading, detail to follow) seems reasonable enough to me. I'm definitely with the people arguing for putting this in some sort of temporary-compatibility holding pattern for later removal. The subclassing tactic Intl used to use is a wart upon the proposal and the spec, that we can and should endeavor to remove, even if it's not an immediately-possible thing.

</p>
<emu-alg>
1. If _nf_ does not have an [[initializedNumberFormat]] internal slot,
1. If ? InstanceofOperator(_nf_, %NumberFormat%),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm probably misreading this, but are you sure this shouldn't be "If not ? InstanceofOperator(nf, %NumberFormat%),"? It seems to me that if nf is a NumberFormat, we should be returning it. Only in the case where nf isn't a NumberFormat, should we look at nf[Intl.[[FallbackSymbol]]].

And a similar change needs to be made, it seems to me, to the Intl.NumberFormat algorithm.

And similar changes would need to be made to the DateTimeFormat patch.

Again, probably misreading this, but I've been staring at it for a good ten minutes and can't parse it as anything other than doing nothing special if the object being acted upon is a non-NumberFormat.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah! Till explained how I was being stupid in misreading this elsewhere. Carry on, this change looks good!

@caridy
Copy link
Contributor

caridy commented Mar 29, 2016

@allenwb, @littledan Let's discuss the course of action for the spec notes and later removal during tomorrow's meeting, we have an agenda item for this.

@littledan
Copy link
Member Author

At TC39 today, we got to consensus on including this proposal, as long as it's marked in the spec as normative-optional and something that we'd hope to remove if/when possible. This marking may be done inline if it makes the resulting spec easier to follow, as long as the affected sections are indeed appropriately marked. (@erights, could you verify this interpretation of the conclusion?) (@bterlson any suggestions for how to mark it as such, or should I cook something up?)

existing Intl objects.
</p>
<emu-alg>
1. If _nf_ does not have an [[initializedDateTImeFormat]] internal slot,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo: initializedDateTImeFormat --> initializedDateTimeFormat

@erights
Copy link

erights commented Mar 30, 2016

(@erights, could you verify this interpretation of the conclusion?)

Yes. Thanks for checking.

@ericf
Copy link
Member

ericf commented Mar 30, 2016

For posterity, the code that depended on the ECMA402 v1 constructor behavior was in the intl-format-cache package (not the Intl.js polyfill package) and was fixed in a SemVer patch release ~4 months ago.

Last week I was able to supply @littledan with a site which is still using an older version of the intl-format-cache package so that he could verify that without this change the site breaks, and when applying this change the site works as expected.

@littledan
Copy link
Member Author

This proposal reached consensus at the March 2016 TC39 meeting.

@littledan
Copy link
Member Author

The remaining change needed it so make it clear that the text here needs to be normative-optional. At TC39, there was consensus that this can be done in-line, by putting a box around the optional parts, rather than out-of-line as in Annex B in ECMA 262, as long a the denotation of normative-optional was the same. @caridy signed up for making this editorial change.

@caridy caridy self-assigned this Apr 7, 2016
@littledan
Copy link
Member Author

@caridy What's the status of this change? Need anything from me?

@caridy
Copy link
Contributor

caridy commented Jun 9, 2016

@littledan I will work with @bterlson on the normative-optional representation, and then we can tackle this one.

@caridy caridy merged commit 8775fdc into tc39:master Feb 2, 2017
@littledan
Copy link
Member Author

Thanks for this!

Copy link
Contributor

@anba anba left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review check list:
[x] Write review comments
[ ] Submit review comments

Total review fail by me because I didn't actually submit the review comments! 😢

</p>
<emu-normative-optional><span class="normative-optional">Normative Optional</span><div class="normative-optional-contents">
<emu-alg>
1. If _dtf_ does not have an [[initializedDateTimeFormat]] internal slot and ? InstanceofOperator(_dtf_, %DateTimeFormat%),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs to check Type(dtf) is Object because dtf could be a primitive when called from resolvedOptions().

So either:

  1. If Type(dtf) is Object and dtf does not have an [[initializedDateTimeFormat]] internal slot and ? InstanceofOperator(dtf, %DateTimeFormat%) is true, then

Or if instanceof should also be applied on primitive values (also needs GetV instead of Get in that case):

  1. If Type(dtf) is not Object or dtf does not have an [[initializedDateTimeFormat]] internal slot, then
    1. If ? InstanceofOperator(dtf, %DateTimeFormat%) is true, then
      1. Let dtf be ? GetV(dtf, Intl.[[FallbackSymbol]]).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went with the first option because I'm not sure about the primitive values. @littledan can you double check this?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed by 7eb2c2b

<emu-normative-optional><span class="normative-optional">Normative Optional</span><div class="normative-optional-contents">
<emu-alg>
1. If _dtf_ does not have an [[initializedDateTimeFormat]] internal slot and ? InstanceofOperator(_dtf_, %DateTimeFormat%),
1. Let _dtf_ be ? RequireObjectCoercible(Get(_dtf_, Intl.[[FallbackSymbol]])).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RequireObjectCoercible is not needed here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed by 7eb2c2b

</emu-alg>
</div></emu-normative-optional>
<emu-alg>
2. If _dtf_ does not have an [[initializedDateTimeFormat]] internal slot,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs type check for object values:

  1. If Type(dtf) is not Object or dtf does not have an [[initializedDateTimeFormat]] internal slot, then

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed by 7eb2c2b

<emu-normative-optional><span class="normative-optional">Normative Optional</span><div class="normative-optional-contents">
<emu-alg>
4. Let _this_ be the *this* value.
1. If NewTarget is *undefined* and ? InstanceofOperator(_this_, %NumberFormat%),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing "then" after comma.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed by 7eb2c2b

<emu-alg>
4. Let _this_ be the *this* value.
1. If NewTarget is *undefined* and ? InstanceofOperator(_this_, %NumberFormat%),
1. Perform ? DefineOwnPropertyOrThrow(_this_, Intl.[[FallbackSymbol]], { [[Value]]: _dateTimeFormat_, [[Writable]]: *false*, [[Enumerable]]: *false*, [[Configurable]]: *false* }).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This also needs a type check, because the this-value could be a primitive value in which case it's not valid to call DefineOwnPropertyOrThrow.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is covered by InstanceofOperator(_this_, %NumberFormat%) in the previous step I think.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not if Intl.NumberFormat[Symbol.hasInstance] was redefined to add a custom instanceof behaviour, like in this test: https://github.com/mozilla/gecko-dev/blob/3ed98bee2a88a4d953d22520e727344bf8e617b8/js/src/tests/Intl/NumberFormat/call.js#L62-L85

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That was all unintentional, but I agree with @caridy's reading.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@anba will something like this be sufficient?

image

or should we just throw a TypeError if this-value is not an object before calling DefineOwnPropertyOrThrow?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@caridy I don't see why that's necessary; I'm fine with this throwing due to InstanceOf.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Honestly, I feel the same @littledan, but @anba seems to be convinced about a potential issue here, and I trust him :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I missed the above comment--yeah, the text you wrote sounds good and (I think) it should guard against it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@anba can you confirm?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both solutions (guard or throwing) work, and I'm fine with either of them. 😄

@@ -12,6 +12,10 @@
The Intl object is not a function object. It does not have a [[Construct]] internal method; it is not possible to use the Intl object as a constructor with the *new* operator. The Intl object does not have a [[Call]] internal method; it is not possible to invoke the Intl object as a function.
</p>

<p>
The Intl object has an internal slot, [[FallbackSymbol]], which is a new %Symbol% in the current realm.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jswalden asked in https://bugzilla.mozilla.org/show_bug.cgi?id=1328386 if [[FallbackSymbol]] should get a [[Description]]?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@littledan can you comment here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would undefined be a reasonable [[Description]] value here? In V8, I used "IntlFallback", but for no good reason. There's no way that you can access this symbol as the property of something else, and I don't see a reason to make one, so "Intl.fallback" or "Symbol.intlFallback" would not be all that compelling in my opinion.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume @jswalden asked about adding a description for debugging purposes, I can try to catch him later on IRC, so he can chime in.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

solved by 44107b6

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Intl object has an internal slot, [[FallbackSymbol]], which is a new %Symbol% in the current realm.

In this case, shouldn't something like %FallbackSymbol% be added to Table 1 and the accesses performed like: Get(%Intl%,%FallbackSymbol)? Basically, internal slots are not identified using symbols. It either needs to be an internal slot named FallBackSymbol or a own property whose key is %FallBackSymbol%. You should be muddling these distinct concepts.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see no problem with putting it as %FallbackSymbol%, but I also don't quite understand the problem with the current phrasing. When I wrote it out with this internal slot notation, I was picturing it as being "in a separate namespace"--it's just an internal slot that is called [[FallbackSymbol]] which contains a symbol. Is the problem that this is a one-off internal slot, rather than a name that's reused over a class of objects?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I misunderstood what you were trying to say (which is the root problem). I thought you were saying that the "new %Symbol%" was the key/name of the internal slot. If the value is a realm specific intrinsic symbol, it probably needs to be added to Table 1 (of this spec). Note that in Ecma-262 we don't have any realm-specific intrinsic symbols so rather than appearing in the corresponding Table 6 (Well known intrinsic objects) intrinsic symbols are listed in Ecma-262 Table 1 (Well known symbols).

We should probably also refer it the symbol as @@FallbackSymbol.

So, the spec statement could be: "The Intl object has an internal slot, [[FallbackSymbol]] whose value is initialized to @@FallbackSymbol

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an interesting point about being realm-specific vs cross-realm. In V8, I ended up implementing it as a cross-realm symbol, actually. @anba What did you do for SpiderMonkey? Do you have an opinion about which way it should be?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What did you do for SpiderMonkey? Do you have an opinion about which way it should be?

It's currently implemented as a realm-specific symbol in SpiderMonkey. Hm, if it isn't a realm-specific symbol, I wonder if we should remove the InstanceofOperator call in the unwrapping operations, so that a cross-realm object initialized as a Intl.DateTimeFormat/NumberFormat object works the same way as a proper cross-realm Intl.DateTimeFormat/NumberFormat object. (Intl.NumberFormat.prototype.resolvedOptions.call(crossRealmNumberFormat) currently works, but Intl.NumberFormat.prototype.resolvedOptions.call(crossRealmObjectInitializedAsNumberFormat) throws a TypeError.)

</p>
<emu-normative-optional><span class="normative-optional">Normative Optional</span><div class="normative-optional-contents">
<emu-alg>
1. If _nf_ does not have an [[initializedNumberFormat]] internal slot and ? InstanceofOperator(_nf_, %NumberFormat%),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same issues here and below like for DateTimeFormat.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed by 7eb2c2b

@caridy
Copy link
Contributor

caridy commented Feb 6, 2017

thanks @anba, I can take care of this.

@erights
Copy link

erights commented Feb 22, 2017

If the kludge serves its purpose either way, I would prefer the realm specific approach as that narrows the visibility of the kludge.

@littledan
Copy link
Member Author

How does realm-specific make the kludge narrower? I don't see how cross-realm vs realm-specific differ in terms of breadth.

@allenwb
Copy link
Member

allenwb commented Feb 23, 2017

It seems likely that the use of a per realm symbol will mean that passing an object that actually uses the fallback may fail with the object is passed between realms. Of course, the use of InstanceOfOperator as part of the prerequisite for setting the fallback also seems problematic in cross-realm situations. Note that the this value (presumably usually Intl) passed to Intl constructors is not necessarily from the same realm as the actual constructor function that was called.

Cross-realm well known symbols are generally used when we want objects that depend upon symbol based hooks to work consistently when they are passed between realms.

It's still not clear to me why a symbol accessed property is needed. As far as I can tell the actual symbol isn't explicitly exposed to userland. Under what circumstance would usernd code need to use the symbol to access the associated property. If seems like an internal slot would be a better place to capture the fallback object.

@littledan
Copy link
Member Author

The purpose of this proposal is to make it so that you can take an object that was initialized with none of these Intl internal slots and then make it act like a DateTimeFormat or NumberFormat instance, without adding an internal slot to an existing object. You can get at this symbol from JS in a way that @anba's tests show:

Object.getOwnPropertySymbols(Intl.NumberFormat.call(Object.create(Intl.NumberFormat.prototype)))[0]

@erights
Copy link

erights commented Feb 23, 2017

The purpose of the proposal is to introduce the minimal kludge we can to allow old-bad-code-that-we-dare-not-break to not break while doing all we can to discourage new code from needlessly relying on the presence of this kludge. If too much new code accidentally comes to rely on the kludge anyway, we'll never be able to kill it. Otherwise, we might. We have several such successes behind us already.

@allenwb
Copy link
Member

allenwb commented Feb 23, 2017

Ah, I missed the application of the constructor to an arbitrary object part.

Clearly, you can get at the symbol that way. Isn't that a problem? It opens the compatibility hack up to various misuses or even attacks. Wouldn't it be better to use a "soft field" to associate the fallback value with an arbitrary object? EG, an specification level cross-realm weak map like registry (of course, left to implementations to decide how to actually represent it). There would be no exposure to userland tampering and no cross-realm issues (except that the instanceof tests are still a problem).

@littledan
Copy link
Member Author

How would this differ in observable semantics from the v1 constructor semantics?

@allenwb
Copy link
Member

allenwb commented Feb 23, 2017

@littledan
I think it is probably the same as the v1 semantics. But I haven't done any sort of real analysis.

V1 assume the capability to add private slots, post allocation, to arbitrary objects. Soft fields via a weak registry provide an equivalent capability. Use of a symbol-keyed property provide a similar capability, but exposes other observable behaviors.

@littledan
Copy link
Member Author

This would be readily implementable in V8. We'd just use our private symbol functionality (which is inadmissible to spec since Proxies would see private symbol access, but would not see WeakMap access), rather than an ordinary symbol.

If I understand correctly (please correct me if I'm wrong), basically you're suggesting that we go back to the v1 semantics, but phrased differently, and with the scope restrictions we worked out above (only for NumberFormat and DateTimeFormat, and only if the thing you're constructing instanceof that constructor, plus the object checks recommended by @anba).

I guess the only downside I see is debugging. With a public symbol, you can see how it's represented, but with a WeakMap, it's secret internal state. On the other hand, there's lots of other stuff that's in internal state; this wouldn't be particularly different.

Given that this is a bug fix, it might not be too late to squeeze it into ES2017. What do people think? I don't have a particular objection to it.

@allenwb
Copy link
Member

allenwb commented Feb 27, 2017

If I understand correctly (please correct me if I'm wrong), basically you're suggesting that we go back to the v1 semantics, but phrased differently, and with the scope restrictions we worked out above (only for NumberFormat and DateTimeFormat, and only if the thing you're constructing instanceof that constructor, plus the object checks recommended by @anba).

Basically yes. The V1 semantics can be implemented on any platform that hosts some sort of soft field capability and intl methods that use the fallback protocol. What we want too make normatively optional is the ability to setup a fallback for ordinary, non-formatter objects.

I don't think we should be using instanceof to determine if we need to setup a fallback or in unwrapping because of the cross realm issues relating to instanceof. Instead of instanceof, internal slot existence tests should suffice for determining whether a constructor needs to setup a fallback object or wither unwrapping needs to be done.

@littledan
Copy link
Member Author

I don't see how internal slot existence checks could work. We need to be able to "instantiate" an object that was created like Object.create(Intl.NumberFormat.prototype).

@allenwb
Copy link
Member

allenwb commented Feb 27, 2017

don't see how internal slot existence checks could work. We need to be able to "instantiate" an object that was created like Object.create(Intl.NumberFormat.prototype).

Allowed by using Intl.NumberFormat to initialize the object, right?

NumberFormat checks if its this value has one or more of the internal slots of a NumberFormat instance. If not, it instantiates a fallback NumberFormat object and associates it with the this value using some implementation specific soft field mechanism.

@littledan
Copy link
Member Author

The purpose of the instanceof check is to narrow the set of things that we'll go back and instantiate. The narrowing isn't driven by implementation or compatibility concerns, but by design goals from TC39 normative-optional practice to make compatibility fixups as narrow as possible. We could also go to v1 semantics, which didn't have this narrowing.

The UnwrapNumberFormat operation already does not get the fallback object for objects which are actual NumberFormat instances with the internal slot; the constructor does not do this check, but I would be fine with adding it for further narrowing, but it would only change behavior in extremely obscure test cases like:

  Intl.NumberFormat.call(Intl.NumberFormat())

What the instanceof check does is make sure that we don't instantiate calls like this as a NumberFormat pseudo-instance:

  Intl.NumberFormat.call({})

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.