-
Notifications
You must be signed in to change notification settings - Fork 62
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Move "global" internal slots to MediaDevices, and fix "relevant object" links #939
Conversation
In cleaning this up, I came across a couple of realities that I've tried to handle better with this PR.
|
A UA can stop a source and all tracks will get ended.
I do not see why it would not work, getUserMedia creates a source that gets associated to ONE global object. |
My understanding is a source "can be a physical webcam", making it a singleton in the entire user agent. gUM doesn't "create" it, as much as it: "Let[s] grantedDevice be finalCandidate's source device. ... Let track be the result of creating a MediaStreamTrack with grantedDevice", which initializes track.[[Source]] to grantedDevice. E.g. in this iframe example (Once I hit Allow and the
But stop all sources isn't handling that correctly, as written. It says: "For each MediaStreamTrack object track whose relevant global object is globalObject, set track's [[ReadyState]] to "ended". ... For each source in globalObject's [[mediaStreamTrackSources]], stop source." This fails to:
In the PR this is fixed like this. |
That sounds like ends for any reason other than the stop() method being invoked, not stop all sources of a global object. Unplugging a camera should stop ONE source, ending all tracks of that source regardless of how many global objects it is associated with. |
Ah, this is where we are not aligned. AIUI, the main (only) requirement is that cloned tracks have to share the same source.
We should dig into this, stopping the sources happen when the document gets unloaded, it is not clear whether queueing a task at this time will ensure the task will be executed.
It works as follows:
See above, this is handled.
As per above, I am not sure why this would be needed. |
As you say, tracks are muted, not sources, so I don't immediately see what's not possible. But let's dig in: § 4.3.1.2 Media Flow begins with "Muted refers to the input to the MediaStreamTrack.", BUT this ultimately just describes the track's muted state, and when the UA should update it, which it accomplishes (almost!) without mentioning "source". The section has just two mentions of "source" near the end:
The second one does mention "the muted state reflects the state of the source", which seems a tacit admission that the source has some (unnamed) state. But I think s/source/input/ here would have been more self-consistent with the rest of this section. I think what happened here was that all the user agent examples we could think of at the time were global to the user agent ("the user pushing a physical mute button on the microphone, the user closing a laptop lid with an embedded camera, the user toggling a control in the operating system"), and we wanted to express the invariant that creation of new tracks shouldn't sidestep those user-directed privacy measures, so we had no choice but to tacitly refer to some state outside of the track.¹ I think the challenge here is how to express this invariant while at the same time allowing a user agent to mute all tracks in some contexts and not others. — But I don't think defining multiple instances of source is the answer, because having an instance live inside a single (per realm) global object seems too limiting, because in my view, the idea that gUM for the same camera from a page and from an iframe in that page (see my example above) refers to distinct source objects seems confusing for little benefit. At least in Firefox, privacy indicators act as if this is the same source (yes I gave up fixing that section in the spec).
Transferred tracks also share the same [[Source]] of their detached counterpart. Such tracks may be transferred not just to workers but to other documents, in which case their relevant global object (which will be in the destination realm) won't have it in its [[mediaStreamTrackSources]]. This, iframes, and the way privacy indicators have been implemented in all browsers, I think suggest the minimal granularity is a page. E.g. if an iframe calls gUM, transfers the resulting video track up to the document containing it, and then the iframe navigates, does the the video track end? If so, why should it? The permissions were delegated to the iframe in the first place, so why can capture not continue? So perhaps the source should live in the top-level document? Or maybe we can write algorithms that don't depend on where sources live?
Good point. I can remove that.
My problem with this and the current text is it only works correctly if we ASSUME sources are different per global object, which at best isn't obvious and at worst seems to lack support in the spec, which means when I read this (thinking sources are singletons) it reads to me as doing the WRONG thing (stopping other concurrent sources). In contrast, I hope my PR does not rely on that assumption (since it was written without it). 1. There's also the more recent: "[[Muted]], initialized to true if source is muted", which I think erroneously refers to the track muted state. |
Yes, this is the current expected behavior, which aligns with what JS can do by transferring manually VideoFrames. |
I think the intent of the spec is clear, as we discussed this when designing transferred streams. |
I see my mistake now. I misread what the original text was trying to do (I read it as inadvertently stopping all tracks of a physical source in the user agent). But in fixing it, the PR fails to preserve the original intent, which I failed to interpret and we wish to to preserve, which is effectively to end all tracks "tied" to the I'll try to update my PR to not break that. To be fair, I believe the existing language only accomplishes that under ONE interpretation of the scope of source (one per global object), NOT the other (source is a singleton of the user agent), but I believe we can express this while being agnostic of source scope.
This is all I'm trying to do. :) |
The assumption is more one source per getUserMedia call (well per track created by getUserMedia). |
Where in the spec is such an assumption relied on? Rather, the source stopped algorithm relies on a 1-1 between a source and device:
If JS calls gUM twice and get two video tracks with the same deviceId, and then I end one of them, a per-track source interpretation would turn off the privacy indicator for this realm even though the other video track is still live. The [[devicesLiveMap]] is on the global object (which is 1-1 with That seems like one too many 1-to-many relationships. |
I've simplified this PR to focus on the move of internal slots, which should hopefully be editorial now. @youennf PTAL. |
@@ -693,7 +692,9 @@ <h2>{{MediaStreamTrack}}</h2> | |||
<a data-cite="!HTML/#concept-relevant-global">relevant global object</a> is <var>globalObject</var>, | |||
set <var>track</var>'s {{MediaStreamTrack/[[ReadyState]]}} to | |||
{{MediaStreamTrackState/"ended"}}.</p></li> | |||
<li><p>For each <var>source</var> in <var>globalObject</var>'s {{MediaDevices/[[mediaStreamTrackSources]]}}, | |||
<li><p>If <var>globalObject</var> is a {{Window}}, then for each <var>source</var> in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could replace the {{Window}} check here by a check on whether there is an associated MediaDevices object, which could be stored in a specific internals lot.
That way, we do not need to come back here if we ever expose things in workers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, I think we should follow HTML here, which has 15 hits for "is a Window", e.g. "If global is a Window object, then return global's associated Document." — I'd similarly avoid "If X has internal slot Y".
Preserving this invariant allows readers to infer types of passed-in arguments in algorithms (for which there often are no other nearby clues) from their associated properties and internal slots.
@@ -2658,6 +2657,9 @@ <h2>Enumerating Local Media Devices</h2> | |||
camera or a headset).</p> | |||
<section> | |||
<h3>`Navigator` Interface Extensions</h3> | |||
<p>Each {{Window}} has an <dfn>associated `MediaDevices`</dfn>, which is a {{MediaDevices}} object. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we use an internal slot?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
HTML seems to prefer this format to describe inherent (constant) 1-1 relationships, which I cribbed from:
Each Window has an associated UserActivation, which is a UserActivation object. Upon creation of the Window object, its associated UserActivation must be set to a new UserActivation object created in the Window object's relevant realm.
... partial interface Navigator { [SameObject] readonly attribute UserActivation userActivation; };
While HTML does use internal slots in places, its use seems limited to variables repeatedly modified by algorithms (beyond initialization). E.g. I don't see "set X's associated Y to newValue" where it replaces an old value.¹
Since we never change the mediaDevices
object on navigator
, I think the "associated" language gives readers one less variable to worry about (like const
over let
in JS), so I think I prefer that.
1. This doesn't appear to be a 100% rule, as I do see things like: "each Window W has a last activation timestamp", where it seems to rely on language to differentiate whether something can change or not (e.g. last vs. associated). But this is tipping in the other direction, away from using the [[]]
format.
to the [=User Agent=].</p> | ||
</li> | ||
<li> | ||
<p>Create one internal slot: <dfn data-dfn-for="MediaDevices">[[\canExposeCameraInfo]]</dfn>, initialized | ||
<p><dfn data-dfn-for="MediaDevices">[[\canExposeCameraInfo]]</dfn>, initialized |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is correct but I am not sure we have tests for this.
Say one same origin iframe has access to camera and microphones, what about its same origin parent for instance?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR merely moves internal slots from window
(one per realm) to window.navigator.mediaDevices
(one per realm) which should be behavior neutral in all JS-observable ways, so blocking on tests to merge this cleanup would be unfortunate I think.
That said, more tests in this area would be great! Happy to try to add some later, maybe in bug 1528042.
Say one same origin iframe has access to camera and microphones, what about its same origin parent for instance?
Two same-origin iframes/parents are different realms. So both before and after this PR, enumerateDevices exposure in the parent would not be enabled by getUserMedia calls in iframes, or vice versa, as the spec is written.
If this is something we'd want to enable, we should open a new issue to discuss. On the one hand, it seems like an oversight in that it's stricter than gUM permissions which are inherited (in some browsers). OTOH, I'm not super excited to enable something for consistency whose primary use case would likely be tracking.
The PR now should be true to its title and only move internal slots that were on the global object |
Fixes #938. Using hide whitespace when reviewing is recommended.
Preview | Diff