Closed Bug 1212996 Opened 9 years ago Closed 4 years ago

Grant permission for all devices of a class

Categories

(Firefox :: Site Permissions, defect)

42 Branch
defect

Tracking

()

RESOLVED WONTFIX

People

(Reporter: mathieu.hofman.dev+mozilla, Unassigned)

References

Details

User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/46.0.2490.42 Safari/537.36

Steps to reproduce:

User visits conferencing / media web app:
- App calls getUserMedia({audio: true, video: true}) to get access to mic and webcam
- User accepts, but does not select "remember choice"

Web app offers custom UI to switch devices:
- App uses enumerateDevices() to get the list of alternative mics and webcams and renders some custom device switching UI
- User selects a different device: mic, webcam, or both
- App calls getUserMedia({audio: {deviceId: {exact: "audioDeviceId"}}, video: {deviceId: {exact: "videoDeviceId"}}})
- User accepts permissions again, and verifies that the selected devices are the right ones
- User confirms selection in the app and effectively switches to the new devices for use in the conference

Actual results:

Firefox shows the permission dialog again for the 2nd getUserMedia call, causing a sort of "double confirmation" from a user's point of view. Even worse, if the app's device switching feature includes a preview type flow (webcam preview in a video element, mic volume through WebAudio metering), it really becomes a 3 click flow: select new device in app UI, allow permission dialog, accept changes after verifying preview. If the selected device wasn't the right one, the user would have to select a new device and accept permission every time until they select the right one.

Expected results:

There should be a way to call getUserMedia so that the app can be granted permissions for all devices of the given class. A second call to getUserMedia would then see that the app still has active local MediaStreams for devices of the given class and allow the call to succeed without a new permission prompt.

Switching devices is particularly useful since most systems have multiple audio inputs (microphone, aux-in, usb headset, etc.). Some desktop systems in "conference room" type computers have multiple webcams too. Even regular desktop systems sometimes have multiple webcams (one hardware one, one virtual one from an "effects" software).

I'm filing this bug after a productive exchange in comments of Bug 901616. Please read comments on that bug for context.

From what I understand, it is accepted that the current permission prompt that includes device selection is too complicated for users (Bug 1004392). Also it looks like implementation on mobile devices is different from desktop and Firefox already prompts the user for devices class access if the "exact" constraint is used (attachment in comment #9 on bug 901616).

I realize that the current "device selection" type permission prompt has advantages for web developers not willing or able to implement their own selection UI. However, applications that wish to take advantage of the enumerateDevices feature to implement their own device selection should be able to do so without the suboptimal user experience of intermediate permission prompts when switching.

The solution I'm suggesting is to treat the usage of the deviceId constraint (with or without exact modifiers, TBD) in a getUserMedia call as an indicator that the application knows how to list and select devices itself, and prompt the user for generic access to all devices of the requested class. Subsequent getUserMedia calls wouldn't require the user to approve permissions again.

One issue I see with this is that the application would probably take the first device of the list and make the first request with that deviceId, since showing a UI to the user without full device names is not very useful. This can be a small problem if the first device of the list is not the user's default / preferred device, thus requiring the user to switch devices immediately. I can personally live with that, but maybe the first device returned by enumerateDevices should be the user's default / preferred device?

Another alternative is to never ask the user which device to share in the permission prompt, grant access to all devices of the requested class by default, and have the browser implement a "default / preferred device" selection in the browser settings.
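For readers following the steps to reproduce, here is a minimal sketch of the app-side switching flow. The helper names (buildExactConstraints, switchDevices) are hypothetical and not taken from the report, and the mediaDevices parameter stands in for navigator.mediaDevices so the flow can be exercised outside a browser.

```javascript
// Build getUserMedia constraints that pin each requested track to one device.
// Pass null/undefined for a kind to leave it out of the request.
function buildExactConstraints(audioDeviceId, videoDeviceId) {
  const constraints = {};
  if (audioDeviceId) {
    constraints.audio = { deviceId: { exact: audioDeviceId } };
  }
  if (videoDeviceId) {
    constraints.video = { deviceId: { exact: videoDeviceId } };
  }
  return constraints;
}

// Hypothetical helper: acquire the newly selected devices, then release the
// old ones. `mediaDevices` is normally navigator.mediaDevices.
async function switchDevices(mediaDevices, currentStream, audioDeviceId, videoDeviceId) {
  const constraints = buildExactConstraints(audioDeviceId, videoDeviceId);
  // With the behaviour described in this bug, this second call prompts again.
  const newStream = await mediaDevices.getUserMedia(constraints);
  // Stop the old tracks only after the new stream was obtained, so a rejected
  // prompt leaves the current stream untouched.
  if (currentStream) {
    currentStream.getTracks().forEach((track) => track.stop());
  }
  return newStream;
}
```

In a page this would be called as switchDevices(navigator.mediaDevices, oldStream, micId, camId), with the ids taken from enumerateDevices().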
Component: Untriaged → Device Permissions
(In reply to mathieu.hofman from comment #0)
> Also it looks like implementation on mobile devices is different from desktop and Firefox
> already prompts the user for devices class access if the "exact" constraint is used
> (attachment 8671441 [details] in bug 901616 comment #9)
Correction: Android also only grants a single device, it just leaves out the redundant selector when constraints narrow choices to 1 (arguably it should still inform users which camera they're granting). So this problem exists on Android too.
> Subsequent getUserMedia calls wouldn't require the user to approve permissions again.
I assume here you mean within the current session only.
(In reply to Jan-Ivar Bruaroey [:jib] from comment #1)
> (In reply to mathieu.hofman from comment #0)
> Correction: Android also only grants a single device, it just leaves out the
> redundant selector when constraints narrow choices to 1 (arguably it should
> still inform users which camera they're granting). So this problem exists on
> Android too.
Thanks for the correction. I had based this observation solely on the wording of the dialog, not checking the actual behavior. My bad!
> > Subsequent getUserMedia calls wouldn't require the user to approve permissions again.
>
> I assume here you mean within the current session only.
Correct, this would be until all tracks of devices of that class have been stopped. If there isn't currently any active device for that class, even if there had been earlier in the same session, I would expect permissions to be asked for (again). Unless the user chose to "Always allow" of course.
I base this behavior on the "Best Practice 2: Stored Permissions" paragraph of the spec, section 10.6 "Implementation Suggestions":
> When permission is requested for a device, the User Agent may choose to store
> that permission, if granted, for later use by the same origin, so that the user
> does not need to grant permission again at a later time. Such storing must only
> be done when the page is secure (served over HTTPS and having no mixed content).
> It is a User Agent choice whether it offers functionality to store permission to
> each device separately, all devices of a given class, or all devices; the choice
> needs to be apparent to the user, and permission must have been granted for the
> entire set whose permission is being stored, e.g., to store permission to use
> all cameras the user must have given permission to use all cameras and not just
> one.
> When permission is not stored, permission should last only until such time as all
> MediaStreamTracks sourced from that device have been stopped.
This is the only section mentioning that a User Agent might want to prompt the user for permissions to use all devices (or all devices of a certain class). It is in the context of persisted permissions, but I don't see why the behavior of remembering the permission for all devices, if asked clearly that way, couldn't be applied to non-persisted permissions until all tracks for any devices (of that class) have been stopped. Maybe we should ask for clarification from the Media Capture Task Force?
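The condition "until all tracks of devices of that class have been stopped" can be illustrated with a small check. This is only a sketch from the page's point of view: hasLiveTrack is a hypothetical helper name, and it relies solely on the standard MediaStreamTrack kind and readyState properties, not on how any browser tracks grants internally.

```javascript
// Returns true if any track of the given kind ("audio" or "video") is still
// live in the supplied streams. Under the proposal, a class-wide grant would
// last as long as this stays true for that class.
function hasLiveTrack(streams, kind) {
  return streams.some((stream) =>
    stream.getTracks().some(
      (track) => track.kind === kind && track.readyState === "live"
    )
  );
}
```

Once every audio track has ended (readyState "ended"), hasLiveTrack(streams, "audio") becomes false, which is the point where a non-stored grant would expire under the quoted spec text.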
(In reply to mathieu.hofman from comment #2)
> this would be until all tracks of devices of that class have been
> stopped.
Hardware limitations on most phones today still limit us from opening both the front and back camera at the same time unfortunately, so an overlap strategy seems problematic. It would probably need to be a timer instead, e.g. reacquiring within 200 ms of close or something to skip the prompt. This is largely up to UAs anyways.
> I base this behavior on "Best Practice 2: Stored Permissions" paragraph of
> the spec, section 10.6 "Implementation Suggestions" :
The only requirement I find in that section is that this might need to be https only.
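To make the timer idea concrete, here is a toy model of a per-class grant with a grace period. It is purely illustrative and is not Firefox code: the ClassGrant class, its method names, and the way timestamps are passed in are all assumptions; only the 200 ms figure comes from the comment above.

```javascript
// Toy model (hypothetical): a grant for one device class that survives for
// `graceMs` after the last live track of that class has stopped.
class ClassGrant {
  constructor(graceMs = 200) {
    this.graceMs = graceMs;
    this.granted = false;
    this.liveTracks = 0;
    this.lastStoppedAt = null; // time at which the live-track count hit zero
  }

  // Called when a track of this class starts (after a prompt or a silent reuse).
  onTrackStarted() {
    this.granted = true;
    this.liveTracks += 1;
    this.lastStoppedAt = null;
  }

  // Called when a track of this class stops; `now` is a timestamp in ms.
  onTrackStopped(now) {
    this.liveTracks = Math.max(0, this.liveTracks - 1);
    if (this.liveTracks === 0) {
      this.lastStoppedAt = now;
    }
  }

  // Would a new request for this class need a prompt at time `now`?
  needsPrompt(now) {
    if (!this.granted) return true;
    if (this.liveTracks > 0) return false;
    return this.lastStoppedAt === null || now - this.lastStoppedAt > this.graceMs;
  }
}
```

The grace period is what lets a phone close the front camera and reopen the back one without a second prompt, even though both cannot be open at once.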
(In reply to Jan-Ivar Bruaroey [:jib] from comment #3)
> Hardware limitations on most phones today still limit us from opening both
> the front and back camera at the same time unfortunately, so an overlap
> strategy seems problematic.
>
> It would probably need to be a timer instead, e.g. reacquiring within 200 ms
> of close or something to skip the prompt. This is largely up to UAs anyways.
Interesting, I hadn't thought of this limitation. The application would have to handle a different flow for such mobile devices. I'm actually wondering how that would work when the stream is tied to a PeerConnection... Anyway, a timer would work for me, but I'm wondering if that would actually be allowed by the spec as written.
> > I base this behavior on "Best Practice 2: Stored Permissions" paragraph of
> > the spec, section 10.6 "Implementation Suggestions" :
>
> The only requirement I find in that section is that this might need to be
> https only.
I quoted this paragraph because it mentions the fact that a User Agent might offer a choice to grant permission to all devices of a given class, and gives guidelines on how to prompt the user in that case. The rest of the document usually refers to permission for a specific device. It was to point out that the use case of granting blanket permission to all devices of the same class is at least acknowledged by the spec.
(In reply to mathieu.hofman from comment #4)
> Anyway, a timer would work for me, but I'm wondering if that would actually
> be allowed by the spec as written.
I think you quoted the right paragraph. The quote makes a distinction between "stored" and "not stored" permissions, and the timer would not meet the stricter requirements for "not stored" permissions, which means it is considered "storing". All that means, I think, is that it would have to require https.
I'm ok with this being considered a "stored" permission and being available only over HTTPS. To keep it in the spirit of the spec, I would definitely time this out if the user didn't grant permanent permission.
I hope I was of help, but this now needs attention from someone in UX to progress.
(In reply to Jan-Ivar Bruaroey [:jib] from comment #7)
> I hope I was of help, but this now needs attention from someone in UX to
> progress.
Is there a specific question that UX needs to answer?
I don't think so. If you have links to related bugs that are more active then that might help.
I don't have links to a good bug for this. There are several distinct bugs all related to the fact that granting permission and selecting the device are currently associated in the UI in suboptimal ways. Eg. bug 1142123, bug 1145525, bug 1129254, bug 905696, bug 896876. The meta bug 880312 may be more interesting, but it's focused on the user switching the device from the chrome UI, not from the web app.
Thanks, that's helpful. Mathieu, you may also want to file a separate issue for android, which I understand is a different team and code.
So what are the next steps for this change to be considered? I've asked on the meta bug to think about this use case, but that bug seems focused on the in-browser switching.
From what I understand this issue has multiple components:
- Security: is allowing access to all devices of a class acceptable?
- UI: We need to change the messaging to let the user know he's allowing access to all cameras / mics, not just a specific one. There might be more complicated UI if the browser wants to offer the opportunity to the user to override this and share a specific device instead.
- Actual implementation of the "temporarily stored" permission.
> Security: is allowing access to all devices of a class acceptable? From my perspective, the answer is no. One of the primary advantages of non-persist permission is that you don't have any collateral grant. It's not obvious that a persistent permission isn't limited to the device you select. What different devices see might be very different, and we need to offer users some control. That we couple that with the persist/not choice is perhaps unfortunate, but it's a reasonable choice given the narrow scope we have for user interaction. In general, this problem is one that arises most often from a poor initial choice. We can improve that on several fronts: better labels, clearer selection UX (with previews perhaps) and some spec changes. I have some ideas on how apps might be able to offer in-app selection (I believe that I created a PR on gUM for this purpose even).
(In reply to Martin Thomson [:mt:] from comment #13)
> > Security: is allowing access to all devices of a class acceptable?
>
> From my perspective, the answer is no. One of the primary advantages of
> non-persist permission is that you don't have any collateral grant. It's
> not obvious that a persistent permission isn't limited to the device you
> select.
Well, the only reason this is technically a "persistent" permission is that we need to remember it for a short amount of time after all MediaStreamTracks for devices of the same class are no longer active, to work around hardware limitations in camera switching on mobile devices. Really what I'm suggesting is in principle a simple permission grant for a device class instead of a specific device. This behavior is both envisioned by the spec, and implemented this way by other User Agents like Chrome.
> What different devices see might be very different, and we need to offer
> users some control.
Agreed, and I'm not saying we should remove user control. I'm saying we should find a way to:
1) Let the application ask for all devices of a certain class
2) Clearly communicate that to the user when asking for permission
3) Allow the user to override the choice by allowing him to only allow the device specifically requested.
Btw, device switching is particularly useful for mics, and less so for webcams.
> In general, this problem is one that arises most often from a poor initial
> choice. We can improve that on several fronts: better labels, clearer
> selection UX (with previews perhaps) and some spec changes. I have some
> ideas on how apps might be able to offer in-app selection (I believe that I
> created a PR on gUM for this purpose even).
Realistically, the Media Capture and Streams spec has been in last call for a few months and any changes wouldn't make it in now. As currently implemented by Firefox, the enumerateDevices API in that spec is pretty much useless for the reasons I described in this bug's description, namely that the permission prompt creates a subpar user experience.
(In reply to Martin Thomson [:mt:] from comment #13)
> > Security: is allowing access to all devices of a class acceptable?
>
> From my perspective, the answer is no. One of the primary advantages of
> non-persist permission is that you don't have any collateral grant. It's
> not obvious that a persistent permission isn't limited to the device you
> select.
...but the reality is Chrome already does a "collateral grant." Have there been any documented abuses of this or significant user concerns raised? Is a user who's unconcerned with microphone A but concerned with microphone B even going to grant access in the first place? I think if a user is in any way paranoid about granting camera/mic access, they're going to deny the initial request outright.
> What different devices see might be very different, and we need to offer
> users some control. That we couple that with the persist/not choice is
> perhaps unfortunate, but it's a reasonable choice given the narrow scope we
> have for user interaction.
I don't agree; I think that already exists sufficiently in the form of the initial request.
My $0.02; we've been heavily using WebRTC via Chrome, and are trying to add Firefox support to our application, but the lack of reasonable permissions and device enumeration/selection is making it very difficult to provide our users with a consistent experience.
I too am a little discouraged by differences in the way browsers present this UX. But I have come to appreciate that this isn't going to change: it's the last point of market differentiation we have between browsers. Fundamentally, Chrome and Firefox take different postures when it comes to user interaction. However, I think that this bug is completely the wrong approach. It doesn't solve the problem, it just papers over the symptoms. A solution would find some way to ensure that the user is able to pick the right device first time. I know that constraints aren't sufficient for that.
(In reply to Martin Thomson [:mt:] from comment #16)
> However, I think that this bug is completely the wrong approach. It doesn't
> solve the problem, it just papers over the symptoms. A solution would find
> some way to ensure that the user is able to pick the right device first
> time. I know that constraints aren't sufficient for that.
I'm confused; how is the way Chrome currently behaves "the wrong approach?" I believe that's the proposal being made (only one gUM allow/deny prompt per page load), and frankly it seems totally reasonable to me.
The suggestion that Mozilla might ship Chrome instead of Firefox is one that is often made. No, the point I was making is that the fundamental problem is that users often pick (or are led to pick) the wrong device. Providing ways to ensure that mistake doesn't happen is more productive than trying to impose a consistent user experience where that consistency is unwanted.
Let's restate the problem: web applications want to provide a user experience that allows users to easily select the right device. It's possible to do this in Chrome using a combination of enumerateDevices and multiple gUM calls. It is not currently possible to provide a seamless experience in Firefox.
The way I see it, there are 2 general approaches to this problem: improve the selection experience inside the Firefox UI so that the web app can fully rely on that, or give the web app the tools to implement a seamless selection itself.
This issue is suggesting a simple "patch" for the current state of the implementation which would allow web applications to implement their own seamless device selection. I believe it doesn't require complex changes to the Firefox UI and is within the requirements of the spec. However, from what I understand, Mozilla is concerned about the security of its users and believes the suggested fix relaxes protections too much.
@mt, I believe you had a suggestion for a Spec change that would allow apps to implement seamless device selection without sacrificing user security. From what I remember, this was based around the idea of giving a preview of the device (live meter image for mic, thumbnail preview for webcam) protected by CORS so that the app cannot capture or transmit it. The app would be able to get access to this without a permission prompt (a sort of "previewURL" on the DeviceInfo I suppose?). I would be ok with such a solution but it would require a Spec extension, which takes time. Is there any existing issue / bug that can be referenced for this approach?
Then to come back to approach number 1, if Mozilla feels it wants to control the whole selection experience within the browser UI, can we please get that prioritized a bit higher? The stated problem is going to be a big pain point very very soon. What is the relevant bug to follow that UX work?
Yes, I have a plan that would allow for in-app previews. As far as UX goes, short of getting a patch from an enthusiastic contributor, I don't know how this could go faster.
(In reply to Martin Thomson [:mt:] from comment #18)
> The suggestion that Mozilla might ship Chrome instead of Firefox is
> one that is often made.
I'm definitely not making this suggestion.
> No, the point I was making is that the fundamental problem is that users
> often pick (or are led to pick) the wrong device. Providing ways to ensure
> that mistake doesn't happen is more productive than trying to impose a
> consistent user experience where that consistency is unwanted.
If your angle is "maybe if users always pick the right device, we can avoid device enumeration/selection entirely in the api," then I can't agree with that. For a serious video conferencing application (which is what WebRTC is all about), user control over mics, cameras and output devices is essential.
I think the request here is simple: do not show the gUM allow/deny prompt a second time in a page if it's already been allowed. Firefox can still be more restrictive than Chrome, too (frankly, Chrome is too permissive). Chrome currently interprets HTTPS + an "allow" user response as permanent. The only request here is to show the allow/deny dialog at most once per page load.
I know this is an old bug, but this continues to be a big problem for our Firefox users. The biggest issue is that users can have a very difficult time configuring audio and video devices. I have code that tries to enumerate over camera/microphone combinations and find a good combination of both (ideally I'd gUM for audio and video media streams individually, but that's a separate discussion), using web audio to validate the mic and a canvas for the video. In Firefox... this is nearly impossible to do in a way that's user friendly.
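As an illustration of the kind of enumeration described above, here is a minimal sketch that lists every microphone/camera pairing from an enumerateDevices() result. The function name inputCombinations and the shape of the returned objects are assumptions, not the commenter's actual code; the validation step (web audio for the mic, canvas for the video) is left out.

```javascript
// Given the array resolved by enumerateDevices(), return every pairing of one
// audio input with one video input. Output devices are ignored.
function inputCombinations(devices) {
  const mics = devices.filter((d) => d.kind === "audioinput");
  const cams = devices.filter((d) => d.kind === "videoinput");
  const combos = [];
  for (const mic of mics) {
    for (const cam of cams) {
      combos.push({ audioDeviceId: mic.deviceId, videoDeviceId: cam.deviceId });
    }
  }
  return combos;
}
```

Each pairing then needs its own getUserMedia call to be tested, which is where the repeated prompts discussed in this bug come in: a system with 3 mics and 2 cameras yields 6 pairings and potentially 6 prompts.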
jib/florian - we should resolve this in some manner, given we have a plan for where we want to end up with respect to permission UI (whether that's wontfix, dup, or perhaps rename and prioritize).
Flags: needinfo?(jib)
Flags: needinfo?(florian)
I'm not aware of any recent plan around permission UI (except for the 'mute' case).
Flags: needinfo?(florian)
I think we should re-prompt on switching devices, but there could be opportunities to improve this situation (don't re-prompt for the same device and maybe a smarter permission prompt) if we get the backend API changes from bug 1066082.
I don't see anything warranting a change in priority here. The problem is still real, and occurs naturally when a website UI built on the assumption of ubiquitous permission meets a browser with per-use-per-device grants by default. But the workaround is trivial and the number of people affected low. Workaround: check the "Remember this decision" checkbox, and it works like Chrome. On desktop, most people have 1 camera and 1 mic, or less. On mobile, people have front and rear cameras, and which one to use should be obvious from context. So the remaining challenge is making configuration of multi-mic and multi-camera desktop systems for privacy-sensitive users less annoying. Relaxing the permission model for this group seems counter-intuitive.
Flags: needinfo?(jib)
I respectfully disagree. The problem occurs when a website has UI that tries to help a user properly configure their microphone and camera; you seem to want to blame my implementation of the microphone/camera configuration code; I would posit you have never been in the position of having to build this UI in a cross-browser manner, and to deal with one browser that has nonsensical permissions. Is there _any_ good reason to have "per-use-per-device" grants? What _is_ the argument for that model? Note that Firefox doesn't even need to have permissions as lax as Chrome to make life dramatically simpler for both users and developers.
As for the workaround, it's questionable at best because it relies on the website clearly messaging the need to do this and the user actually performing it. And although the number of users affected is "low" (no arguments here), the reality is it frequently results in a failed video call because it makes hardware configuration difficult. I say this as someone who has written an app used by tens of thousands of Firefox users, monthly: this is a serious problem for developers and users. We have serious issues with both MediaRecorder and WebRTC usage, and properly guiding our users through camera/microphone/speaker configuration.

I respectfully disagree.

Sorry but this makes no sense. If a user gives permission to access camera, enumerateDevices doesn't return the available camera labels.

Is there any way to have enumerateDevices() return labels without "remember this decision" checked?

Thanks.

(In reply to aetheon from comment #29)

If a user gives permission to access camera, enumerateDevices doesn't return the available camera labels.

Is there any way to have enumerateDevices() return labels without "remember this decision" checked?

Wfm.

In fact, as a result of having undergone TAG privacy review over the last year, permission is insufficient. The spec now requires active capture to see labels. See bug 1528042 and crbug 1101860.

tks Jan-Ivar, I confirm the given example works. I understand the privacy reasons but this behaviour is a pain for developers :)
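The label behaviour discussed in the preceding comments can be handled defensively on the app side. The sketch below is an assumption-laden illustration, not the example linked in the thread: describeDevices and listDevicesWithLabels are hypothetical names, and mediaDevices stands in for navigator.mediaDevices. It enumerates while a capture is active, and falls back to a generic name whenever a label comes back empty.

```javascript
// Replace empty labels with a generic per-kind name such as "videoinput 2".
function describeDevices(devices) {
  const counters = {};
  return devices.map((d) => {
    counters[d.kind] = (counters[d.kind] || 0) + 1;
    return {
      deviceId: d.deviceId,
      kind: d.kind,
      label: d.label || `${d.kind} ${counters[d.kind]}`,
    };
  });
}

// Enumerate while a stream is live, since labels may be withheld otherwise.
async function listDevicesWithLabels(mediaDevices) {
  const stream = await mediaDevices.getUserMedia({ audio: true, video: true });
  try {
    return describeDevices(await mediaDevices.enumerateDevices());
  } finally {
    // Release the devices once the list has been read.
    stream.getTracks().forEach((track) => track.stop());
  }
}
```

An app that already holds a live stream can skip the extra getUserMedia call and pass the result of enumerateDevices() straight to describeDevices.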

As per https://bugzilla.mozilla.org/show_bug.cgi?id=1687395#c6 I don't think we want to give permissions for all devices of a class at this point. However, we're currently looking into other methods of reducing the number of WebRTC permission prompts for the user within a site.

Severity: normal → --
Status: UNCONFIRMED → RESOLVED
Closed: 4 years ago
Resolution: --- → WONTFIX