Expose telemetry client_id to about:addons and AMO remove SHIELD dependency from TAAR
Categories
(Firefox Graveyard :: Security: Review Requests, enhancement, P1)
Tracking
(firefox65+ verified, firefox66+ verified)
People
(Reporter: mlopatka, Assigned: mixedpuppy)
References
Details
(Keywords: sec-other, Whiteboard: [adv-main65-])
Attachments
(4 files, 1 obsolete file)
No description provided.
In order to allow TAAR to function without a SHIELD addon to mediate the delivery of the Telemetry client_id, the
The telemetry client_id is then used to lookup other telemetry fields from a dynamoDB instance running apart from main Telemetry. These characteristics are the basis of personalized recommendations. We need to remove the necessity for a SHIELD addon to pass the client_id. TAAR repo: https://github.com/mozilla/taar
Comment 3•6 years ago
|
||
Does not appear to be a security bug. If this is incorrect please find us in #security or send mail to security@mozilla.org
[:ddurst] specifically requested that we mark this bug as a security issue to keep it private. We will be discussing a potential change to the security model of Firefox in the context of exposing the client_id selectively, this would mean a fundamental change tot he FX attack surface.
Comment 5•6 years ago
|
||
Sorry, dveditz, I had this on my list late last week and neglected to comment. The proposal for this is to pass the client id as a header -- we would use this for AMO only initially, but certainly could see use cases for it wherever we use TAAR-based recommendations. That's the part that concerns me, and which I thought security would want to consider.
Comment 6•6 years ago
|
||
So this bug is a security review request, rather than a known flaw or feature work? If so please move to the firefox security review request component (or if it's AMO maybe we need ulfr's team instead)
Comment 7•6 years ago
|
||
OK. Sorry for the runaround. This is a review request for the proposed feature (passing the client ID via header to whitelisted sites).
Comment 8•6 years ago
|
||
Also, I should probably loop in Marshall wrt Privacy because of the client ID (noting that this is the same client ID we use for telemetry).
We've reviewed this as part of the discussions about TAAR and have approved. We should consider this (passing the client ID via header to a whitelisted site) on a case by case basis do determine if the client ID is really needed and determine how the proposal will change the privacy risk. But as for the TAAR point, we are good with this moving forward. To Martin's broader point, it certainly seems worth considering the change to the security model in light of this work.
Comment 10•6 years ago
|
||
We had a follow up discussion with Martin and Travis today, and we would prefer to have Firefox send a hash of the client_id to AMO, to avoid having AMO know the actual client_id. This would prevent AMO from being in a position to correlate IDs with telemetry, while allowing us to take a given client_id, hash it, and look it up in the AMO logs. A simple sha256(client_id) ought to be enough here. needinfo :ddurst for his take on feasibility.
Updated•6 years ago
|
Assignee | ||
Comment 12•6 years ago
|
||
Assignee | ||
Comment 13•6 years ago
|
||
This iteration of the patch only adds the header to the discover url, not to all AMO requests. A couple questions. - Why not post data or get param? - Does this really need to be attached to all AMO requests. Using a channel listener could do that easily, but adds some overheader, even if trivial, to all requests, where AMO requests are an insignificant portion of requests.
Reporter | ||
Comment 14•6 years ago
|
||
Adding an NI for ddurst. Perhaps some context that may help answer the second question. We would like to have the client_id hash available on both AMO and the about:addons page, which will allow us to improve recommendation quality on AMO for Firefox clients. This also has implications for the first question. If a client is browsing AMO, then no post is occurring.
Updated•6 years ago
|
Assignee | ||
Comment 15•6 years ago
|
||
(In reply to mlopatka from comment #14) > Adding an NI for ddurst. > > Perhaps some context that may help answer the second question. We would like > to have the client_id hash available on both AMO and the about:addons page, > which will allow us to improve recommendation quality on AMO for Firefox > clients. This also has implications for the first question. Already knew that. client_id is by nature available to us in about:addons, it needs to get to amo of course. The shield study added client_id to the discovery url, and if that's all that is necessary, then the approach is different than otherwise. The context I'm missing is, do we need it elsewhere and is that pervasive (i.e. all amo requests), if so why, or is that for one or two specific urls. > If a client is > browsing AMO, then no post is occurring. What is the use case for having client_id passed to every page when browsing? For example, a hashed client_id could just be passed on a ping url which would set a cookie so AMO can later access that. That way we don't have to install a listener or do any other modification.
Comment 16•6 years ago
|
||
:mixedpuppy The usecase we're trying to solve is : "TAAR requires the hashed telemetry ID available whenever any a.m.o page is loaded as TAAR is expected to generate suggestions for all Firefox users browsing https://addons.mozilla.org/". Whether or not this is implemented within a cookie or a header seems to be an implementation detail that TAAR itself isn't concerned with. All we care about is that AMO eventually makes a request to the TAAR service that includes the hash. I'm not familiar with ping URLs in AMO. Can we still guarantee that we'll get the hashed telemetry ID on every page? Will that work with cookie destroying add-ons?
Assignee | ||
Comment 17•6 years ago
|
||
Had a meeting w/Scott, Kev and ddurst. From that: There currently are no concrete plans to do any recommendations beyond disco pane, so we will go forward with the current patch here. If and when any plans to expand beyond disco pane solidify, we will likely change the implementation to either a header or cookie, but we will delay that conversation to such a time. Additionally, - The client_id will not be sent when the user is in private browsing mode. - The pref, bug 1499470, will also enable/disable sending the client id. - A couple other items are waiting decision which should happen (hopefully) monday.
Reporter | ||
Comment 18•6 years ago
|
||
Sounds like a good approach to me. Just a point of clarification, We already *are* providing recommendations beyond disco pane. Each addons detail page on AMO provides recommendations based on a model derived from addon co-installation data that is leveraging Firefox Telemetry data. The quality of those recommendations could be drastically improved with access to the hashed telemetry client_id. If we are not moving forward with revealing the hashed client_id on any AMO page, then we will not be able to iterate on the TAAR technology. The quality of recommendations can not be expected to improve.
Comment 19•6 years ago
|
||
Let's have the discussion about header vs cookie here. A more forward-looking option is to use one of these as the vehicle for the hashed ID. We're going to have a preference that results in the setting of a header for networks requests to AMO, or results in the setting of a cookie for AMO. In both cases there's some overhead. In the case of the cookie, there's also the edge case that a user or extension clears cookies and whether we reset that cookie on the next session start or we basically treat it as an intentional flip of the pref. Beyond that, the question we have is whether either solution should: a) be disabled if the user has enabled DNT (we suspect that as first-party here, we do not need to do that) b) be a concern going forward in terms of GDPR -- this is not an issue now, but if we were to include this ID on some other mozilla property in the future, it suddenly becomes a possible tracking issue. If neither the cookie nor header worries Legal or Privacy or Security, then it comes down to engineering choice on ease of implementation and/or performance. If either of the options does worry Legal or Privacy or Security, then that should inform which is the less problematic approach. I know this was sort of discussed in #c9, but worth calling it out now that the hashing, at least, is settled. And if this is something we'll just revisit when/if another (non-AMO) site is intended to be included, then so be it.
Comment 20•6 years ago
|
||
(Also back to :ulfr because the model changes with that discussion.)
Assignee | ||
Comment 21•6 years ago
|
||
Assignee | ||
Comment 22•6 years ago
|
||
I'm a proponent of the cookie approach. All the management of private browsing, ensuring secure channels, etc. are built in. If we do a different header, we then also have to implement a number of items that cookies give us in addition to listening on all requests to watch for amo requests. The downside of course is those users who use cookie removal extensions. A rough thumb to the air (search cookie on amo and guestimate a total user base) shows less than 1M users, less than 1% of our user base. That's edge case enough it shouldn't matter. Recommendations will already need to account for a lack of clientId for users who disable it (see bug 1499470). Security can look at the cookie version of the patch I uploaded to see if they have any concern.
Comment 23•6 years ago
|
||
No concern on my side. I like the cookie approach better too. We should still hash the ID per comment 10.
Assignee | ||
Comment 24•6 years ago
|
||
(In reply to Julien Vehent [:ulfr] from comment #23) > No concern on my side. I like the cookie approach better too. We should > still hash the ID per comment 10. It is hashed. As well, the cookies are marked secure, httpOnly, samesite-lax and non-PrivateBrowsing. One item I'd point out is that the patch uses a pref for the list of sites that will get the cookie, currently only AMO.
Comment 25•6 years ago
|
||
This requires Data Collection Review[1]. (Since I'm involved in the technical review side, I cannot serve as Data Steward for the reivew). One piece of the review that might be very relevant is the need for public documentation of the collection (specifically including any ids being sent) and clear identification of the opt-out mechanism in that documentation and in the review. [1]: https://wiki.mozilla.org/Firefox/Data_Collection
Updated•6 years ago
|
Assignee | ||
Comment 26•6 years ago
|
||
For the patch here specifically, it's well covered by automated tests. I think any QE involvement should probably focus on the bigger picture of the FX->AMO->TAAR interaction.
Comment 27•6 years ago
|
||
Submitting data collection review request form.
Comment 28•6 years ago
|
||
Comment on attachment 9021859 [details] data collection review request As :chutten mentioned in comment #25, one of the requirements for our data collection is that this be documented publicly somewhere. This could be the data review request itself if this bug were public for example. I have read through this bug and some of the linked documents, but I haven't found anything that could qualify as public documentation for this. Did I miss something? Also, I opened the devtools network tab while loading about:addons and didn't see any third-party script resources on that page, which is great. Do we have a policy against adding third-party-hosted scripts on there in the future?
Assignee | ||
Comment 29•6 years ago
|
||
(In reply to François Marier [:francois] from comment #28) > Comment on attachment 9021859 [details] > data collection review request > > As :chutten mentioned in comment #25, one of the requirements for our data > collection is that this be documented publicly somewhere. This could be the > data review request itself if this bug were public for example. > > I have read through this bug and some of the linked documents, but I haven't > found anything that could qualify as public documentation for this. Did I > miss something? I'm thinking this bug could be made public. > Also, I opened the devtools network tab while loading about:addons and > didn't see any third-party script resources on that page, which is great. Do > we have a policy against adding third-party-hosted scripts on there in the > future? Not an issue. The cookie is set in a way that about:addons nor any content contained within it would never see cookie, only the server receiving the cookie will. The server will have to produce content based on that.
Comment 31•6 years ago
|
||
I suppose so, though I expect that mlopatka should be the arbiter of where the public statement of this would live. Either here as a public bug, or elsewhere where we describe this unique hash more generally.
Reporter | ||
Comment 32•6 years ago
|
||
I would propose that we include a detailed and transparent statement regarding the data collection in the TAAR repo. The original motivation for making this bug private was to avoid leaking potential changes to Firefox's attack surface until after we have discussed it. I have no problem making this bug public per-se, but I'd prefer to have a clearly articulated statement prepared and associated with the repo that we can point to from anywhere else where this discussion (or similar ones) may be had. I'll submit a PR to the TAAR repo with data collection documentation and then circle back here.
Reporter | ||
Comment 33•6 years ago
|
||
Public facing documentation added via PR: https://github.com/mozilla/taar/pull/131/files Comments/PRs welcome @ddurst, @mixedpuppy, @francois please take a look at the full `DATA_COLLECTION_POLICY.md` file under PR#131 and let me know if this is sufficiently and accurately capturing the contents of this bug. NI: Chutten to please flag an appropriate Data Steward as francois is currently not accepting NI requests. Renewed data steward review is requested with public facing data collection policy documentation now in the TAAR repo.
Comment 34•6 years ago
|
||
Comment on attachment 9021859 [details] data collection review request Any Data Steward from the list[1] can be asked for review, so I choose... :liuche :) [1]: https://wiki.mozilla.org/Firefox/Data_Collection
Comment 35•6 years ago
|
||
Comment on attachment 9021859 [details] data collection review request Data-review+ This request includes using a new identifier, but this has been cleared by :merwin and is covered by the privacy policy. 1) Is there or will there be **documentation** that describes the schema for the ultimate data set available publicly, complete and accurate? Yes, on TAAR github: https://github.com/mozilla/taar/blob/master/DATA_COLLECTION_POLICY.md 2) Is there a control mechanism that allows the user to turn the data collection on and off? Yes, turning off telemetry, or turning off a pref 3) If the request is for permanent data collection, is there someone who will monitor the data over time?** No, 6mo collection 4) Using the **[category system of data types](https://wiki.mozilla.org/Firefox/Data_Collection)** on the Mozilla wiki, what collection type of data do the requested measurements fall under? ** Type 2 - linking client to behaviors for AMO recommendations 5) Is the data collection request for default-on or default-off? Default on, opt-out 6) Does the instrumentation include the addition of **any *new* identifiers** (whether anonymous or otherwise; e.g., username, random IDs, etc. See the appendix for more details)? New identifier, hashed telemetry client id 7) Is the data collection covered by the existing Firefox privacy notice? **If unsure: escalate to legal if:** This data includes a hash of a telemetry client id, but asked :merwin and collecting this has been cleared in https://bugzilla.mozilla.org/show_bug.cgi?id=1489531#c9 8) Does there need to be a check-in in the future to determine whether to renew the data? (Yes/No) (If yes, set a todo reminder or file a bug if appropriate)** Expires in 6mo, data retention set to 6mo in https://github.com/mozilla/taar/issues/127
Updated•6 years ago
|
Reporter | ||
Comment 36•6 years ago
|
||
Data collection policy is now merged in on the TAAR repo. Can we confirm that the patch to implement this is landed and will be available for testing in Nightly?
Assignee | ||
Comment 37•6 years ago
|
||
This is not landed, it was held up by the data review. Now that we have both, unfortunately it is also the week before beta uplift. Do we need this in 65, or can it wait on 66?
Assignee | ||
Updated•6 years ago
|
Reporter | ||
Comment 38•6 years ago
|
||
Strong preference for 65 if it is in any way possible. Our development roadmap and Q4/Q1 2019 OKRs were planned around 65. We need to have the pref landed in order to do any end-to-end testing of TAAR in a realistic manner.
Assignee | ||
Comment 40•6 years ago
|
||
oh, as well, I mis-read the calendar, soft freeze is next week.
Reporter | ||
Comment 41•6 years ago
|
||
Thanks! Once we can verify proper functioning on an end-to-end TAAR workflow I'll close out this bug. In the mean time (as per comment#30), I'm making this bug public. Data collection is addressed in comment 35: https://bugzilla.mozilla.org/show_bug.cgi?id=1489531#c35
Comment 42•6 years ago
|
||
Pushed by ccoroiu@mozilla.com: https://hg.mozilla.org/mozilla-central/rev/374a4bf1482e Expose telemetry client_id hash to about:addons via cookie r=Gijs,chutten
Comment 43•6 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/374a4bf1482e
Updated•6 years ago
|
Comment 44•6 years ago
|
||
Turns out that the cookie is not passed through (seems like it's not considered first party), so our approach for this will revert to the header (see #c19).
NI the world to see if there are any concerns about this. I assume the data review portions of this are intact, and this is just about transmission method.
Updated•6 years ago
|
Comment 45•6 years ago
|
||
(Setting P1 because this was 65, and we'd like to preserve that if possible)
Updated•6 years ago
|
Comment 46•6 years ago
|
||
Turns out that the cookie is not passed through (seems like it's not considered first party), so our approach for this will revert to the header (see #c19).
Sounds good to me.
Assignee | ||
Comment 47•6 years ago
|
||
m-c patch to add header to request on about:addons
Assignee | ||
Comment 48•6 years ago
|
||
Note this will require a small change for beta uplift, specifically the webNavigation.loadURI changes to loadUriWithOptions for beta.
Comment 49•6 years ago
|
||
(In reply to Shane Caraveo (:mixedpuppy) from comment #47)
Created attachment 9036718 [details]
Bug 1489531 Expose telemetry client_id header to about:addonsm-c patch to add header to request on about:addons
I tried this patch locally and made a few changes to the AMO/Disco Pane patch too. Running the Disco Pane locally inside about:addons
now works! The client ID is passed from FF to the AMO/Disco app, which then forwards this ID to the AMO API (HTTP call).
Assignee | ||
Comment 51•6 years ago
|
||
Beta version of client id header patch for AMO discovery.
Assignee | ||
Comment 52•6 years ago
|
||
https://treeherder.mozilla.org/#/jobs?repo=try&revision=d22405d31a77147c7ee86d5bce6ac5e1a8fe80b6
Assignee | ||
Comment 53•6 years ago
|
||
https://treeherder.mozilla.org/#/jobs?repo=try&revision=07f072c2d3d01a3750b4a3194bd09d32c60179d8
Assignee | ||
Comment 54•6 years ago
|
||
Comment on attachment 9037262 [details]
Bug 1489531 Expose telemetry client_id header to about:addons BETA
[Beta/Release Uplift Approval Request]
Feature/Bug causing the regression: None
User impact if declined: Recommendations will not be available
Is this code covered by automated tests?: No
Has the fix been verified in Nightly?: No
Needs manual test from QE?: Yes
If yes, steps to reproduce: load about:addons
load network monitor
view the "Get Add-ons" panel
in network monitor, look at request headers for the main request.
you should see a Moz-Client-ID header
Two prefs are required to be true for the header to be set:
datareporting.healthreport.uploadEnabled
browser.discovery.enabled
If either are false the header should not appear. So testing should be itterated along with changing these prefs
- both true
- uploadEnabled false, discovery.enabled true
- uploadEnabled true, discovery.enabled false
These steps would be the same for testing on m-c
List of other uplifts needed: None
Risk to taking this patch: Low
Why is the change risky/not risky? (and alternatives if risky): The change is limited to about:addons. The patch adds a header when loading the discover panel on about:addons, no other browser should be affected with this.
String changes made/needed: none
Assignee | ||
Comment 55•6 years ago
|
||
@RyanVM, fyi patches approved, try runs are in comments above.
Comment 56•6 years ago
|
||
Pushed by scaraveo@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/4ddef07cfb17 Expose telemetry client_id header to about:addons r=aswan
Updated•6 years ago
|
Comment 57•6 years ago
|
||
R+ from QA.
Tested on the following builds (using the steps from comment 54):
- beta try: https://treeherder.mozilla.org/#/jobs?repo=try&revision=d1fff08a56737cf418d1599b66050d02d8bff99c&selectedJob=222513958
- m-c try: https://treeherder.mozilla.org/#/jobs?repo=try&revision=baa5d6f32bba3eeb6290465f30ea022d1e78128f
Verified all the combinations of Boolean datareporting.healthreport.uploadEnabled
and browser.discovery.enabled
configs and confirmed the "Moz-Client-Id" header appeared only when both were true.
Comment 58•6 years ago
|
||
bugherder |
Updated•6 years ago
|
Comment 60•6 years ago
|
||
Comment on attachment 9037262 [details]
Bug 1489531 Expose telemetry client_id header to about:addons BETA
[Triage Comment]
Fixes a header issue causing addon recommendations to not be available. Approved for 65.0b12.
Comment 61•6 years ago
|
||
bugherder uplift |
Updated•6 years ago
|
Comment 62•6 years ago
|
||
Marking this bug as verified based on comment #57, I've also verified using the steps provided on comment #54 and can confirm that the header only shows when both pref's listed in the above mentioned comment are set to true. Verified using Nightly 66.0a1 and 65.0b12.
Updated•4 years ago
|
Description
•