Closed
Bug 1191912
Opened 9 years ago
Closed 9 years ago
Implement a way to make Telemetry opt-out for a random sample of users
Categories
(Toolkit :: Telemetry, defect)
Toolkit
Telemetry
Tracking
()
Tracking | Status | |
---|---|---|
firefox40 | --- | unaffected |
firefox41 | --- | fixed |
firefox42 | --- | fixed |
firefox43 | --- | fixed |
People
(Reporter: gfritzsche, Assigned: gfritzsche)
References
Details
(Whiteboard: [unifiedTelemetry] [uplift4])
Attachments
(4 files, 1 obsolete file)
(deleted),
patch
|
rvitillo
:
review+
ritu
:
approval-mozilla-aurora+
ritu
:
approval-mozilla-beta+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
rvitillo
:
review+
mreid
:
feedback-
ritu
:
approval-mozilla-aurora+
ritu
:
approval-mozilla-beta+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
rvitillo
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
rvitillo
:
review+
|
Details | Diff | Splinter Review |
If we can't get Unified Telemetry shipping for 41 then we want to enable it for a random sample (5% being suggested) on release.
This allows us to
a) unblock projects waiting on the data (game initiative, release management, ...)
b) judge the release data quality
c) don't break critical dashboards yet as FHR is still fully running
Assignee | ||
Updated•9 years ago
|
status-firefox40:
--- → unaffected
status-firefox41:
--- → affected
Assignee | ||
Updated•9 years ago
|
Iteration: --- → 42.3 - Aug 10
Assignee | ||
Comment 1•9 years ago
|
||
Are we ok with the stated 5% sampling for now?
Flags: needinfo?(sguha)
Flags: needinfo?(benjamin)
Assignee | ||
Comment 2•9 years ago
|
||
Roberto, i remember you implemented sampling based on client id already somewhere.
Can you point me to the code?
Did that turn out to be reliable and good for random sampling?
Flags: needinfo?(rvitillo)
Assignee | ||
Updated•9 years ago
|
Assignee: nobody → gfritzsche
Comment 3•9 years ago
|
||
I assume we should take the same strategy we do on the server (seems like a good idea to make sure we're collecting the same clientIds that fall in the 1% sample).
Adding needsinfo to mreid to give pointers to that code/strategy...
Flags: needinfo?(mreid)
Comment 4•9 years ago
|
||
(In reply to Georg Fritzsche [:gfritzsche] from comment #2)
> Roberto, i remember you implemented sampling based on client id already
> somewhere.
> Can you point me to the code?
> Did that turn out to be reliable and good for random sampling?
You can find the code at [1]. I haven't used the data after enabling this on Beta though.
[1] https://dxr.mozilla.org/mozilla-central/source/xpcom/threads/BackgroundHangMonitor.cpp?from=BackgroundHangMonitor.cpp#485
Flags: needinfo?(rvitillo)
Comment 5•9 years ago
|
||
I agree on using the same procedure as we do on the server side: https://github.com/vitillo/data-pipeline/blob/master/heka/sandbox/decoders/extract_telemetry_dimensions.lua#L95
Comment 6•9 years ago
|
||
So it all depends. If you need some estimate from(say a proportion) representative of the population 5% ought to be well enough. if one starts looking for small subpopulations e.g. Hello users (very very small), specific addon users (also very small) then 5% might result in tiny numbers. For example some addons have <2000 users - a 5% sample of these is ~100 which might be really small to detect differences in usage (low power).
For lack of any particular needs, i would go with 5% but quote everything with confidence intervals or at least report a measure of sampling error - so that we don't just compare means without taking into account the sampling error in the measurement.
HTH
Flags: needinfo?(sguha)
Comment 7•9 years ago
|
||
Roberto already pointed at the code that takes the sampleId on the server, but to repeat here, we take the crc32 of the clientId modulo 100 to bucket things into groups of approximately 1%. The current 1% sample on the server takes sampleId value of 42, though implementing sampling on the client we would presumably want to take values <= X or >= (100-X) to get an X% sample.
Also, to be clear, sampling should only be done on the release channel, pre-release channels should continue to submit 100%.
Flags: needinfo?(mreid)
Comment 8•9 years ago
|
||
Yes 5%.
Assignee | ||
Comment 9•9 years ago
|
||
Attachment #8645720 -
Flags: review?(rvitillo)
Assignee | ||
Comment 10•9 years ago
|
||
Attachment #8645721 -
Flags: review?(rvitillo)
Assignee | ||
Comment 11•9 years ago
|
||
Attachment #8645722 -
Flags: review?(rvitillo)
Assignee | ||
Updated•9 years ago
|
Status: NEW → ASSIGNED
Flags: needinfo?(benjamin)
Updated•9 years ago
|
Attachment #8645720 -
Flags: review?(rvitillo) → review+
Comment 12•9 years ago
|
||
Comment on attachment 8645721 [details] [diff] [review]
Part 2 - Enable opt-out Telemetry for a 5% sample of release users
Review of attachment 8645721 [details] [diff] [review]:
-----------------------------------------------------------------
::: toolkit/components/telemetry/TelemetryController.jsm
@@ +111,5 @@
> + crc = (crc >>> 8) ^ gCrcTable[(crc ^ str.charCodeAt(i)) & 0xFF];
> + }
> +
> + return (crc ^ (-1)) >>> 0;
> +}
mreid should confirm that this matches the server side implementation as depending on the algorithm used the implementation might yield different results.
Attachment #8645721 -
Flags: review?(rvitillo)
Attachment #8645721 -
Flags: review+
Attachment #8645721 -
Flags: feedback?(mreid)
Updated•9 years ago
|
Attachment #8645722 -
Flags: review?(rvitillo) → review+
Assignee | ||
Comment 13•9 years ago
|
||
Attachment #8645748 -
Flags: review?(rvitillo)
Updated•9 years ago
|
Attachment #8645748 -
Flags: review?(rvitillo) → review+
Assignee | ||
Comment 14•9 years ago
|
||
Comment on attachment 8645721 [details] [diff] [review]
Part 2 - Enable opt-out Telemetry for a 5% sample of release users
Trink, maybe you can take a look in the meantime?
Attachment #8645721 -
Flags: feedback?(mtrinkala)
Comment 15•9 years ago
|
||
Comment 16•9 years ago
|
||
Comment on attachment 8645721 [details] [diff] [review]
Part 2 - Enable opt-out Telemetry for a 5% sample of release users
Review of attachment 8645721 [details] [diff] [review]:
-----------------------------------------------------------------
::: toolkit/components/telemetry/TelemetryController.jsm
@@ +111,5 @@
> + crc = (crc >>> 8) ^ gCrcTable[(crc ^ str.charCodeAt(i)) & 0xFF];
> + }
> +
> + return (crc ^ (-1)) >>> 0;
> +}
The server implementation uses the standard zlib crc32. I tested ~1300 clientIds and the implementation in this bug matches the server-assigned values.
@@ +640,5 @@
> + // This mimics the server-side 1% sampling, so that we can get matching populations.
> + // The server samples on ((crc32(clientId) % 100) == 42), we match 42+X here to get
> + // a bigger sample.
> + const sample = crc32(clientId) % 100;
> + const offset = 42;
I don't think we should encode the server's magic "42" value in the client. Rather, I think we should modify the server to select a different range of sampleIds and remove the "offset".
Instead, we could simply use "sample < range" on the client.
Attachment #8645721 -
Flags: feedback?(mreid) → feedback-
Assignee | ||
Comment 17•9 years ago
|
||
(In reply to Mark Reid [:mreid] from comment #16)
> @@ +640,5 @@
> > + // This mimics the server-side 1% sampling, so that we can get matching populations.
> > + // The server samples on ((crc32(clientId) % 100) == 42), we match 42+X here to get
> > + // a bigger sample.
> > + const sample = crc32(clientId) % 100;
> > + const offset = 42;
>
> I don't think we should encode the server's magic "42" value in the client.
> Rather, I think we should modify the server to select a different range of
> sampleIds and remove the "offset".
>
> Instead, we could simply use "sample < range" on the client.
Per IRC we are not planning to address this now to not add any delays to the 41 timeline.
Comment 18•9 years ago
|
||
+1 to Comment 16.
Although I am still unclear how throwing away the bulk of the data will help in our data validation efforts ;) Hopefully it will be easy to compare with an equivalent (available?) v2 data sample set.
Comment 19•9 years ago
|
||
(In reply to Mike Trinkala [:trink] from comment #18)
> +1 to Comment 16.
>
> Although I am still unclear how throwing away the bulk of the data will help
> in our data validation efforts ;) Hopefully it will be easy to compare with
> an equivalent (available?) v2 data sample set.
We're not sampling for pre-release populations. We won't need to sample if we're ready to turn of FHRv2.
This option is to be used if we aren't willing to turn off FHRv2, so the choice would be between 5% and 0%, not between 5% and 100%.
Comment 20•9 years ago
|
||
Yeah just saying 5% allows you to play with the v4 data but not actually use it as a reliable reporting mechanism and if it is not aligned with a v2 sample (while collecting both) it would also make it hard to validate.
https://hg.mozilla.org/mozilla-central/rev/58b1351b3538
https://hg.mozilla.org/mozilla-central/rev/2dc8148c0245
https://hg.mozilla.org/mozilla-central/rev/ed247945355f
https://hg.mozilla.org/mozilla-central/rev/82e177714649
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
status-firefox43:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla43
Assignee | ||
Comment 22•9 years ago
|
||
Comment on attachment 8645720 [details] [diff] [review]
Part 1 - Move client id caching to ClientID.jsm
Approval Request Comment
[Feature/regressing bug #]:
Unified Telemetry
[User impact if declined]:
If we don't get sign-off to have Unified Telemetry replace FHR in 41, then we want to enable it in addition to FHR for a 5% sample of the release population.
This patch takes care of a prerequisite.
[Describe test coverage new/current, TreeHerder]:
Automated test-coverage, manually checked the behavior, validated it matches the server-sampling.
[Risks and why]:
Low-risk, this is a rather small and contained change.
[String/UUID change made/needed]:
None.
Attachment #8645720 -
Flags: approval-mozilla-beta?
Attachment #8645720 -
Flags: approval-mozilla-aurora?
Assignee | ||
Comment 23•9 years ago
|
||
Comment on attachment 8645721 [details] [diff] [review]
Part 2 - Enable opt-out Telemetry for a 5% sample of release users
Approval Request Comment
[Feature/regressing bug #]:
Unified Telemetry
[User impact if declined]:
If we don't get sign-off to have Unified Telemetry replace FHR in 41, then we want to enable it in addition to FHR for a 5% sample of the release population.
This implements the actual sampling.
[Describe test coverage new/current, TreeHerder]:
Automated test-coverage, manually checked the behavior, validated it matches the server-sampling.
[Risks and why]:
Low-risk, this is a rather small and contained change.
[String/UUID change made/needed]:
None.
Attachment #8645721 -
Flags: feedback?(mtrinkala)
Assignee | ||
Comment 24•9 years ago
|
||
Comment on attachment 8645721 [details] [diff] [review]
Part 2 - Enable opt-out Telemetry for a 5% sample of release users
Approval Request Comment
[Feature/regressing bug #]:
Unified Telemetry
[User impact if declined]:
If we don't get sign-off to have Unified Telemetry replace FHR in 41, then we want to enable it in addition to FHR for a 5% sample of the release population.
This implements the actual sampling.
[Describe test coverage new/current, TreeHerder]:
Automated test-coverage, manually checked the behavior, validated it matches the server-sampling.
[Risks and why]:
Low-risk, this is a rather small and contained change.
[String/UUID change made/needed]:
None.
Attachment #8645721 -
Flags: approval-mozilla-beta?
Attachment #8645721 -
Flags: approval-mozilla-aurora?
Assignee | ||
Updated•9 years ago
|
Whiteboard: [unifiedTelemetry] [uplift5] → [unifiedTelemetry] [uplift4]
Comment on attachment 8645720 [details] [diff] [review]
Part 1 - Move client id caching to ClientID.jsm
Approving for both Aurora and Beta. Telemetry related changes should be safe and this is a must-have for FF41 based on what I understand of the situation.
Attachment #8645720 -
Flags: approval-mozilla-beta?
Attachment #8645720 -
Flags: approval-mozilla-beta+
Attachment #8645720 -
Flags: approval-mozilla-aurora?
Attachment #8645720 -
Flags: approval-mozilla-aurora+
Attachment #8645721 -
Flags: approval-mozilla-beta?
Attachment #8645721 -
Flags: approval-mozilla-beta+
Attachment #8645721 -
Flags: approval-mozilla-aurora?
Attachment #8645721 -
Flags: approval-mozilla-aurora+
Assignee | ||
Comment 26•9 years ago
|
||
Assignee | ||
Comment 27•9 years ago
|
||
https://hg.mozilla.org/releases/mozilla-beta/rev/f949fd1f62f8
https://hg.mozilla.org/releases/mozilla-beta/rev/23ae04733450
Flags: in-testsuite+
Assignee | ||
Updated•9 years ago
|
Flags: qe-verify+
Assignee | ||
Comment 28•9 years ago
|
||
Assignee | ||
Comment 29•9 years ago
|
||
Comment on attachment 8652271 [details] [diff] [review]
Part 3 (beta) - Enable opt-out Telemetry sampling
Moving this to bug 1192906 to track the pref change uplifts in one bug.
Attachment #8652271 -
Attachment is obsolete: true
Comment 30•9 years ago
|
||
Removing [qe-verify +] flag since this has already been covered by the testing performed in bug 1192906.
Flags: qe-verify+
Assignee | ||
Updated•9 years ago
|
Status: RESOLVED → VERIFIED
You need to log in
before you can comment on or make changes to this bug.
Description
•