Collect telemetry on crashes, if one Fission docshell was open
Categories
(Core :: Performance, enhancement, P2)
Tracking
 | Tracking | Status |
---|---|---|
firefox76 | --- | fixed |
People
(Reporter: ccd, Assigned: barret)
References
Details
Attachments
(2 files)
(deleted), text/plain | chutten: data-review+
(deleted), text/x-phabricator-request
The proposal is to collect telemetry on cases where there is a crash and at least one Fission docshell was open.
Comment 2•5 years ago
Sean said he'll work on adding this.
Comment 3•5 years ago
Could someone clarify a bit what exactly should be measured here, just to ensure we'll get the right kind of patch and proper review for it?
Thanks.
Comment 4•5 years ago
I believe, but will defer to @cdowhygelund, that the goal is to infer crash rates when a Fission docshell is open. The crash could have occurred elsewhere, but it happened during a session in which a Fission docshell was open and present.
Without it, crash rates cannot be separated by whether or not a user has used a Fission window.
Comment 5•5 years ago
I think this probe is going to be used to answer how frequently a crash is related to Fission. Just looking at the Fission flag isn't sufficient.
Reporter
Comment 6•5 years ago
As Saptarshi and Sean noted, this will best determine crashes related to Fission. Because it is possible to have Fission enabled and not use Fission, this probe focuses on crashes where some Fission-related activity is occurring. This probe will record cases where a crash occurred somewhere and at least one Fission docshell was open. Over time we can use this probe to measure aspects of Fission stability.
Comment 7•5 years ago
Olli, do you think the above explanations are valid? If so, who is the proper reviewer? Thanks!
Comment 8•5 years ago
So is this about crash reports or telemetry? And if the latter, is this about child processes only?
Comment 9•5 years ago
I think the idea is that it's going to be a categorical histogram probe with two categories, CRASH_WITH_FISSION_DOCSHELL and CRASH_WITHOUT_FISSION_DOCSHELL.
Once we catch a crash (not sure how this happens in our code), we check whether a Fission docshell was open when the crash occurred, and we increment the matching category.
So by using this probe, we can get a sense of how stable Fission is. For example, the initial data could show that 10% of crashes had a Fission docshell open, and after a couple of months, the number could drop to 5%.
Olli, what do you think?
Corey, is my understanding correct?
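To make the proposal in this comment concrete, here is a minimal standalone sketch of a two-category counter. It is not the Gecko Telemetry API: the class name and methods (CrashFissionHistogram, record_crash, fission_share) are hypothetical, and only the two category labels come from the comment itself.

```python
from collections import Counter

# The two category labels proposed in the comment. Everything else in this
# sketch is a hypothetical illustration, not Gecko code.
WITH_FISSION = "CRASH_WITH_FISSION_DOCSHELL"
WITHOUT_FISSION = "CRASH_WITHOUT_FISSION_DOCSHELL"


class CrashFissionHistogram:
    """A toy categorical histogram with exactly two buckets."""

    def __init__(self) -> None:
        self.counts = Counter({WITH_FISSION: 0, WITHOUT_FISSION: 0})

    def record_crash(self, had_fission_docshell: bool) -> None:
        # Increment the bucket that matches the state at crash time.
        label = WITH_FISSION if had_fission_docshell else WITHOUT_FISSION
        self.counts[label] += 1

    def fission_share(self) -> float:
        # Fraction of recorded crashes that had a Fission docshell open.
        total = sum(self.counts.values())
        if total == 0:
            return 0.0
        return self.counts[WITH_FISSION] / total


if __name__ == "__main__":
    hist = CrashFissionHistogram()
    hist.record_crash(True)
    for _ in range(9):
        hist.record_crash(False)
    print(hist.fission_share())
```

Tracking the share over time, rather than the raw counts, is what would let the "10% now, 5% later" comparison be made.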
Comment 10•5 years ago
But telemetry probes usually run in the Firefox process. And if that process crashes, what are we collecting and where?
Or is this only about child processes, with the parent keeping track of whether the child process has Fission docshells?
Reporter
Comment 11•5 years ago
wbeard: What are your thoughts on how this probe should be implemented? The desire is to have a Fission analog of the existing crash count metrics to determine the stability of Fission, as Sean noted. Should this be a crash report? Or should it be telemetry, but only collecting crash counts for child processes, as Olli has noted above?
For context, Fission-enabled Firefox can have both non-Fission and Fission windows. The idea of this probe is to measure crashes where at least one Fission docshell was open (i.e., Fission was being used).
Comment 12•5 years ago
I'm not deeply familiar with the implementation details of the current crash probes, but for crash rates we would need telemetry rather than the crash reporter (the latter is opt-in and doesn't give us generalizable numbers).
Currently Mission Control splits this into content crashes and main/browser crashes. I assume we would only be interested in content crashes here? That is, the number of content crashes that come from a Fission window?
Comment 13•5 years ago
I believe that, in addition to a "Crash Report", we also send some extremely limited information about crashes in the "crash ping" (https://firefox-source-docs.mozilla.org/toolkit/components/telemetry/data/crash-ping.html), which iirc is treated more like other telemetry pings. Given that :mccr8 has already added a DOMFissionEnabled entry to the crash report in bug 1560977, it may be acceptable to upgrade this annotation and also add it to the crash ping.
Would that be sufficient for this type of telemetry?
Reporter
Comment 14•5 years ago
It would be desirable to have the probe measure whether at least one Fission docshell was opened, rather than whether Fission is enabled. This is due to the possibility of having Fission enabled but opening only non-Fission windows. However, for crashes, I believe this is an edge case, and adding DOMFissionEnabled to the crash ping will suffice.
Comment 15•5 years ago
The probe from bug 1560977 already checks whether a Fission docshell was opened, not whether Fission is globally enabled.
If every Fission window is later closed, it does not clear the bit, as we didn't want to miss Fission-related crashes occurring during shutdown.
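The "set once, never cleared" behavior described in this comment can be sketched as follows. This is a standalone illustration, not the code from bug 1560977: the class and method names are hypothetical, and representing the annotation as a boolean in a dict is an assumption made for the example. Only the annotation name DOMFissionEnabled comes from the thread.

```python
class FissionDocShellTracker:
    """Toy model of a sticky 'a Fission docshell was opened' bit."""

    def __init__(self) -> None:
        self.open_fission_docshells = 0
        # Sticky flag: set on first open, intentionally never reset.
        self.ever_opened_fission_docshell = False

    def on_docshell_opened(self, is_fission: bool) -> None:
        if is_fission:
            self.open_fission_docshells += 1
            self.ever_opened_fission_docshell = True

    def on_docshell_closed(self, is_fission: bool) -> None:
        if is_fission and self.open_fission_docshells > 0:
            self.open_fission_docshells -= 1
        # The sticky flag is deliberately left untouched here, so a crash
        # during shutdown is still attributed to a session that used Fission.

    def crash_annotations(self) -> dict:
        # Hypothetical shape; the real annotation format may differ.
        return {"DOMFissionEnabled": self.ever_opened_fission_docshell}
```

With this design, opening and then closing a Fission window leaves the open count at zero while the annotation still reports that Fission was used in the session.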
Reporter
Comment 16•5 years ago
The probe from bug 1560977 sounds like what we are looking for, and could be used for this telemetry.
Comment 17•5 years ago
Move Fission telemetry probe bugs from M5 dogfooding milestone to M6 Nightly.
Assignee
Comment 18•5 years ago
Comment 19•5 years ago
Comment 20•5 years ago
bugherder
Assignee
Comment 21•5 years ago
Comment 22•5 years ago