Closed Bug 1370178 Opened 7 years ago Closed 6 years ago

Add process information from the RemoteType annotation to the crash aggregates

Categories

(Data Platform and Tools :: General, defect, P2)

Tracking

(Not tracked)

RESOLVED INVALID

People

(Reporter: gsvelto, Assigned: mdoglio)

References

(Blocks 1 open bug)

Details

+++ This bug was initially created as a clone of Bug #1353168 +++

In bug 1353168 I've added an additional field to the crash ping payload (metadata.RemoteType). This field is populated only for pings where payload.processType is set to "content", and it contains the type of content process; its values can be "web", "file" or "extension". This field allows us to distinguish between content process crashes that occurred while running code from a regular page (web), from a page loaded locally via the file:// protocol (file), or from an extension (extension).

NB: Documentation about the field should show up in [1] as soon as the patch for bug 1353168 reaches mozilla-central.

[1] https://gecko.readthedocs.io/en/latest/toolkit/components/telemetry/telemetry/data/crash-ping.html
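For illustration only, here is a minimal sketch of how a consumer might read the field. It assumes the ping has already been parsed into a Python dict shaped as described above (payload.processType and payload.metadata.RemoteType); the function name `content_remote_type` is hypothetical and not part of any existing pipeline.

```python
from typing import Optional


def content_remote_type(ping: dict) -> Optional[str]:
    """Return the RemoteType of a content-process crash ping, or None.

    Assumes the layout described in this bug: payload.processType and
    payload.metadata.RemoteType. None is returned for non-content
    processes and for content pings that carry no RemoteType.
    """
    payload = ping.get("payload", {})
    if payload.get("processType") != "content":
        return None
    return payload.get("metadata", {}).get("RemoteType")


# Hand-written example ping, not real telemetry data.
example = {
    "payload": {
        "processType": "content",
        "metadata": {"RemoteType": "extension"},
    }
}
print(content_remote_type(example))
```

Returning None both for non-content processes and for content pings without the annotation keeps the sketch short; a real job would probably want to tell those two cases apart.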
This is a little more important with 57 looming, as this will allow us to differentiate content process crashes -- and in so doing, allow us to attribute crashes to extensions. This data has been collected since 55 (bug 1353168), so if we could get this in 57 or -- even better -- 56 Nightly, that would be a great help.
Flags: needinfo?(mdoglio)
Given that the target is 57, I assume this needs to go into the MissionControl dataset. I'm going to deprecate crash aggregates soon (bug 1388025). I will add 3 new stats to the dataset: web_content_crashes, file_content_crashes, extension_content_crashes. Is there a specific dashboard that needs to be updated with the new data?
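As a rough sketch of what the three proposed stats could look like, the snippet below tallies a sequence of RemoteType values into counters with the names given above. The mapping and the function `count_content_crashes` are assumptions made for illustration; this is not the MissionControl dataset code.

```python
from collections import Counter
from typing import Iterable, Optional

# Stat names come from the comment above; the mapping itself is an assumption.
STAT_BY_REMOTE_TYPE = {
    "web": "web_content_crashes",
    "file": "file_content_crashes",
    "extension": "extension_content_crashes",
}


def count_content_crashes(remote_types: Iterable[Optional[str]]) -> Counter:
    """Tally RemoteType values into the three proposed stats.

    Values outside the mapping (including None for a missing
    annotation) are skipped rather than counted.
    """
    # Start every stat at zero so absent types still show up in the result.
    counts = Counter({name: 0 for name in STAT_BY_REMOTE_TYPE.values()})
    for remote_type in remote_types:
        stat = STAT_BY_REMOTE_TYPE.get(remote_type)
        if stat is not None:
            counts[stat] += 1
    return counts


# Hand-written input, not real telemetry data.
print(count_content_crashes(["web", "web", "file", None]))
```

Skipping unmapped values silently is a simplification; tracking them in a separate counter would make gaps in the data easier to spot.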
Flags: needinfo?(mdoglio)
Assignee: nobody → mdoglio
Priority: -- → P2
I don't think there is currently one that exposes it. However, there should be. We need to be looking at top crashers (at least) and ad-hoc per-process (I would think) from crash pings. This would be useful in that regard, but until we have some dashboard doing this, I think any analysis of this is going to be someone's distinct atmo/stmo work. I don't know enough about MissionControl to know whether these things overlap.
I just ran [1] on crash_summary and it looks like we are missing the content type 'extension' completely; we have instead a significant number of 'webLargeAllocation'. There are also some crashes that don't report their remote type, and those account for about 6% of the total. Is that expected?

[1] https://sql.telemetry.mozilla.org/queries/15440/source#table
(In reply to Mauro Doglio [:mdoglio] from comment #4)
> I just ran [1] on crash_summary and it looks like we we are missing the
> content type 'extension' completely;

Out-of-process extensions were turned on for Windows in 56 and have not yet been turned on by default on Mac or Linux.
Depends on: 1416078
Blocks: 1416078
No longer depends on: 1416078
Component: Datasets: Crash Aggregates → Datasets: General
This is no longer relevant; we use error_aggregates for this purpose in Mission Control.
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → INVALID
Component: Datasets: General → General