Closed Bug 1104317 Opened 10 years ago Closed 10 years ago

Signatures for shutdown crashes

Categories

(Socorro :: General, task)

x86_64
Windows 7
task
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: away, Assigned: lars)


The shutdown crashes added by bug 1038342 come from a watchdog thread, which is uninteresting for purposes of crash bucketing and analysis. It would be more helpful to see what the main thread was doing.

If the signature matches:
[@ mozilla::`anonymous namespace''::RunWatchdog(void*)]
[@ mozilla::(anonymous namespace)::RunWatchdog(void*)]
(or maybe just some regex for RunWatchdog)

then let's instead show the stack for thread 0, ignoring anything on the regular ignore-list as well as anything containing ProcessNextEvent. E.g. bp-569807e3-431a-4f7b-a800-0bbc62141124 would display as "shutdownhang | mozilla::layers::CompositorParent::ShutDown()".

This won't work well for busy-hangs like bp-b776b61d-a73a-40b1-81dd-581212141124: with the approach above we'd get tons of random signatures. bsmedberg, are you OK with that, or do you want to include some kind of cleverness, like searching for a frame containing the word Shutdown?
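For discussion, here is a minimal sketch of the rule proposed above. It is not Socorro code: the function name, the regexes, and the contents of the ignore-list are hypothetical, and it picks only the first usable thread 0 frame, whereas the real signature generator may join several frames.

```python
import re

# Hypothetical stand-in for the regular ignore-list, plus ProcessNextEvent
# as proposed in this bug. Not Socorro's actual configuration.
IGNORE_RE = re.compile(r"ProcessNextEvent|NtWaitForSingleObject|WaitForSingleObjectEx")
# "Some regex for RunWatchdog", covering both spellings of the signature.
WATCHDOG_RE = re.compile(r"RunWatchdog")


def shutdownhang_signature(signature, thread0_frames):
    """Return a thread 0 based signature for watchdog crashes, else the original."""
    if not WATCHDOG_RE.search(signature):
        return signature
    for frame in thread0_frames:
        # Skip frames that the ignore-list says carry no useful information.
        if not IGNORE_RE.search(frame):
            return "shutdownhang | " + frame
    # Nothing usable on thread 0: keep the original signature.
    return signature
```

Given a RunWatchdog signature and a thread 0 stack whose first non-ignored frame is mozilla::layers::CompositorParent::ShutDown(), this returns the "shutdownhang | ..." form shown in the example above.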
Flags: needinfo?(benjamin)
Let's ignore the busy-hang case for now; if we want to, we can go back and instrument that case by measuring and annotating CPU usage. In bug 1103833 I suggested that rather than using RunWatchdog as the marker, we could use an explicit annotation. But in the short term, RunWatchdog is probably good enough.
Flags: needinfo?(benjamin)
Blocks: 1103833
Lonnen, can you find an owner for this? It's needed in order to diagnose one of our biggest nightly crashes.
Flags: needinfo?(chris.lonnen)
Can "ProcessNextEvent" be added to the general ignore list, or must it be a special case for this signature variant only? Adding that frame signature to the general ignore list makes this trivial to implement; having it as a special case makes the implementation a bit more complicated.
Flags: needinfo?(dmajor)
Flags: needinfo?(chris.lonnen)
Flags: needinfo?(benjamin)
QA Contact: lars
Am I remembering correctly, that there are two lists, an "ignore" list and an "append" list? I think it would be reasonable to include /ProcessNextEvent/ on the general append list. (I wouldn't want to strip it altogether though.)
Flags: needinfo?(dmajor)
Yes, that is correct: there are both "ignore" and "append" lists (see: http://socorro.readthedocs.org/en/v8/signaturegeneration.html). I have proceeded with adding ProcessNextEvent to the "append" list.

To see if my modifications produce what you expect, please compare these*:

from production, a crash with the target signature:
https://crash-stats.mozilla.com/report/index/1d69a631-7ece-4570-8c72-bffa12150120

that same crash in staging, reprocessed with a new signature rule for "shutdown hangs":
https://crash-stats.allizom.org/report/index/1d69a631-7ece-4570-8c72-bffa12150120

Please verify that the signature in staging is correct. On your approval, I will submit this PR, and if you act quickly, I bet we can get this into production this week.

* The original example cited in Comment #0 could not be used because its symbols have expired; reprocessing it now results in a signature that wouldn't trigger the shutdown hang rule.
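To illustrate the difference between the two lists, here is a rough sketch based on my reading of the linked signature generation docs. The function and regex names are hypothetical and the ignore entry is a placeholder; this is not the processor's actual code. The assumption is that an "ignore" frame is dropped from the signature, while an "append" frame is kept and the next frame is appended after it.

```python
import re

IGNORE_RE = re.compile(r"NtWaitForSingleObject")  # placeholder ignore-list entry
APPEND_RE = re.compile(r"ProcessNextEvent")       # the entry discussed in this bug


def build_signature(frames):
    """Join frames into a signature, honoring the ignore and append lists."""
    parts = []
    for frame in frames:
        if IGNORE_RE.search(frame):
            continue  # ignored frames never appear in the signature
        parts.append(frame)
        if not APPEND_RE.search(frame):
            break  # a frame not on the append list ends the signature
    return " | ".join(parts)
```

Under these assumptions, ProcessNextEvent stays visible in the signature but no longer terminates it, which matches the request not to strip it altogether.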
Assignee: nobody → lars
Flags: needinfo?(dmajor)
QA Contact: lars
Flags: needinfo?(benjamin)
f+
Flags: needinfo?(dmajor)
Blocks: 1123698
Commit pushed to master at https://github.com/mozilla/socorro

https://github.com/mozilla/socorro/commit/a9a3cd3ff837b7e44d48499d29c72215588bcccd
Merge pull request #2588 from twobraids/runwatchdog

Fixes Bug 1104317 - adds SignatureRunWatchDog rule to processor
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
"shutdownhang..." signatures are now streaming out of the Socorro processor. There are about 4 per minute.