Missing MOZ_CRASH reason in crash reports from Android, take two
Categories
(Toolkit :: Crash Reporting, defect, P2)
Tracking
()
People
(Reporter: mstange, Assigned: gsvelto, NeedInfo)
References
Details
Attachments
(1 file)
(deleted),
text/x-phabricator-request
|
pascalc
:
approval-mozilla-release+
|
Details |
This is a follow-up bug to bug 1666383, which did not fix the bug for me.
Steps to reproduce:
- In Firefox for Android, navigate to
about:crashcontent
. - Alternatively, in a build before bug 1681842 appeared, navigate to
about:crashparent
.
Expected results:
The Socorro crash report for that crash should include a Moz crash reason.
Actual results:
No moz crash reason can be found.
Example reports:
about:crashcontent
report from a build that includes the fix for bug 1666383: https://crash-stats.mozilla.org/report/index/fc1c7397-20e6-42c8-a071-702e80201210about:crashparent
report from an old build before the landing of bug 1649132 caused bug 1681842: https://crash-stats.mozilla.org/report/index/584debe8-e130-4d5a-8a0d-b076e0200921
Assignee | ||
Comment 1•4 years ago
|
||
I can't reproduce in the emulator so this must have something to do with the environment we have on the phones. I'm wondering if SELinux might be interfering with the code we run in the exception handler? That would also explain bug 1681842 which I also cannot reproduce in the emulator. One thing I noticed while inspecting some crashes is that some other fields are also missing: the StackTraces field, the Telemetry* as well as the ContentSandbox* ones.
Assignee | ||
Comment 2•4 years ago
|
||
This is only happening on ARM, I verified only the 32-bit build but I'll try the 64- one too. On x86-64 - which is what we use for the emulator - the issue doesn't repro.
Assignee | ||
Comment 3•4 years ago
|
||
I smell a compiler issue: if I put a print statement in AnnotateMozCrashReason()
everything works as advertised, if I remove it then the gMozCrashReason
variable is never assigned the string passed to MOZ_CRASH()
. The fact that this only happens on ARM and not on x86-64 makes me doubly suspicious about the compiler. Lowering the priority given that I don't have time to investigate into clang ATM.
Comment 5•2 years ago
|
||
We updated compilers a few times since, assuming this is no longer an issue (and anyway Mike is still needinfoed).
Assignee | ||
Comment 6•2 years ago
|
||
I could never really figure this one out. We have a handful of ARM crashes where the raw crash reason was populated correctly, but two order of magnitude more ARM64 crashes. I suspect this is still an issue and I'm starting to think it's not related to the compiler (otherwise it would always fail, wouldn't it?).
Assignee | ||
Updated•2 years ago
|
Assignee | ||
Comment 7•1 years ago
|
||
I just tested this again and 32-bit ARM builds are still affected. Somehow the compiler removes the store to gMozCrashReason
. Why does this only happen on 32-bit ARM is beyond my understanding, maybe some arch-specific peephole optimization thinks this is a dead store? Maybe it has to do with the memory model (or the compiler's assumptions about this)? Either way I'm trying to see if turning this into a volatile
store fixes the issue (while I loathe using volatile
sometimes there's really no way around it).
Assignee | ||
Comment 8•1 years ago
|
||
Updated•1 years ago
|
Assignee | ||
Comment 9•1 years ago
|
||
Making the variable volatile does the trick. Let me know if it's too ugly or if there's a better way.
Comment 10•1 years ago
|
||
Comment 11•1 years ago
|
||
bugherder |
Assignee | ||
Comment 12•1 years ago
|
||
FYI I've verified that this fixed the problem, recent versions store the proper value after a crash.
Comment 13•1 years ago
|
||
Gabriele, do you think this MOZ_CRASH fix is worth uplifting to a 114 dot release (that is already planned for next week or the week after)? It would make crash reports more useful and seems low risk, since the browser is already crashing.
Comment 15•1 years ago
|
||
Comment on attachment 9336075 [details]
Bug 1681846 - Ensure MOZ_CRASH() stores the reason in crash reports on 32-bit ARM/Android builds r=glandium
Beta/Release Uplift Approval Request
- User impact if declined: Crash reports from 32-bit Android devices will be harder to debug because they won't have the MOZ_CRASH reason string.
The patch author, gsvelto, verified the fix in Nightly: https://bugzilla.mozilla.org/show_bug.cgi?id=1681846#c12. The patch has been baking on Nightly 115 for seven days.
- Is this code covered by automated tests?: No
- Has the fix been verified in Nightly?: Yes
- Needs manual test from QE?: No
- If yes, steps to reproduce:
- List of other uplifts needed: None
- Risk to taking this patch: Low
- Why is the change risky/not risky? (and alternatives if risky): This change is not risky. It's a one-line change in Firefox's crash handler. Firefox is already crashing when this new line of code is run, so nothing worse can happen.
- String changes made/needed:
- Is Android affected?: Yes
Comment 16•1 year ago
|
||
Comment on attachment 9336075 [details]
Bug 1681846 - Ensure MOZ_CRASH() stores the reason in crash reports on 32-bit ARM/Android builds r=glandium
Approved for the release branch, thanks.
Comment 17•1 year ago
|
||
bugherder uplift |
Description
•