[@ EMPTY: no frame data available ] instead of Java signature for crash reports from Android
Categories
(Socorro :: Signature, defect, P2)
Tracking
(Not tracked)
People
(Reporter: robwu, Assigned: willkg)
References
(Depends on 1 open bug)
Details
Crash Data
Attachments
(3 files)
In bug 1847372, I pinpointed a reliable trigger for an OOM crash and saw the crash happen on all recent versions (Release 116, Beta 117, Nightly 118). Strangely, the most recent entry in crash-stats is associated with Firefox 111:
- bp-f9e509b5-a739-4d14-a0a1-c794e0230316 with signature [@ java.lang.OutOfMemoryError: at java.util.Arrays.copyOf(Arrays.java) ]
When I tried to trigger a crash report, I was unable to do so due to a regression that broke crash reporting in 115 and 116: bug 1838389. This bug is fixed in Nightly 117.
After running the STR from bug 1847372, I got a crash report, but its signature is [@ EMPTY: no frame data available ]:
- bp-3f3611e7-97e4-4db1-8061-0541b0230806 with signature [@ EMPTY: no frame data available ].
Both reports have the Java Stack Trace field populated, which feeds my suspicion that this is a bug in Socorro rather than the client side.
Comment 1•1 year ago
Is this a duplicate of bug 1245570? ("crash in EMPTY: no crashing thread identified; no frame data available (Firefox for Android only)")
Comment 2•1 year ago (Reporter)
That other bug is much older, and it was not clearly actionable.
I filed this one because of a specific actionable task: figure out why two similar crashes appear to have different crash signatures. Because both reports carry overlapping metadata from which the signature could be extracted, I think that Socorro is the first place to look, but I wouldn't completely rule out this being a (Firefox for Android) client issue either.
Comment 3•1 year ago (Assignee)
I glanced at the crash report in question and it's weird it picked up that signature. I'll grab this to look into further this week.
Comment 4•1 year ago (Assignee)
What's going on is that there's no JavaStackTrace annotation in bp-3f3611e7-97e4-4db1-8061-0541b0230806 and that's what signature generation uses to generate signatures for Java crash reports.
Rob: Any idea why this crash report is missing JavaStackTrace?
Comment 5•1 year ago (Reporter)
(In reply to Will Kahn-Greene [:willkg] ET needinfo? me from comment #4)
> What's going on is that there's no JavaStackTrace annotation in bp-3f3611e7-97e4-4db1-8061-0541b0230806 and that's what signature generation uses to generate signatures for Java crash reports.
> Rob: Any idea why this crash report is missing JavaStackTrace?
Probably the same reason as bug 1838389 (example pasted below): in that bug, an NPE was fixed by wrapping the logic in try-catch and returning null on failure: https://github.com/mozilla-mobile/firefox-android/commit/884a6086756fd35320e49a9d80768a646492477c
getExceptionStackTrace is used here to populate JavaStackTrace: https://github.com/mozilla-mobile/firefox-android/blob/884a6086756fd35320e49a9d80768a646492477c/android-components/components/lib/crash/src/main/java/mozilla/components/lib/crash/service/MozillaSocorroService.kt#L286-L299
Here is an example of a stack trace that causes throwable.getStacktraceAsString to raise an error, copy-pasted from about:crashes. The issue occurred when I tried to submit the crash report from bug 1847372 on Beta.
ddf4650b-cf4a-431c-b461-d920a70eda9e
java.lang.NullPointerException: Attempt to invoke virtual method 'java.lang.String java.lang.Object.toString()' on a null object reference
* New Sentry Instance: https://sentry.io/organizations/mozilla/issues/?project=6295551&query=4414637fdbc2433eb352dca9124104e2
* New Sentry Instance: https://sentry.io/organizations/mozilla/issues/?project=6295551&query=34b5fd80ceb041f0adfbfa4aa6a298d9
----
java.lang.NullPointerException: Attempt to invoke virtual method 'java.lang.String java.lang.Object.toString()' on a null object reference
at java.lang.String.valueOf(String.java:3657)
at java.lang.StringBuilder.append(StringBuilder.java:132)
at java.lang.Throwable.printEnclosedStackTrace(Throwable.java:717)
at java.lang.Throwable.printStackTrace(Throwable.java:682)
at java.lang.Throwable.printStackTrace(Throwable.java:743)
at mozilla.components.support.base.ext.ThrowableKt.getStacktraceAsString$default(Throwable.kt:19)
at mozilla.components.lib.crash.service.MozillaSocorroService.sendCrashData(MozillaSocorroService.kt:607)
at mozilla.components.lib.crash.service.MozillaSocorroService.sendReport$lib_crash_release(MozillaSocorroService.kt:285)
at mozilla.components.lib.crash.service.MozillaSocorroService.report(MozillaSocorroService.kt:5)
at mozilla.components.lib.crash.CrashReporter$submitReport$2.invokeSuspend(CrashReporter.kt:70)
at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:9)
at kotlinx.coroutines.DispatchedTask.run(DispatchedTask.kt:112)
at kotlinx.coroutines.internal.LimitedDispatcher$Worker.run(LimitedDispatcher.kt:4)
at kotlinx.coroutines.scheduling.TaskImpl.run(Tasks.kt:3)
at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.run(CoroutineScheduler.kt:96)
Suppressed: kotlinx.coroutines.internal.DiagnosticCoroutineContextException: [StandaloneCoroutine{Cancelling}@337a98a, Dispatchers.IO]
Comment 6•1 year ago (Assignee)
That sounds like an issue you should raise in the fenix crash reporter.
Currently, Socorro needs a JavaStackTrace value for signature generation. Changing that is a project and covered in bug #1693863.
Comment 7•1 year ago (Assignee)
I don't think there's anything I can do here. Unassigning myself.
Comment 8•1 year ago (Assignee)
Oops--the bug for the "let's rethink signatures for Java" is bug #1541120.
Comment 9•1 year ago (Reporter)
Bug 1541120 looks like a larger-scope issue than this one. That one is about being smarter than extracting the signature from JavaStackTrace.
https://crash-stats.mozilla.org/signature/?product=Fenix&signature=EMPTY%3A%20no%20frame%20data%20available#reports (on Nightly 118.0a1 alone, there are 160 such reported crashes in the past 7 days).
In this bug, JavaStackTrace is null, but JavaException is not (MozillaSocorroService.kt sets both at the same time, but the JavaStackTrace value may sometimes be null, as seen in bug 1838389).
What would it take to extract the signature from JavaException when JavaStackTrace is null?
FYI:
- JavaStackTrace is generated by: https://github.com/mozilla-mobile/firefox-android/blob/884a6086756fd35320e49a9d80768a646492477c/android-components/components/support/base/src/main/java/mozilla/components/support/base/ext/Throwable.kt#L22-L28
- JavaException is generated by: https://github.com/mozilla-mobile/firefox-android/blob/884a6086756fd35320e49a9d80768a646492477c/android-components/components/support/base/src/main/java/mozilla/components/support/base/ext/Throwable.kt#L30-L88
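As a rough illustration of the question above, a fallback could look something like the following Python sketch. It is not Socorro's implementation. It assumes that JavaException is a JSON document shaped like `{"exception": {"values": [{"module": ..., "type": ..., "stacktrace": {"frames": [...]}}]}}`, that each frame carries `module`, `function`, and `filename` keys, and that the first frame is the crashing one. Both function names are hypothetical.

```python
import json
from typing import Optional


def signature_from_java_exception(raw: Optional[str]) -> Optional[str]:
    """Build a JavaStackTrace-style signature from a JavaException annotation.

    ASSUMPTION: the JSON layout and key names used here are illustrative and
    not taken from Socorro; adjust them to the real annotation schema.
    """
    try:
        value = json.loads(raw)["exception"]["values"][0]
        top = value["stacktrace"]["frames"][0]
        exc_name = f"{value['module']}.{value['type']}"
        location = f"{top['module']}.{top['function']}({top['filename']})"
    except (ValueError, TypeError, KeyError, IndexError):
        # Missing or malformed annotation: no signature can be derived.
        return None
    # Line numbers are deliberately left out of the signature.
    return f"{exc_name}: at {location}"


def pick_signature_source(annotations: dict) -> Optional[str]:
    """Prefer JavaStackTrace; fall back to JavaException only when it is absent."""
    if annotations.get("JavaStackTrace"):
        return "JavaStackTrace"
    if annotations.get("JavaException"):
        return "JavaException"
    return None
```

Keeping JavaStackTrace as the preferred source means reports that already get a signature today would be unaffected; only reports that currently end up as "EMPTY: no frame data available" would change.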
Comment 10•1 year ago (Assignee)
Currently, Socorro requires JavaStackTrace to generate a signature. If the crash report doesn't contain a JavaStackTrace, that's a bug with the relevant crash reporter that should get figured out.
Your idea of changing signature generation to factor in JavaException seems reasonable, but it's a much bigger project than a "well, why don't we just ..." because of the way signature generation is implemented. Looks like this affects < 350 crash reports out of 1 million for Fenix in the last month. Unless there's some serious urgency here, I'm not going to get to fixing this any time soon.
The data on Crash Stats is available via APIs. You can unblock your work by writing scripts to manipulate the data to get what you want to see out of it. I have a set of utility commands to make that easier:
https://github.com/willkg/crashstats-tools
Hope that helps!
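For anyone scripting against the API directly, a minimal Python sketch of building a query URL follows. The endpoint path `/api/SuperSearch/` and the `_results_number` parameter are assumptions here and should be checked against the Crash Stats API documentation; the `signature`, `date`, and `_facets` parameters mirror the supersearchfacet options used elsewhere in this bug. The function name is hypothetical, and the sketch only builds the URL without sending a request.

```python
from urllib.parse import urlencode

# ASSUMPTION: SuperSearch endpoint path on the Crash Stats host.
SUPERSEARCH_URL = "https://crash-stats.mozilla.org/api/SuperSearch/"


def build_supersearch_url(signature, since, product="Fenix", facets=("signature",)):
    """Return a SuperSearch URL for an exact signature match since a given date."""
    params = [
        ("product", product),
        # A leading "=" asks for an exact match, as in --signature='=EMPTY: ...'.
        ("signature", f"={signature}"),
        ("date", f">={since}"),
        # ASSUMPTION: 0 results requests facet counts only.
        ("_results_number", "0"),
    ]
    params.extend(("_facets", facet) for facet in facets)
    return f"{SUPERSEARCH_URL}?{urlencode(params)}"
```

The resulting URL can be fetched with any HTTP client, or the same query can be run through the supersearchfacet command from crashstats-tools.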
Comment 11•1 year ago
FYI the [@ EMPTY: no frame data available] signature is currently Fenix' top crasher. It might be worth putting the signature here since those are all Java exceptions missing the Java stack trace, but since it looks like a native crash it's confusing people.
Comment 12•1 year ago
(In reply to Will Kahn-Greene [:willkg] ET needinfo? me from comment #10)
> Currently, Socorro requires JavaStackTrace to generate a signature. If the crash report doesn't contain a JavaStackTrace, that's a bug with the relevant crash reporter that should get figured out.
Will, is there any Socorro work to be done in this bug? Or can I move this bug to the Fenix::Crash Reporting component and use it to investigate what Fenix client changes might be needed to fix crash reports without a JavaStackTrace?
Comment 13•1 year ago (Assignee)
There should definitely be a bug/issue for Fenix and maybe android-components about why there is a JavaException, but no JavaStackTrace.
Since comment #10, it looks like the number of crash reports this affects has increased dramatically and this is now a top crasher signature. I don't think we should move this bug to Fenix::Crash Reporting. I should probably grab this and figure out what I can do about it in Socorro.
Comment 14•1 year ago
btw, I suspect the new crashes are bug 1846306. I looked in Sentry for top crash signatures that aren't in Socorro and found that bug. It's the top Sentry crash signature over the last 30 days, by both number of crash events and number of affected users.
The crash volume spike started around August 16, which happened to be the release date for the Fenix 116.0.3 dot release.
Comment 15•1 year ago
116.0.3 included a crash reporter fix to help diagnose bug 1846306.
Comment 16•1 year ago (Assignee)
Comment 17•1 year ago (Assignee)
That fix minimally disrupts things. It should only affect crash reports where we have a JavaException but no JavaStackTrace. It generates a signature just like it would have if there was a JavaStackTrace with the mild caveat that it does the right thing by not including line numbers. The current JavaStackTrace-using code includes the line numbers for non-.java files. That's in bug #1851202.
I'll try to get it to production next week. Once I do, I can reprocess all the existing crash reports with the problem and they'll pick up new signatures.
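To make the line-number point concrete, here is a small Python sketch of dropping the line number from a Java or Kotlin stack frame regardless of file extension. It is an illustration only, not the code from the fix; the helper name and regular expression are hypothetical.

```python
import re

# Matches "(File.ext:123)" inside a stack frame, for any file extension.
_FILE_AND_LINE = re.compile(r"\(([^():]+):\d+\)")


def strip_line_number(frame: str) -> str:
    """Return the frame with "(File.ext:123)" reduced to "(File.ext)"."""
    return _FILE_AND_LINE.sub(r"(\1)", frame)
```

Frames that carry no line number pass through unchanged, so the helper is safe to apply to every frame before building a signature.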
Comment 18•1 year ago (Assignee)
willkg merged PR #6464: "bug 1847429: implement signature generation for JavaException" in b60b65b.
This will automatically deploy to the stage environment. I'll test it there and (hopefully) deploy it next week to production.
Comment 19•1 year ago (Assignee)
Also, since this involves signature generation changes, I'll write an intent-to-ship email on stability and crash-reporting-wg mailing lists before pushing it to production.
Comment 20•1 year ago (Assignee)
I checked stage this morning and the change looks good:
$ supersearchfacet --host=https://crash-stats.allizom.org \
--_facets=product \
--signature='=EMPTY: no frame data available' \
--relative-range=2w --period=daily --format=markdown
date | -- | Fenix | Focus | total | notes |
---|---|---|---|---|---|
2023-08-22 00:00:00 | 0 | 242 | 5 | 247 | |
2023-08-23 00:00:00 | 0 | 271 | 9 | 280 | |
2023-08-24 00:00:00 | 0 | 274 | 4 | 278 | |
2023-08-25 00:00:00 | 0 | 295 | 4 | 299 | |
2023-08-26 00:00:00 | 0 | 280 | 1 | 281 | |
2023-08-27 00:00:00 | 0 | 310 | 4 | 314 | |
2023-08-28 00:00:00 | 0 | 293 | 2 | 295 | |
2023-08-29 00:00:00 | 0 | 355 | 9 | 364 | |
2023-08-30 00:00:00 | 0 | 351 | 7 | 358 | |
2023-08-31 00:00:00 | 0 | 282 | 7 | 289 | |
2023-09-01 00:00:00 | 0 | 331 | 5 | 336 | <-- landed fix late afternoon |
2023-09-02 00:00:00 | 0 | 11 | 0 | 11 | |
2023-09-03 00:00:00 | 0 | 8 | 0 | 8 | |
2023-09-04 00:00:00 | 0 | 4 | 0 | 4 | |
2023-09-05 00:00:00 | 0 | 4 | 0 | 4 |
Currently, there are around 50k crash reports since August 1st with this signature that will change signatures when I reprocess them.
I emailed the stability and crash-reporting-wg mailing lists with the intended deploy and reprocessing.
Comment 21•1 year ago
Thanks for this Will!
Comment 22•1 year ago (Assignee)
I deployed this to prod just now in bug #1851648. I'm reprocessing crash reports from 2023-08-01 through now.
Comment 23•1 year ago (Assignee)
Comment 24•1 year ago (Assignee)
I reprocessed the crash reports in that list. There are still 7,101 Fenix crash reports since 2023-08-01 which have "EMPTY: no frame data available". I spot checked those and they don't have a JavaStackTrace annotation, a JavaException annotation, or a minidump, so ... I think that's the best we're going to do for now.
When we redo signature generation for Java crash reports, we can include information from other annotations like CrashType or something like that which adds some information and differentiates between crash reports that have no frame data.
Marking this as FIXED.
Comment 25•1 year ago (Assignee)
Comment 26•1 year ago (Assignee)
I did a first round of reprocessing for crash reports >= 2023-08-01. We went from 51,320 to 7,232.
Before:
$ supersearchfacet --signature='=EMPTY: no frame data available' --date='>=2023-08-01' \
--_facets=product --format=markdown
product | count |
---|---|
Fenix | 50344 |
Focus | 947 |
ReferenceBrowser | 29 |
total | 51320 |
After:
$ supersearchfacet --signature='=EMPTY: no frame data available' --date='>=2023-08-01' \
--_facets=product --format=markdown
product | count |
---|---|
Fenix | 7104 |
Focus | 127 |
ReferenceBrowser | 1 |
total | 7232 |
At Chris' behest, I did a second round of reprocessing for crash reports >= 2023-07-01 and < 2023-08-01. We went from 3,527 to 3,527--it looks like those weren't affected.
$ supersearchfacet --signature='=EMPTY: no frame data available' --date='>=2023-07-01' --date='<2023-08-01' \
--_facets=product --format=markdown
product | count |
---|---|
Fenix | 3493 |
Focus | 34 |
total | 3527 |
It looks like none of them have a JavaStackTrace or JavaException.
$ supersearch --signature='=EMPTY: no frame data available' --date='>=2023-07-01' --date='<2023-08-01' --num=all \
| wc -l
3527
$ supersearch --signature='=EMPTY: no frame data available' --date='>=2023-07-01' --date='<2023-08-01' \
--crash_report_keys=JavaStackTrace --crash_report_keys=JavaException --num=all \
| wc -l
0
Comment 27•1 year ago
One interesting bit about JavaException is that it's a bit of a misnomer. It's actually a stack trace, just in a different format compared to JavaStackTrace. Anyway, I feel like the remaining fixes need to happen in Fenix' crash handler. We can close this bug as fixed and open a new one in Fenix crash handler to make sure it tries harder to populate at least one of the two annotations.
Comment 28•1 year ago
I filed bug 1851898 to fix the Fenix crash reporter.
Should crash reports include both JavaException and JavaStackTrace annotations? Or prefer JavaException? Bug 1792902 asks if we should retire JavaStackTrace now that we have JavaException.
Comment 29•1 year ago (Assignee)
Currently, Socorro depends on JavaStackTrace. We'd need to figure out what's involved in changing that and then change it. I wrote up bug #1851903 for that work. Until that work is completed, crash reports need to include at least JavaStackTrace.