Closed Bug 823452 Opened 12 years ago Closed 12 years ago

Android's checkForCrashes "No crash directory found on remote device" error is misleading (obscures startup java stacks)

Tracking

(Not tracked)

Status:

RESOLVED FIXED

People

(Reporter: emorley, Assigned: gbrown)

References

(Blocks 1 open bug)

Details

(Keywords: sheriffing-P1)

Attachments

(1 file, 2 obsolete files)

check for java exception in mozcrash 12 years ago Geoff Brown [:gbrown] (deleted), patch		Details \| Diff \| Splinter Review
use mozcrash to check for java exceptions, from remote mochitests/reftests 12 years ago Geoff Brown [:gbrown] (deleted), patch		Details \| Diff \| Splinter Review
check for java exception in logcat; if found, report preferentially to other crash reporting 12 years ago Geoff Brown [:gbrown] (deleted), patch	jmaher : review+	Details \| Diff \| Splinter Review

Ed Morley [:emorley]

Reporter

Description

•

12 years ago

During the night mozilla-inbound was closed, since devs in #developers saw this push (and subsequent similarly busted): https://tbpl.mozilla.org/?tree=Mozilla-Inbound&rev=06d72606ec56 ...and inferred it was an infra problem, because the failure line shown in TBPL's annotated summary was: "Automation Error: No crash directory (/mnt/sdcard/tests/profile/minidumps/) found on remote device" And thus filed bug 823420. The real problem in this case was visible lower down in the logcat: { 12-19 21:07:16.938 E/GeckoAppShell( 1816): >>> REPORTING UNCAUGHT EXCEPTION FROM THREAD 1 ("main") 12-19 21:07:16.938 E/GeckoAppShell( 1816): java.lang.RuntimeException: Unable to start activity ComponentInfo{org.mozilla.fennec/org.mozilla.fennec.App}: android.content.res.Resources$NotFoundException: Resource ID #0x7f0a0034 12-19 21:07:16.938 E/GeckoAppShell( 1816): at android.app.ActivityThread.performLaunchActivity(ActivityThread.java:2663) ... } ...however, you have to view the full log on TBPL to see it, which is unlikely to happen if one didn't know what to look for. In addition, I've just looked at a handful of the logs in the intermittent failure bug 722166, and many of those actually have usable Java exceptions in the logcat (and aren't all the "NSS load done", with no further useful output, which is what that bug is primarily about). We should do a better job of separating out those different intermittent failures. I was toying with the idea of: * Rewording the "Automation Error: No crash directory..." string * Adding "REPORTING UNCAUGHT EXCEPTION FROM THREAD" (or similar) to TBPL's parser regexes, so the annotated summary lists (and also TBPL's brief log view highlights/links to) the relevant parts of the logcat. However, there is also bug 809065, which is presumably going to output the java stack directly to the log (it relies on the .extra file in the minidumps directly, which we don't have here aiui). So if TBPL's parser matches the logcat exception lines, we'll end up double-reporting them in the annotated summary. It seems like we should just be handling this in remoteautomation.py's checkForCrashes. (ie call out to a new automationutils.py checkForJavaCrash(), which would parse the logcat and extract the failure or something; unless there is a more intelligent way to retrieve it from Android? I really don't have much experience with Android java crash debugging/devicemanager etc). What do you think?

Ed Morley [:emorley]

Reporter

Updated

•

12 years ago

Keywords: sheriffing-P1

Ed Morley [:emorley]

Reporter

Comment 1

•

12 years ago

(In reply to Ed Morley (Away 20th Dec-2nd Jan) [UTC+0; email:edmorley@moco] from comment #0) > However, there is also bug 809065, which is presumably going to output the > java stack directly to the log (it relies on the .extra file in the > minidumps directly, which we don't have here aiui). So if TBPL's parser > matches the logcat exception lines, we'll end up double-reporting them in > the annotated summary. Actually, once the crashreporter is in a position to give us the .extra file, it then handles the java exceptions, meaning that we could get TBPL to match on ">>> REPORTING UNCAUGHT EXCEPTION FROM THREAD .*", which would only be present in the startup-crash logs where nothing caught them (or at least that's what I've seen so far from inspecting the logs).

Ed Morley [:emorley]

Reporter

Comment 2

•

12 years ago

Same confusion has now happened on https://tbpl.mozilla.org/?rev=ea373e534245 too. dbaron has starred all the crashes as the intermittent failure (even though it occurred on every testsuite for that push, which is highly unlikely, but hey), when in fact it was: { 12-22 06:32:41.937 E/GeckoAppShell( 2197): >>> REPORTING UNCAUGHT EXCEPTION FROM THREAD 1 ("main") 12-22 06:32:41.937 E/GeckoAppShell( 2197): java.lang.RuntimeException: Unable to start activity ComponentInfo{org.mozilla.fennec/org.mozilla.fennec.App}: java.lang.RuntimeException: Your TabHost must have a TabWidget whose id attribute is 'android.R.id.tabs' 12-22 06:32:41.937 E/GeckoAppShell( 2197): at android.app.ActivityThread.performLaunchActivity(ActivityThread.java:1956) 12-22 06:32:41.937 E/GeckoAppShell( 2197): at android.app.ActivityThread.handleLaunchActivity(ActivityThread.java:1981) 12-22 06:32:41.937 E/GeckoAppShell( 2197): at android.app.ActivityThread.access$600(ActivityThread.java:123) 12-22 06:32:41.937 E/GeckoAppShell( 2197): at android.app.ActivityThread$H.handleMessage(ActivityThread.java:1147) ... }

Ed Morley [:emorley]

Reporter

Comment 3

•

12 years ago

Joel, Geoff, Ted: Any thoughts on comments 0-2?

(not currently active) Ted Mielczarek

•

12 years ago

Attached patch check for java exception in logcat; if found, report preferentially to other crash reporting (deleted) — Details — Splinter Review

The mozcrash approach did not work out. try runs failed on "import mozcrash" and I note that remote mochitests and reftests are not currently using mozcrash, and perhaps not any mozbase module directly. This approach simply adds on to remoteautomation's checkForCrashes. If a Java exception is found in the logcat, that is reported as a PROCESS-CRASH, and additional crash detection is skipped. try run showing normal (no java crash) operation: https://tbpl.mozilla.org/?tree=Try&rev=d96f80619f1a&noignore=1 try run with java exceptions: https://tbpl.mozilla.org/?tree=Try&rev=1f231280ffdb&noignore=1 I will handle talos in a follow-up patch.

Attachment #711887 - Attachment is obsolete: true

Attachment #711888 - Attachment is obsolete: true

Attachment #711893 - Flags: review?(jmaher)

Ed Morley [:emorley]

Reporter

Updated

•

12 years ago

Depends on: 839569

Joel Maher ( :jmaher ) (UTC -8)

Comment 14

•

12 years ago

Comment on attachment 711893 [details] [diff] [review] check for java exception in logcat; if found, report preferentially to other crash reporting Review of attachment 711893 [details] [diff] [review]: ----------------------------------------------------------------- it seems to be a longer road to get mozcrash fully implemented, one or two more hurdles left, but it shouldn't hold this patch up any more. ::: build/mobile/remoteautomation.py @@ +120,4 @@ > def checkForCrashes(self, directory, symbolsPath): > + logcat = self._devicemanager.getLogcat(filterOutRegexps=fennecLogcatFilters) > + crashed = self.checkForJavaException(logcat) > + if not crashed: the variable name 'crashed' is confusing here as we are checking if not crashed, but then checking for a crash again. Maybe javaException would be more descriptive?

Attachment #711893 - Flags: review?(jmaher) → review+

Geoff Brown [:gbrown]

Assignee

Comment 15

•

12 years ago

Landed with 'crashed' -> 'javaException' change. https://hg.mozilla.org/integration/mozilla-inbound/rev/e754df01a190

Ed Morley [:emorley]

Reporter

Comment 16

•

12 years ago

As landed, we now don't delete the old minidumps directory if we found a java exception.

Geoff Brown [:gbrown]

Assignee

Comment 17

•

12 years ago

(In reply to Ed Morley [:edmorley UTC+0] from comment #16) I have a feeling I missed something and you are right, but I still don't see the problem...can you clarify? It looks to me like the remote minidumps dir (remoteCrashDir) is never deleted, regardless of my change. The local dump dir (dumpDir) is neither created nor deleted in the case of a Java exception.

Whiteboard: [leave open]

Ed Morley [:emorley]

Reporter

Comment 18

•

12 years ago

Sorry, I meant the local dir (dumpDir) isn't deleted since we return early if there was a java exception.

Ed Morley [:emorley]

Reporter

Comment 19

•

12 years ago

https://hg.mozilla.org/mozilla-central/rev/e754df01a190

Geoff Brown [:gbrown]

Assignee

Comment 20

•

12 years ago

I checked with Ed on irc and we agree that the code is okay as it is: on java exception, dumpDir is never created.

Ed Morley [:emorley]

Reporter

Updated

•

12 years ago

Status: NEW → RESOLVED

Closed: 12 years ago

Resolution: --- → FIXED

Whiteboard: [leave open]

Ed Morley [:emorley]

Reporter

Comment 21

•

12 years ago

Just had first usage of this! :-D eg https://tbpl.mozilla.org/php/getParsedLog.php?id=19752455&tree=Firefox

Ed Morley [:emorley]

Reporter

Updated

•

12 years ago

Blocks: 809065

Ed Morley [:emorley]

Reporter

Updated

•

12 years ago

Blocks: 778688

Depends on: 872507

Ed Morley [:emorley]

Reporter

Updated

•

11 years ago

Blocks: 910320

You need to log in before you can comment on or make changes to this bug.