Closed
Bug 803158
Opened 12 years ago
Closed 10 years ago
if no crash report is generated by a tegra (or whatever we're running tests on) use ndk-stack to get a stack from the tombstone
Categories
(Firefox Build System :: General, defect)
Tracking
(Not tracked)
RESOLVED
WONTFIX
People
(Reporter: blassey, Assigned: jmaher)
References
Details
(Keywords: sheriffing-P1)
Attachments
(6 files)
No description provided.
Comment 1•12 years ago
This description of ndk-stack might be useful: https://yssays.wordpress.com/2011/12/27/android-ndk-stack-tool/
Assignee
Comment 2•12 years ago
Here is a log file with a crash that doesn't get picked up in crashreporter, but spits to logcat:
https://tbpl.mozilla.org/php/getParsedLog.php?id=16336570&tree=Mozilla-Inbound&full=1
I/DEBUG ( 937): *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
I/DEBUG ( 937): Build fingerprint: 'nvidia/harmony/harmony/harmony:2.2/FRF91/20110202.102810:eng/test-keys'
I/DEBUG ( 937): pid: 1700, tid: 1705 >>> org.mozilla.fennec <<<
I/DEBUG ( 937): signal 11 (SIGSEGV), fault addr 00335d58
I/DEBUG ( 937): r0 00000000 r1 00313310 r2 00335d58 r3 808a23f4
I/DEBUG ( 937): r4 002faaf0 r5 4b405c98 r6 808a23f4 r7 00000006
I/DEBUG ( 937): r8 00100000 r9 8084f865 10 4b306000 fp 00131b80
I/DEBUG ( 937): ip 00000003 sp 4b405c30 lr 8086ed91 pc 8086ee00 cpsr 20000030
I/DEBUG ( 937): d0 400000003eaaaaab d1 3ff0000041f00000
I/DEBUG ( 937): d2 0000000050baf6de d3 0000000000000000
I/DEBUG ( 937): d4 0000001c000000b4 d5 3fe999999999999a
I/DEBUG ( 937): d6 3fe0000000000000 d7 3eaaaaab3f800000
I/DEBUG ( 937): d8 0000000000000000 d9 0000000000000000
I/DEBUG ( 937): d10 0000000000000000 d11 0000000000000000
I/DEBUG ( 937): d12 0000000000000000 d13 0000000000000000
I/DEBUG ( 937): d14 0000000000000000 d15 0000000000000000
I/DEBUG ( 937): scr 80000012
I/DEBUG ( 937):
I/DEBUG ( 937): #00 pc 0006ee00 /system/lib/libdvm.so
I/DEBUG ( 937): #01 pc 0006ed8c /system/lib/libdvm.so
I/DEBUG ( 937): #02 pc 00074466 /system/lib/libdvm.so
I/DEBUG ( 937): #03 pc 0006dee8 /system/lib/libdvm.so
I/DEBUG ( 937): #04 pc 0004f8b6 /system/lib/libdvm.so
I/DEBUG ( 937): #05 pc 000110a4 /system/lib/libc.so
I/DEBUG ( 937): #06 pc 00010c38 /system/lib/libc.so
I/DEBUG ( 937):
I/DEBUG ( 937): code around pc:
I/DEBUG ( 937): 8086ede0 ffff41cf ffff41e6 ffff421b ffff422d
I/DEBUG ( 937): 8086edf0 4a07a000 181b4b07 58992000 e001460a
I/DEBUG ( 937): 8086ee00 68526010 d1fb2a00 189b4a01 47706059
I/DEBUG ( 937): 8086ee10 000046fc 00033600 6843684a 688a18d3
I/DEBUG ( 937): 8086ee20 6883604b 68ca18d3 68c3608b 18d32000
I/DEBUG ( 937):
I/DEBUG ( 937): code around lr:
I/DEBUG ( 937): 8086ed70 4a1e6903 f8dd9300 1871e068 e004f8cd
I/DEBUG ( 937): 8086ed80 200318b2 3018f8dc ed2af7a7 f830f000
I/DEBUG ( 937): 8086ed90 30aef89d f8ddb943 f8dee040 f1bcc000
I/DEBUG ( 937): 8086eda0 bf180000 e0082001 9b159a1a 70d2eb02
I/DEBUG ( 937): 8086edb0 10419a10 f7ff4620 b049fcd7 bf00bdf0
I/DEBUG ( 937):
I/DEBUG ( 937): stack:
I/DEBUG ( 937): 4b405bf0 4b405c98
I/DEBUG ( 937): 4b405bf4 00000084
I/DEBUG ( 937): 4b405bf8 00000000
I/DEBUG ( 937): 4b405bfc 00000000
I/DEBUG ( 937): 4b405c00 808a23f4 /system/lib/libdvm.so
I/DEBUG ( 937): 4b405c04 4b405ea4
I/DEBUG ( 937): 4b405c08 00000008
I/DEBUG ( 937): 4b405c0c 0000006c
I/DEBUG ( 937): 4b405c10 00000002
I/DEBUG ( 937): 4b405c14 00310000
I/DEBUG ( 937): 4b405c18 00000000
I/DEBUG ( 937): 4b405c1c 002faaf0
I/DEBUG ( 937): 4b405c20 4b405c98
I/DEBUG ( 937): 4b405c24 808a23f4 /system/lib/libdvm.so
I/DEBUG ( 937): 4b405c28 df0027ad
I/DEBUG ( 937): 4b405c2c 00000000
I/DEBUG ( 937): #01 4b405c30 4b405d4f
I/DEBUG ( 937): 4b405c34 4b405d44
I/DEBUG ( 937): 4b405c38 00000065
I/DEBUG ( 937): 4b405c3c 00000000
I/DEBUG ( 937): 4b405c40 00000000
I/DEBUG ( 937): 4b405c44 1feb5e40
I/DEBUG ( 937): 4b405c48 0000fa00
I/DEBUG ( 937): 4b405c4c afd3832c /system/lib/libc.so
I/DEBUG ( 937): 4b405c50 d39fad79
I/DEBUG ( 937): 4b405c54 00313768
I/DEBUG ( 937): 4b405c58 000001c0
I/DEBUG ( 937): 4b405c5c 000001c2
I/DEBUG ( 937): 4b405c60 00313318
I/DEBUG ( 937): 4b405c64 000001c0
I/DEBUG ( 937): 4b405c68 00000003
I/DEBUG ( 937): 4b405c6c 002faaf0
I/DEBUG ( 937): 4b405c70 4b405ea4
I/DEBUG ( 937): 4b405c74 808a23f4 /system/lib/libdvm.so
I/DEBUG ( 937): 4b405c78 44198e70 /data/dalvik-cache/system@framework@framework.jar@classes.dex
I/DEBUG ( 937): 4b405c7c 808a23f4 /system/lib/libdvm.so
I/DEBUG ( 937): 4b405c80 0000001a
I/DEBUG ( 937): 4b405c84 4b405d98
I/DEBUG ( 937): 4b405c88 808964f4 /system/lib/libdvm.so
I/DEBUG ( 937): 4b405c8c 00000394
I/DEBUG ( 937): 4b405c90 80884688 /system/lib/libdvm.so
I/DEBUG ( 937): 4b405c94 00313348
I/DEBUG ( 937): 4b405c98 0000000c
I/DEBUG ( 937): 4b405c9c 00000006
I/DEBUG ( 937): 4b405ca0 00313818
I/DEBUG ( 937): 4b405ca4 43189b88 /dev/ashmem/dalvik-LinearAlloc (deleted)
I/DEBUG ( 937): 4b405ca8 002faaf0
I/DEBUG ( 937): 4b405cac 00314568
I/DEBUG ( 937): 4b405cb0 002f5884
I/DEBUG ( 937): 4b405cb4 002f544c
I/DEBUG ( 937): 4b405cb8 00314568
I/DEBUG ( 937): 4b405cbc 00000008
I/DEBUG ( 937): 4b405cc0 00000002
I/DEBUG ( 937): 4b405cc4 003137f8
I/DEBUG ( 937): 4b405cc8 00000002
I/DEBUG ( 937): 4b405ccc 0000007c
I/DEBUG ( 937): 4b405cd0 00000084
I/DEBUG ( 937): 4b405cd4 002f58cc
I/DEBUG ( 937): 4b405cd8 4edc2618 /dev/ashmem/dalvik-jit-code-cache (deleted)
I/DEBUG ( 937): 4b405cdc 00000000
I/DEBUG ( 937): 4b405ce0 00000000
I/DEBUG ( 937): 4b405ce4 00000002
I/DEBUG ( 937): 4b405ce8 00000000
I/DEBUG ( 937): 4b405cec 00000000
I/DEBUG ( 937): 4b405cf0 00000000
I/DEBUG ( 937): 4b405cf4 00000000
I/DEBUG ( 937): 4b405cf8 00314420
I/DEBUG ( 937): 4b405cfc 00000000
I/DEBUG ( 937): 4b405d00 00000000
I/DEBUG ( 937): 4b405d04 00000000
I/DEBUG ( 937): 4b405d08 00000000
I/DEBUG ( 937): 4b405d0c 002f5884
I/DEBUG ( 937): 4b405d10 00313e8c
I/DEBUG ( 937): 4b405d14 00000002
I/DEBUG ( 937): 4b405d18 4b405d98
I/DEBUG ( 937): 4b405d1c 00000003
I/DEBUG ( 937): 4b405d20 0000003a
I/DEBUG ( 937): 4b405d24 00313830
I/DEBUG ( 937): 4b405d28 003138f8
I/DEBUG ( 937): 4b405d2c 00000000
I/DEBUG ( 937): 4b405d30 00000000
I/DEBUG ( 937): 4b405d34 00000000
I/DEBUG ( 937): 4b405d38 003141c0
I/DEBUG ( 937): 4b405d3c 00000000
I/DEBUG ( 937): 4b405d40 00000000
I/DEBUG ( 937): 4b405d44 00000000
I/DEBUG ( 937): 4b405d48 000001cb
I/DEBUG ( 937): 4b405d4c 008db80c
I/DEBUG ( 937): 4b405d50 00000000
I/DEBUG ( 937): 4b405d54 808a7460
I/DEBUG ( 937): 4b405d58 00000000
I/DEBUG ( 937): 4b405d5c 00000000
I/DEBUG ( 937): 4b405d60 be8db80c [stack]
I/DEBUG ( 937): 4b405d64 8087446b /system/lib/libdvm.so
I/DEBUG ( 937): debuggerd committing suicide to free the zombie!
To use the ndk-stack tool, I need system-level symbols. Is there a way to get these from the tegras, or from somewhere on the internet?
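To see which libraries would need symbols, the backtrace frames can be pulled out of a logcat dump like the one above. This is only a rough sketch: the regular expression is modeled on the "#NN pc ADDR LIB" lines in this particular log, and the function name `extract_frames` is made up for illustration. Other Android versions may format the lines differently.

```python
import re

# Matches backtrace lines as they appear in the logcat excerpt above, e.g.
#   I/DEBUG ( 937): #00 pc 0006ee00 /system/lib/libdvm.so
# Lines from the "stack:" section (e.g. "#01 4b405c30 4b405d4f") have no
# "pc" token and are intentionally not matched.
FRAME_RE = re.compile(
    r"I/DEBUG\s*\(\s*\d+\):\s+#(?P<index>\d+)\s+pc\s+"
    r"(?P<pc>[0-9a-fA-F]+)\s+(?P<lib>\S+)"
)


def extract_frames(logcat_text):
    """Return (frame index, pc offset, library path) tuples from a logcat dump."""
    frames = []
    for line in logcat_text.splitlines():
        match = FRAME_RE.search(line)
        if match:
            frames.append(
                (int(match.group("index")),
                 int(match.group("pc"), 16),
                 match.group("lib"))
            )
    return frames


def libraries_in_backtrace(logcat_text):
    """Return the sorted set of libraries referenced by the backtrace."""
    return sorted({lib for _, _, lib in extract_frames(logcat_text)})
```

For the log in this comment, the only libraries in the backtrace are /system/lib/libc.so and /system/lib/libdvm.so, which is why system-level symbols are needed.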
Updated•12 years ago
Keywords: sheriffing-P1
Comment 3•12 years ago
Here is a try run with an intentional Gecko crash on startup:
https://tbpl.mozilla.org/?tree=Try&rev=250358af465d
In (nearly?) all of these cases, mozcrash reports the crash stack perfectly. (The Talos tests format the error message differently, so tbpl offers "stack found after process termination" rather than a helpful PROCESS CRASH line...but all the same crash stack data is in the log.) Also, in (almost?) none of these cases is a tombstone dumped to the logcat. For this type of crash, it seems unlikely that additional processing by ndk-stack would provide any additional information.
Here is a try run with an intentional Java crash on startup:
https://tbpl.mozilla.org/?tree=Try&rev=ef8deb159d5d
As far as I can see, all of these logs contain accurate Java crash stacks under "REPORTING UNCAUGHT EXCEPTION FOR THREAD". tbpl doesn't recognize the error well -- that is bug 823452 -- but an accurate and complete stack is reported in the existing logs. In these Java crash cases, a tombstone is also usually reported in logcat, and those tombstones look very much like the tombstones for the long-unresolved problem of startup crashes: bug 810471. In these cases -- when there is a tombstone but no Java stack reported by the unhandled exception handler -- additional processing by ndk-stack might provide additional information. Since the tombstones for bug 810471 generally only reference system libraries, we will likely need to provide symbols for system libraries to get any value out of this -- likely only available for pandaboards.
Comment 4•12 years ago
If we have a useful Java stack, the tombstone stack is unlikely to be very useful, right? On crash-stats when we catch a Java exception we send and display that stack in the crash report instead of the native stack.
Comment 5•12 years ago
(In reply to Ted Mielczarek [:ted.mielczarek] from comment #4)
> If we have a useful Java stack, the tombstone stack is unlikely to be very
> useful, right? On crash-stats when we catch a Java exception we send and
> display that stack in the crash report instead of the native stack.
Right. We are really only looking at ndk-stack because of the bug 810471 scenario: we often have a tombstone (with libc and system references only) but no stack. We don't know that ndk-stack will help, but it might.
Comment 6•12 years ago
There is a little bit of ndk-stack info at <path-to-your-ndk>/docs/NDK-STACK.html -- it's not very useful.
Comment 7•12 years ago
On my local pandaboard, I ran my Java-crash-on-startup Fennec. This wrote a tombstone to /data/tombstones and also to logcat. I collected those traces and ran some experiments with ndk-stack, from the r8c Android NDK.
There was substantially more information in the /data/tombstones file than was recorded to logcat. Running ndk-stack on the logcat produced a very abbreviated report:
********** Crash dump: **********
Build fingerprint: 'pandaboard/pandaboard/pandaboard:4.0.4/IMM76I/5:eng/test-keys'
pid: 2089, tid: 2103 >>> org.mozilla.fennec_mozdev <<<
signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 00000000
Stack frame #00 pc 000009aa /dev/ashmem/libmozalloc.so (deleted)
Running ndk-stack on the /data/tombstones file produced a full report. I tried trimming the logcat, manipulating end-of-line markers, etc., but could not produce a good report from the logcat, so I concentrated on the /data/tombstones file instead.
Running ndk-stack on the /data/tombstones file without providing any symbols produced a full report that was concise and much more readable than the /data/tombstones file itself. I don't think the ndk-stack report contains any additional information, but there may be value in doing this simply for the formatting.
Of course we would prefer to use symbols. One stack frame referenced libmozalloc.so; without any libmozalloc symbols provided to ndk-stack, this is reported as:
Stack frame #00 pc 000009aa /dev/ashmem/libmozalloc.so (deleted)
Providing the unstripped libmozalloc.so improves this and gives us the kind of information we want:
Stack frame #00 pc 000009aa /dev/ashmem/libmozalloc.so (deleted): Routine mozalloc_abort in /home/mozdev/src/memory/mozalloc/mozalloc_abort.cpp:30
Providing a stripped libmozalloc.so does not help, and produces an annoying error message:
Stack frame #00 pc 000009aa /dev/ashmem/libmozalloc.so (deleted): Unable to open symbol file /home/mozdev/pandasym/libmozalloc.so. Error (9): Bad file descriptor
I pulled libc.so and other system libraries from /system/lib on the pandaboard and offered those to ndk-stack. Those libs do not appear to have symbols and produce the same error message:
Stack frame #00 pc 0000cff0 /system/lib/libc.so (epoll_wait): Unable to open symbol file /home/mozdev/pandasym/libc.so. Error (9): Bad file descriptor
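The experiment above amounts to running ndk-stack with a symbols directory and a tombstone file. A minimal sketch of that invocation follows, assuming ndk-stack is on PATH and the tombstone has already been copied off the device (for example with `adb pull /data/tombstones`). The helper names and the tombstone file name in the usage note are hypothetical; `-sym` and `-dump` are the ndk-stack options for the symbols directory and the dump file.

```python
import subprocess


def ndk_stack_command(symbol_dir, tombstone_path, ndk_stack="ndk-stack"):
    """Build the ndk-stack argument list for one tombstone file."""
    return [ndk_stack, "-sym", symbol_dir, "-dump", tombstone_path]


def symbolicate(symbol_dir, tombstone_path, ndk_stack="ndk-stack"):
    """Run ndk-stack and return its report as text.

    check=False: ndk-stack may exit abnormally on some inputs, and
    whatever partial report it printed is still worth keeping.
    """
    result = subprocess.run(
        ndk_stack_command(symbol_dir, tombstone_path, ndk_stack),
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout
```

For example, `symbolicate("/home/mozdev/pandasym", "tombstone_00")` would use the symbols directory from the error messages above. As noted, frames only gain function and line information when that directory holds unstripped copies of the libraries.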
Comment 8•12 years ago
Comment 9•12 years ago
Comment 10•12 years ago
Comment 11•12 years ago
Comment on attachment 699448 [details]
best ndk-stack report from experiment in comment 7: from tombstone + unstripped libmozalloc.so
Even this best-case result doesn't seem incredibly useful, especially given the amount of effort it takes to get it.
Comment 12•12 years ago
:jmaher provided a libc.so with symbols, for the pandas. I verified that the libc.so contains symbols, and ndk-stack recognizes them: I applied ndk-stack to the tombstone generated earlier and obtained a report with all the libc references translated.
I patched my panda build by remounting /system rw and pushing libc.so to /system/lib. I launched a normal Fennec build and verified that it started (the new libc.so appears to be valid). I then launched my crashing Fennec build and verified that it crashed; it did crash but did not generate a tombstone. I re-tried several times but could not generate a new tombstone with the new libc.so...but maybe I was just unlucky.
I restored the old libc.so and launched the crashing Fennec twice -- no tombstone the first time, but one was generated the second time.
I tried to put back the new libc.so for further tests, but cp crashed (it uses libc of course!) and left my panda unbootable...I'll re-flash, but since nearly everything uses libc, I cannot think of a safe way to patch my build. Perhaps it would be best to create and test a full image with unstripped libs.
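One rough way to check whether a library "contains symbols" before pushing it to a device is to look for a full symbol table section in the ELF file. The sketch below is an assumption about how such a check could be done, not the check actually used in this comment. It handles only 32-bit little-endian ELF files (as on these ARM devices), and the function name `has_symtab` is hypothetical. A .symtab section suggests the file was not stripped, but it does not guarantee that the debug line information needed for file:line output is present.

```python
import struct

SHT_SYMTAB = 2  # section type of the full symbol table, which strip removes


def has_symtab(path):
    """Return True if a 32-bit little-endian ELF file has a SHT_SYMTAB section."""
    with open(path, "rb") as f:
        header = f.read(52)  # size of the ELF32 file header
        if len(header) < 52 or header[:4] != b"\x7fELF":
            raise ValueError("not an ELF file")
        # e_ident[4] is the class (1 = 32-bit), e_ident[5] the byte order (1 = little)
        if header[4] != 1 or header[5] != 1:
            raise ValueError("this sketch only handles 32-bit little-endian ELF")
        (e_shoff,) = struct.unpack_from("<I", header, 32)
        e_shentsize, e_shnum = struct.unpack_from("<HH", header, 46)
        for i in range(e_shnum):
            f.seek(e_shoff + i * e_shentsize)
            entry = f.read(8)  # sh_name and sh_type are the first two words
            if len(entry) < 8:
                break
            _sh_name, sh_type = struct.unpack("<II", entry)
            if sh_type == SHT_SYMTAB:
                return True
    return False
```

A stripped system library normally keeps only its dynamic symbol table, so this would return False for it.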
Comment 13•12 years ago
I re-flashed, patched libc.so again, and re-installed the crashing Fennec. This time I was able to get several tombstones. However, ndk-stack fails when reading these tombstones; it reports only:
********** Crash dump: **********
Build fingerprint: 'pandaboard/pandaboard/pandaboard:4.0.4/IMM76I/5:eng/test-keys'
pid: 2448, tid: 2461 >>> org.mozilla.fennec_mozdev <<<
signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 00000000
ndk-stack: elff/elf_file.cc:102: static ElfFile* ElfFile::Create(char const*): Assertion `read_bytes != -1 && read_bytes == sizeof(header)' failed.
Stack frame #00 pc 00000000 Aborted
Comment 14•12 years ago
Updated•12 years ago
Summary: if no crash report is generated by a tegra (or whatever we're running tests on) use ndk-stack to get a stack from the toomstone → if no crash report is generated by a tegra (or whatever we're running tests on) use ndk-stack to get a stack from the tombstone
Comment 15•12 years ago
I found these reports of the same problem, but the answers have not helped: http://stackoverflow.com/questions/8218507/ndk-stack-not-working
http://grokbase.com/p/gg/android-ndk/125atrndgd/ndk-static-boost-thread-gnustl-shared
Comment 16•12 years ago
:jchen found and fixed a bug in ndk-stack. I tested his fixed ndk-stack against the tombstone in comment 14 and the new libc.so with symbols: it produced a good report, with functions and line numbers for the libc references.
So :jmaher's libc.so appears to be functional and can be used with ndk-stack to get additional information from these troublesome tombstones. :jchen's patched ndk-stack is highly recommended.
Comment 17•12 years ago
Comment 18•12 years ago
This patch fixes a bug in ndk-stack where it tries to open a symbols file even if there is no symbols file for that stack frame. Apply it against the NDK sources at https://android.googlesource.com/platform/ndk, then run 'make -C sources/host-tools/ndk-stack -f GNUMakefile'.
A pre-compiled ndk-stack for 32-bit Linux with the fix is at http://people.mozilla.org/~nchen/ndk-stack.tar.bz2
Assignee
Comment 19•11 years ago
are we still interested in doing this?
Comment 20•10 years ago
(In reply to Joel Maher (:jmaher) from comment #19)
> are we still interested in doing this?
I still think that ndk-stack might help us understand some crashes better, but certainly it seems less urgent and in some ways less important now. We don't seem to see nearly as many unexplained crash stacks these days, and it feels like we have significantly fewer test crashes than we did when we opened this bug.
It occurs to me that it would be easier to just copy /data/tombstones files to MOZ_UPLOAD_DIR and not worry about running ndk-stack at test time -- leave it to the person investigating the crash to run ndk-stack on the tombstone if desired. What do you think?
Flags: needinfo?(jmaher)
Assignee
Comment 21•10 years ago
That is a great idea. Unless there are permission issues getting the files from the device, it should be simple. We would have to adjust the whitelist of file extensions in blobber upload to make that work.
Great way to think outside the box :)
Flags: needinfo?(jmaher)
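The copy step proposed in comment 20 could look roughly like the sketch below. It assumes the tombstones have already been pulled from the device into a local directory (for example with `adb pull /data/tombstones`), and that MOZ_UPLOAD_DIR is set in the environment. The function name and the ".txt" suffix are hypothetical: appending a suffix is only one possible way to deal with an extension whitelist, and this thread does not say which extensions blobber actually accepts.

```python
import os
import shutil


def copy_tombstones(local_dir, upload_dir=None, suffix=".txt"):
    """Copy pulled tombstone files into the upload directory.

    Returns the list of destination paths. Does nothing if no upload
    directory is given and MOZ_UPLOAD_DIR is not set.
    """
    upload_dir = upload_dir or os.environ.get("MOZ_UPLOAD_DIR")
    if not upload_dir:
        return []
    os.makedirs(upload_dir, exist_ok=True)
    copied = []
    for name in sorted(os.listdir(local_dir)):
        src = os.path.join(local_dir, name)
        if not os.path.isfile(src):
            continue  # skip subdirectories
        # Assumption: give each file a suffix the uploader is likely to accept.
        dest_name = name if name.endswith(suffix) else name + suffix
        dest = os.path.join(upload_dir, dest_name)
        shutil.copyfile(src, dest)
        copied.append(dest)
    return copied
```

Whoever investigates the crash can then run ndk-stack on the uploaded file, as suggested above.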
Comment 22•10 years ago
Let's do that then: Filed bug 1042097.
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → WONTFIX
Updated•7 years ago
Product: Core → Firefox Build System