Closed Bug 1045804 Opened 10 years ago Closed 9 years ago

Android 4.0 crashes rarely have usable stacks

Product/Component: Firefox for Android Graveyard :: General
Type: defect
Platform: ARM, Android
Priority: Not set
Severity: critical
Tracking: Not tracked
Status: RESOLVED WONTFIX
Reporter: RyanVM
Assignee: gbrown


Crashes like the ones in the logs below are pretty common, and they're getting to the point where they're basically ignored because they're so unactionable.
https://tbpl.mozilla.org/php/getParsedLog.php?id=44811866&tree=Mozilla-Inbound
https://tbpl.mozilla.org/php/getParsedLog.php?id=44825017&tree=Fx-Team
Other times the stack shows a crash at some random address like 0x6161616a as the top frame. Can we please fix this?
I think we have looked into this a few times, but found no resolution. For:

09:55:35 WARNING - PROCESS-CRASH | /tests/dom/indexedDB/test/test_open_objectStore.html | application crashed [Unknown top frame]
09:55:35 INFO - Crash dump filename: /tmp/tmpJ2rphp/3aad24d1-345c-2f25-1d0a2a65-6d4d4883.dmp
09:55:35 INFO - stderr from minidump_stackwalk:
09:55:35 INFO - 2014-07-29 09:55:35: minidump_processor.cc:264: INFO: Processing minidump in file /tmp/tmpJ2rphp/3aad24d1-345c-2f25-1d0a2a65-6d4d4883.dmp
09:55:35 INFO - 2014-07-29 09:55:35: minidump.cc:3815: INFO: Minidump opened minidump /tmp/tmpJ2rphp/3aad24d1-345c-2f25-1d0a2a65-6d4d4883.dmp
09:55:35 INFO - 2014-07-29 09:55:35: minidump.cc:3847: ERROR: Minidump header signature mismatch: (0x0, 0x0) != 0x504d444d
09:55:35 INFO - 2014-07-29 09:55:35: minidump_processor.cc:268: ERROR: Minidump /tmp/tmpJ2rphp/3aad24d1-345c-2f25-1d0a2a65-6d4d4883.dmp could not be read
09:55:35 INFO - 2014-07-29 09:55:35: minidump.cc:3787: INFO: Minidump closing minidump
09:55:35 INFO - 2014-07-29 09:55:35: minidump_stackwalk.cc:529: ERROR: MinidumpProcessor::Process failed
09:55:35 INFO - minidump_stackwalk exited with return code 1

The .dmp file is at mozilla-releng-blobs.s3.amazonaws.com/blobs/mozilla-inbound/sha512/253ae290016d19f1a076ab7266a4c1b90d0fb09e5798d39b20d560763315068939741be45ff32adf6cc4d514b6cb3281927b1eaf3b5f16235b583c15ed22f68d and starts with:

0000000 0000 0000 0000 0000 0000 0000 0000 0000
*
0000020 0003 0000 0994 0000 00c0 0000 0004 0000
0000030 32a4 0000 84a8 0003 0005 0000 0374 0000
0000040 7198 0005 0006 0000 00a8 0000 7510 0005
0000050 0000 0000 0000 0000 0000 0000 0000 0000
*
00000c0 0033 0000 08c3 0000 0000 0000 0000 0000
00000d0 0000 0000 0000 0000 0000 0000 7000 bef0
00000e0 0000 0000 1000 0000 0a58 0000 0170 0000

As I recall, :ted has looked at these too. :blassey -- any ideas?
Flags: needinfo?(blassey.bugs)
Ted and Jim are more likely to know what's going on here than me.
Flags: needinfo?(ted)
Flags: needinfo?(nchen)
Flags: needinfo?(blassey.bugs)
Looks like for some reason, the header in the minidump (the first 32 bytes) got replaced with all zeros. If I replace the header with a sensible one, minidump_stackwalk has no trouble processing it and gives this stack for the minidump in comment 1,

> Thread 13 (crashed)
> 0 libxul.so + 0x1a28f16
> Found by: given as instruction pointer in context
> 1 libxul.so + 0x1a28f9f
> Found by: call frame info
> 2 libxul.so!js::gc::Cell::isAligned() const [Heap.h:ac8248c5b891 : 592 + 0x7]
> Found by: stack scanning
> 3 libxul.so!CheckMarkedThing<JSObject> + 0xbb
> Found by: call frame info
> 4 libxul.so!js::gc::MarkObjectRange(JSTracer*, unsigned int, js::HeapPtr<JSObject*>*, char const*) [Marking.cpp:ac8248c5b891 : 232 + 0x3]
> Found by: call frame info
> 5 libxul.so!JSScript::markChildren(JSTracer*) [jsscript.cpp:ac8248c5b891 : 3335 + 0xd]
> Found by: call frame info
> 6 libxul.so!MarkInternal<JSScript> [Marking.cpp:ac8248c5b891 : 270 + 0xb]
> Found by: call frame info
> 7 libxul.so!JSFunction::trace(JSTracer*) [jsfun.cpp:ac8248c5b891 : 589 + 0xd]
> Found by: call frame info
> 8 libxul.so!js::GCMarker::processMarkStackTop(js::SliceBudget&) [Marking.cpp:ac8248c5b891 : 1698 + 0x5]
> Found by: call frame info
> 9 libxul.so!js::GCMarker::drainMarkStack(js::SliceBudget&) [Marking.cpp:ac8248c5b891 : 1749 + 0x7]
> Found by: call frame info
> 10 libxul.so!js::gc::GCRuntime::drainMarkStack(js::SliceBudget&, js::gcstats::Phase) [jsgc.cpp:ac8248c5b891 : 4331 + 0x3]
> Found by: call frame info
> 11 libxul.so!js::gc::GCRuntime::incrementalCollectSlice(long long, JS::gcreason::Reason, js::JSGCInvocationKind) [jsgc.cpp:ac8248c5b891 : 4839 + 0xb]
> Found by: call frame info
> 12 libxul.so!js::gc::GCRuntime::gcCycle(bool, long long, js::JSGCInvocationKind, JS::gcreason::Reason) [jsgc.cpp:ac8248c5b891 : 5047 + 0x9]
> Found by: call frame info
> 13 libxul.so!js::gc::GCRuntime::collect(bool, long long, js::JSGCInvocationKind, JS::gcreason::Reason) [jsgc.cpp:ac8248c5b891 : 5174 + 0x11]
> Found by: call frame info
> 14 libxul.so!JS::IncrementalGC(JSRuntime*, JS::gcreason::Reason, long long) [jsgc.cpp:ac8248c5b891 : 5222 + 0xd]
> Found by: call frame info
> 15 libxul.so!nsJSContext::GarbageCollectNow(JS::gcreason::Reason, nsJSContext::IsIncremental, nsJSContext::IsShrinking, long long) [nsJSEnvironment.cpp:ac8248c5b891 : 1747 + 0xf]
> Found by: call frame info
> 16 libxul.so!InterSliceGCTimerFired(nsITimer*, void*) [nsJSEnvironment.cpp:ac8248c5b891 : 2234 + 0x11]
> Found by: call frame info
> 17 libxul.so!nsTimerImpl::Fire() [nsTimerImpl.cpp:ac8248c5b891 : 618 + 0x5]
> Found by: call frame info
> 18 libxul.so!nsTimerEvent::Run() [nsTimerImpl.cpp:ac8248c5b891 : 711 + 0x9]
> Found by: call frame info
> 19 libxul.so!nsThread::ProcessNextEvent(bool, bool*) [nsThread.cpp:ac8248c5b891 : 770 + 0xb]
> Found by: call frame info
Flags: needinfo?(nchen)
Looking deeper at the minidump, it appears breakpad stops writing the minidump somewhere after this point [1]. Because it flushes the header to disk at the very end, we're left with a minidump that's missing its header and a lot of data. No idea why this happens though. [1] http://mxr.mozilla.org/mozilla-central/source/toolkit/crashreporter/google-breakpad/src/client/linux/minidump_writer/minidump_writer.cc?rev=aa176fcc56b8#481
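To make the "missing header" state easier to spot before running minidump_stackwalk, here is a minimal Python sketch that checks the first 32 bytes and lists whatever stream directory entries were written. It is not part of the harness; the function names are hypothetical. It assumes the stream directory starts immediately after the 32-byte header, and SAMPLE is the start of the dump from comment 1, read as little-endian 16-bit words (an assumption about how that hexdump was produced).

```python
import struct

MD_HEADER_SIGNATURE = 0x504D444D            # the value minidump_stackwalk expects
MD_HEADER_SIZE = 32                         # "the first 32 bytes" from the comment above
MD_DIRECTORY_ENTRY = struct.Struct("<III")  # stream_type, data_size, rva


def header_state(data):
    """Classify the header as 'ok', 'zeroed', 'truncated' or 'bad'."""
    if len(data) < MD_HEADER_SIZE:
        return "truncated"
    header = data[:MD_HEADER_SIZE]
    if header == bytes(MD_HEADER_SIZE):
        return "zeroed"
    (signature,) = struct.unpack_from("<I", header, 0)
    return "ok" if signature == MD_HEADER_SIGNATURE else "bad"


def directory_entries(data, offset=MD_HEADER_SIZE):
    """Yield (stream_type, data_size, rva) until an all-zero, unwritten entry.

    Assumption: the directory sits right after the header. A valid header
    would normally say where it is, but a zeroed header cannot.
    """
    while offset + MD_DIRECTORY_ENTRY.size <= len(data):
        entry = MD_DIRECTORY_ENTRY.unpack_from(data, offset)
        if entry == (0, 0, 0):
            break
        yield entry
        offset += MD_DIRECTORY_ENTRY.size


# Bytes 0x00-0x5f of the dump quoted in comment 1 (zeroed header, four entries).
SAMPLE = bytes(32) + struct.pack(
    "<12I",
    3, 0x994, 0xC0,
    4, 0x32A4, 0x384A8,
    5, 0x374, 0x57198,
    6, 0xA8, 0x57510,
) + bytes(16)
```

On SAMPLE this reports a zeroed header and entries for stream types 3 through 6 (thread list, module list, memory list, exception), with nothing for type 7 (system info).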
I can think of a couple of possibilities:
1) Test harness kills the browser before it's done writing the minidump.
2) Test harness pulls the minidump off of the device before it's fully written.
3) Some part of the Java app kills the process before it's done writing.
4) Something in the minidump writer code crashes after partially writing data.
Flags: needinfo?(ted)
(In reply to Ted Mielczarek [:ted.mielczarek] from comment #5)
> I can think of a couple of possibilities:
> 1) Test harness kills the browser before it's done writing the minidump.
> 2) Test harness pulls the minidump off of the device before it's fully
> written.

Geoff, can we try to see if either of these is the case? Perhaps we can add a 15s (?) sleep before killing or pulling the minidump, to see if that makes any difference, and if it does, we could go about adding a more deterministic method for checking this.

Probably the "right" way to do this is to check the file size and verify it's stable over X seconds before killing the browser or pulling the minidump. I'm not sure what X should be, though.
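A rough Python sketch of the "file size is stable over X seconds" check, for illustration only. Everything here is hypothetical: the function name, the default values (X is left open above), and the `getsize` hook, which on a real run would have to ask the device for the file size rather than read a local path.

```python
import os
import time


def wait_for_stable_size(path, stable_for=5.0, timeout=60.0, poll=0.5,
                         getsize=os.path.getsize, clock=time.monotonic,
                         sleep=time.sleep):
    """Return True once the size of `path` is unchanged for `stable_for`
    seconds, or False if `timeout` seconds pass first."""
    deadline = clock() + timeout
    last_size = None
    stable_since = None
    while True:
        now = clock()
        try:
            size = getsize(path)
        except OSError:
            size = None  # file not there yet; treat as "still changing"
        if size is not None and size == last_size:
            if now - stable_since >= stable_for:
                return True
        else:
            # First observation, or the size changed: restart the window.
            last_size = size
            stable_since = now
        if now >= deadline:
            return False
        sleep(poll)
```

The harness would call this before killing the browser or pulling the minidump, and fall back to the current behaviour when it returns False.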
Flags: needinfo?(gbrown)
Assignee: nobody → gbrown
Flags: needinfo?(gbrown)
To determine if changes make a difference, I first tried to establish a baseline by intentionally crashing and making no changes to minidump handling: https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=c010d423ae2a. The first 20+ crashes on both Android 2.3 and Android 4.0 have produced perfect stacks. I'll try to find a different way to crash that reproduces the problem.
I had better luck reproducing bad dumps with a Robocop hang - sleep in Java, in the Robotium test thread: https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=9d39b9c023e0. Most 2.3 dumps are good; most 4.0 dumps are not. Same thing, with longer waits during the "staged shutdown" that follows a test timeout: https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=2ea066f71df2
(In reply to Geoff Brown [:gbrown] from comment #8)
> I had better luck reproducing bad dumps with a Robocop hang - sleep in Java,
> in the Robotium test thread:
> https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=9d39b9c023e0.
> Most 2.3 dumps are good; most 4.0 dumps are not.

Of 10 crashes, 7 failed, typically with "ERROR: Minidump header signature mismatch: (0x0, 0x0) != 0x504d444d" (as in Comment 3).

> Same thing, with longer waits during the "staged shutdown" that follows a
> test timeout:
> https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=2ea066f71df2

Of 10 crashes, 7 failed, typically with "ERROR: Minidump header signature mismatch: (0x0, 0x0) != 0x504d444d".

In the normal "staged shutdown" following a test timeout, the harness does this:
  kill -3 (to generate ANR)
  wait 3 seconds
  kill -6 (to generate minidump and abort)
  wait up to 15 seconds (until process dies)
  kill -9
  retrieve minidump

In my try run with longer waits, we:
  kill -3
  wait 30 seconds
  kill -6
  wait up to 45 seconds
  kill -9
  wait 30 seconds
  retrieve minidump

For this type of "crash", waiting longer in the test harness does not help, suggesting that Ted's 1) and 2) possibilities (Comment 5) are not to blame.
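For anyone who wants to repeat the experiment with other timings, the staged shutdown above can be sketched in Python roughly like this. This is not the harness code: `staged_shutdown`, `send_signal` and `is_alive` are hypothetical names, and the two callbacks stand in for whatever the harness uses to signal and poll the process on the device. Retrieving the minidump is left to the caller.

```python
import time

SIGQUIT, SIGABRT, SIGKILL = 3, 6, 9  # the numbers used with `kill -N` above


def staged_shutdown(send_signal, is_alive, anr_wait=3, abort_wait=15,
                    post_kill_wait=0, poll=1, sleep=time.sleep):
    """Send signals 3, 6 and 9 in order with the listed waits.

    Returns the number of seconds spent waiting for the process to die
    after kill -6.
    """
    send_signal(SIGQUIT)      # kill -3: generate ANR
    sleep(anr_wait)

    send_signal(SIGABRT)      # kill -6: generate minidump and abort
    waited = 0
    while waited < abort_wait and is_alive():
        sleep(poll)
        waited += poll

    send_signal(SIGKILL)      # kill -9: sent unconditionally, as in the steps
    sleep(post_kill_wait)
    return waited
```

The defaults match the normal sequence; the longer-wait try run corresponds to `anr_wait=30, abort_wait=45, post_kill_wait=30`.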
A different example I happened upon: 2.3 robocop crash on shutdown:

http://ftp.mozilla.org/pub/mozilla.org/mobile/try-builds/gbrown@mozilla.com-3a577e72d84d/try-android/try_ubuntu64_vm_mobile_test-robocop-1-bm118-tests1-linux64-build489.txt.gz

14:53:10 INFO - mozcrash Downloading symbols from: https://ftp-ssl.mozilla.org/pub/mozilla.org/firefox/try-builds/gbrown@mozilla.com-3a577e72d84d/try-android/fennec-36.0a1.en-US.android-arm.crashreporter-symbols.zip
14:53:10 INFO - mozcrash Saved minidump as /builds/slave/test/build/blobber_upload_dir/405f6533-aa16-6b57-7b30b7af-12dfed05.dmp
14:53:10 WARNING - PROCESS-CRASH | testGeckoProfile | application crashed [None]
14:53:10 INFO - Crash dump filename: /tmp/tmpBpWV3L/405f6533-aa16-6b57-7b30b7af-12dfed05.dmp
14:53:10 INFO - stderr from minidump_stackwalk:
14:53:10 INFO - 2014-11-04 14:12:39: minidump_processor.cc:264: INFO: Processing minidump in file /tmp/tmpBpWV3L/405f6533-aa16-6b57-7b30b7af-12dfed05.dmp
14:53:10 INFO - 2014-11-04 14:12:39: minidump.cc:3815: INFO: Minidump opened minidump /tmp/tmpBpWV3L/405f6533-aa16-6b57-7b30b7af-12dfed05.dmp
14:53:10 INFO - 2014-11-04 14:12:39: minidump.cc:3860: INFO: Minidump not byte-swapping minidump
14:53:10 INFO - 2014-11-04 14:12:39: minidump.cc:4226: INFO: GetStream: type 7 not present
14:53:10 INFO - 2014-11-04 14:12:39: minidump.cc:4226: INFO: GetStream: type 7 not present
14:53:10 INFO - 2014-11-04 14:12:39: minidump.cc:4226: INFO: GetStream: type 1197932545 not present
14:53:10 INFO - 2014-11-04 14:12:39: minidump.cc:4226: INFO: GetStream: type 6 not present
14:53:10 INFO - 2014-11-04 14:12:39: minidump.cc:4226: INFO: GetStream: type 1197932546 not present
14:53:10 INFO - 2014-11-04 14:12:39: minidump.cc:4226: INFO: GetStream: type 4 not present
14:53:10 INFO - 2014-11-04 14:12:39: minidump.cc:4226: INFO: GetStream: type 3 not present
14:53:10 INFO - 2014-11-04 14:12:39: minidump_processor.cc:112: ERROR: Minidump /tmp/tmpBpWV3L/405f6533-aa16-6b57-7b30b7af-12dfed05.dmp has no thread list
14:53:10 INFO - 2014-11-04 14:12:39: minidump.cc:3787: INFO: Minidump closing minidump
14:53:10 INFO - 2014-11-04 14:12:39: minidump_stackwalk.cc:529: ERROR: MinidumpProcessor::Process failed
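Since two distinct failure signatures have now shown up in the minidump_stackwalk stderr ("header signature mismatch" and "has no thread list"), a small Python helper like the following could tally them across try runs. It is only a sketch; the function name and the labels are made up for illustration, and the patterns are taken from the log lines quoted in this bug.

```python
import re

# Patterns copied from the stderr lines quoted in this bug.
FAILURE_PATTERNS = [
    ("bad-header", re.compile(r"Minidump header signature mismatch")),
    ("no-thread-list", re.compile(r"has\s+no thread list")),
]


def classify_stackwalk_stderr(stderr_text):
    """Return a short label for the kind of failure seen in the stderr text."""
    for label, pattern in FAILURE_PATTERNS:
        if pattern.search(stderr_text):
            return label
    if "MinidumpProcessor::Process failed" in stderr_text:
        return "other-failure"
    return "ok"
```

Feeding each job's stderr through this and counting the labels gives the kind of "N of 10 failed" numbers reported above without reading every log by hand.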
I tried eliminating the kill -3, in case ANR generation was interfering with the minidump somehow -- but it did not help. https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=1437d94540e8
(In reply to Jim Chen [:jchen] from comment #4)
> Looking deeper at the minidump, it appears breakpad stops writing the
> minidump somewhere after this point [1]. Because it flushes the header to
> disk at the very end, we're left with a minidump that's missing its header
> and a lot of data. No idea why this happens though.
>
> [1]
> http://mxr.mozilla.org/mozilla-central/source/toolkit/crashreporter/google-breakpad/src/client/linux/minidump_writer/minidump_writer.cc?rev=aa176fcc56b8#481

Thanks Jim. Using my robocop hang (Comment 8), I've been able to reproduce and narrow it down to the /proc/cpuinfo parsing:

http://mxr.mozilla.org/mozilla-central/source/toolkit/crashreporter/google-breakpad/src/client/linux/minidump_writer/minidump_writer.cc?rev=aa176fcc56b8#1242

Still working on it...
We are crashing at http://hg.mozilla.org/mozilla-central/annotate/c0d559389a5c/toolkit/crashreporter/google-breakpad/src/client/linux/minidump_writer/minidump_writer.cc#l1258:

1258         if (!my_strncmp(line, entry->info_name, strlen(entry->info_name))) {

on the first (only) entry.

If I try to call strlen(entry->info_name) at this point, that crashes too. I thought it might be better to call my_strlen instead -- but that crashes just the same. Any attempt to access the content of entry->info_name meets the same fate.

Since entry should be a pointer to cpu_info_table, I tried to access cpu_info_table[0].info_name[0] immediately after it is initialized at http://hg.mozilla.org/mozilla-central/annotate/c0d559389a5c/toolkit/crashreporter/google-breakpad/src/client/linux/minidump_writer/minidump_writer.cc#l1222 -- that crashes too.

1217     struct CpuInfoEntry {
1218       const char* info_name;
1219       int value;
1220       bool found;
1221     } cpu_info_table[] = {
1222       { "processor", -1, false },
1223 #if defined(__i386) || defined(__x86_64)
1224       { "model", 0, false },
1225       { "stepping", 0, false },
1226       { "cpu family", 0, false },
1227 #endif
1228     };

So...stack corruption?

I notice that much of this code was re-written in http://code.google.com/p/google-breakpad/source/detail?r=1160 -- I wonder if an update would help.
It's likely to just be crappy parsing putting bad data in there. If you want to just try pulling that patch in (or pulling in the code wholesale, whatever is easiest), feel free.
I noticed that this crash only happens when breakpad does NOT install an alternate stack for signal handlers at:

http://hg.mozilla.org/mozilla-central/annotate/134d1cfc5c9c/toolkit/crashreporter/google-breakpad/src/client/linux/handler/exception_handler.cc#l151

Breakpad does not install an alternate stack when elfloader has already installed one:

http://hg.mozilla.org/mozilla-central/annotate/134d1cfc5c9c/mozglue/linker/ElfLoader.cpp#l1141

On Pandas, the elfloader alternate stack is frequently not installed because the signalHandlingSlow flag is set. If !signalHandlingSlow, elfloader installs an alternate stack (with size 12K), then breakpad does not install an alternate stack, and (for reasons unknown), the first access to local variable cpu_info_table[0].info_name crashes.

I tried increasing the size of the elfloader alternate stack to 20K -- crashes continued.
I tried increasing the required size of the breakpad alternate stack to 16K (so that breakpad installed a new alternate stack, even when elfloader already installed one) -- crashes continued.
I tried forcing signalHandlingSlow = 1 -- no crashes.
:ryanvm pointed out this recent no-stack failure, where a stack would be useful: http://ftp.mozilla.org/pub/mozilla.org/mobile/tinderbox-builds/mozilla-inbound-android-api-11/1420644839/mozilla-inbound_panda_android_test-mochitest-5-bm102-tests1-panda-build5221.txt.gz This is an Android 4.0 hang during a mochitest and looks just like the robocop hang that I was using as a test case.
I suspect Robocop shutdown has been to blame for some of the bad crash reports. Example:

http://ftp.mozilla.org/pub/mozilla.org/mobile/tinderbox-builds/mozilla-aurora-android-api-9/1421518439/mozilla-aurora_ubuntu64_vm_mobile_test-robocop-2-bm117-tests1-linux64-build20.txt.gz

11:53:48 INFO - TEST-OK | testPrefsObserver | took 55099ms
11:53:48 INFO - TEST-START | Shutdown
11:53:48 INFO - Passed: 20
11:53:48 INFO - Failed: 0
11:53:48 INFO - Todo: 0
11:53:48 INFO - SimpleTest FINISHED
11:53:48 INFO - INFO | automation.py | Application ran for: 0:01:29.325080
11:53:48 INFO - INFO | zombiecheck | Reading PID log: /tmp/tmpkT7k1Opidlog
11:53:48 INFO - Contents of /data/anr/traces.txt:
11:53:48 INFO -
11:53:48 INFO -
11:53:48 INFO - /data/tombstones does not exist; tombstone check skipped
11:53:48 INFO - mozcrash Downloading symbols from: https://ftp-ssl.mozilla.org/pub/mozilla.org/mobile/tinderbox-builds/mozilla-aurora-android-api-9/1421518439/fennec-37.0a2.en-US.android-arm.crashreporter-symbols.zip
11:53:48 INFO - mozcrash Saved minidump as /builds/slave/test/build/blobber_upload_dir/754bb5b0-3d4f-509c-41c75410-38eba852.dmp
11:53:48 WARNING - PROCESS-CRASH | testPrefsObserver | application crashed [None]
11:53:48 INFO - Crash dump filename: /tmp/tmp_qrkPz/754bb5b0-3d4f-509c-41c75410-38eba852.dmp
11:53:48 INFO - stderr from minidump_stackwalk:
11:53:48 INFO - 2015-01-17 11:40:22: minidump_processor.cc:264: INFO: Processing minidump in file /tmp/tmp_qrkPz/754bb5b0-3d4f-509c-41c75410-38eba852.dmp
11:53:48 INFO - 2015-01-17 11:40:22: minidump.cc:3815: INFO: Minidump opened minidump /tmp/tmp_qrkPz/754bb5b0-3d4f-509c-41c75410-38eba852.dmp
11:53:48 INFO - 2015-01-17 11:40:22: minidump.cc:3860: INFO: Minidump not byte-swapping minidump
11:53:48 INFO - 2015-01-17 11:40:22: minidump.cc:4226: INFO: GetStream: type 7 not present
11:53:48 INFO - 2015-01-17 11:40:22: minidump.cc:4226: INFO: GetStream: type 7 not present
11:53:48 INFO - 2015-01-17 11:40:22: minidump.cc:4226: INFO: GetStream: type 1197932545 not present
11:53:48 INFO - 2015-01-17 11:40:22: minidump.cc:4226: INFO: GetStream: type 6 not present
11:53:48 INFO - 2015-01-17 11:40:22: minidump.cc:4226: INFO: GetStream: type 1197932546 not present
11:53:48 INFO - 2015-01-17 11:40:22: minidump.cc:4226: INFO: GetStream: type 4 not present
11:53:48 INFO - 2015-01-17 11:40:22: minidump.cc:4226: INFO: GetStream: type 3 not present
11:53:48 INFO - 2015-01-17 11:40:22: minidump_processor.cc:112: ERROR: Minidump /tmp/tmp_qrkPz/754bb5b0-3d4f-509c-41c75410-38eba852.dmp has no thread list
11:53:48 INFO - 2015-01-17 11:40:22: minidump.cc:3787: INFO: Minidump closing minidump
11:53:48 INFO - 2015-01-17 11:40:22: minidump_stackwalk.cc:529: ERROR: MinidumpProcessor::Process failed

I am not sure, but I think robocop forced the process to exit while the minidump was being written.

Also consider this common Robocop shutdown crash:

http://ftp.mozilla.org/pub/mozilla.org/mobile/tinderbox-builds/mozilla-aurora-android-api-9/1421446979/mozilla-aurora_ubuntu64_vm_mobile_test-robocop-2-bm68-tests1-linux64-build2.txt.gz

16:24:45 INFO - TEST-OK | testInputUrlBar | took 90295ms
16:24:45 INFO - TEST-START | Shutdown
16:24:45 INFO - Passed: 28
16:24:45 INFO - Failed: 0
16:24:45 INFO - Todo: 0
16:24:45 INFO - SimpleTest FINISHED
16:24:45 INFO - INFO | automation.py | Application ran for: 0:02:08.156232
16:24:45 INFO - INFO | zombiecheck | Reading PID log: /tmp/tmpsffuhfpidlog
16:24:45 INFO - Contents of /data/anr/traces.txt:
16:24:45 INFO -
16:24:45 INFO -
16:24:45 INFO - /data/tombstones does not exist; tombstone check skipped
16:24:45 INFO - mozcrash Downloading symbols from: https://ftp-ssl.mozilla.org/pub/mozilla.org/mobile/tinderbox-builds/mozilla-aurora-android-api-9/1421446979/fennec-37.0a2.en-US.android-arm.crashreporter-symbols.zip
16:24:45 INFO - mozcrash Saved minidump as /builds/slave/test/build/blobber_upload_dir/7e9f24ca-e761-e646-43d8c972-7c28af0c.dmp
16:24:45 INFO - mozcrash Saved app info as /builds/slave/test/build/blobber_upload_dir/7e9f24ca-e761-e646-43d8c972-7c28af0c.extra
16:24:45 WARNING - PROCESS-CRASH | testInputUrlBar | application crashed [@ libui.so + 0x1befe]
16:24:45 INFO - Crash dump filename: /tmp/tmpQCz9BW/7e9f24ca-e761-e646-43d8c972-7c28af0c.dmp
16:24:45 INFO - Operating system: Android
16:24:45 INFO - 0.0.0 Linux 2.6.29-ge3d684d #1 Mon Dec 16 22:26:51 UTC 2013 armv7l generic/sdk/generic:2.3.7/GINGERBREAD/eng.ubuntu.20140123.014351:eng/test-keys
16:24:45 INFO - CPU: arm
16:24:45 INFO - 0 CPUs
16:24:45 INFO -
16:24:45 INFO - Crash reason: SIGSEGV
16:24:45 INFO - Crash address: 0x2
16:24:45 INFO -
16:24:45 INFO - Thread 0 (crashed)
16:24:45 INFO - 0 libui.so + 0x1befe
16:24:45 INFO - r4 = 0x0024ed30 r5 = 0x00000002 r6 = 0x00000001 r7 = 0x00000002
16:24:45 INFO - r8 = 0xbeac5460 r9 = 0x4428ca78 r10 = 0x0000abe0 fp = 0xaca9f368
16:24:45 INFO - sp = 0xbeac53e8 lr = 0xac712bfd pc = 0xab91befe
16:24:45 INFO - Found by: given as instruction pointer in context
16:24:45 INFO - 1 libsurfaceflinger_client.so + 0x1977a
16:24:45 INFO - sp = 0xbeac53fc pc = 0xac71977c
16:24:45 INFO - Found by: stack scanning
16:24:45 INFO - 2 libsurfaceflinger_client.so + 0x12bfb
16:24:45 INFO - sp = 0xbeac5400 pc = 0xac712bfd
16:24:45 INFO - Found by: stack scanning
16:24:45 INFO - 3 dalvik-heap (deleted) + 0x5e7eee
16:24:45 INFO - sp = 0xbeac540c pc = 0x405f0ef0
16:24:45 INFO - Found by: stack scanning

In this case, breakpad did its job and we produced a perfectly accurate crash dump, but the stack might be perceived as "unusable" since the crash happens in Android system libs.

Recent efforts in bug 1105388 seem to effectively eliminate Robocop shutdown crashes, addressing both of these cases indirectly.
(In reply to Treeherder Robot from comment #18)
> log: https://treeherder.mozilla.org/logviewer.html#?repo=mozilla-inbound&job_id=8327010

Robocop shutdown crash:

21:14:07 INFO - TEST-OK | testAddonManager | took 76896ms
21:14:07 INFO - TEST-START | Shutdown
21:14:07 INFO - Passed: 20
21:14:07 INFO - Failed: 0
21:14:07 INFO - Todo: 0
21:14:07 INFO - SimpleTest FINISHED
21:14:27 INFO - INFO | automation.py | Application ran for: 0:01:47.781898
21:14:27 INFO - INFO | zombiecheck | Reading PID log: /tmp/tmp1kQORWpidlog
21:14:27 INFO - Contents of /data/anr/traces.txt:
21:14:28 INFO - mozcrash Downloading symbols from: https://ftp-ssl.mozilla.org/pub/mozilla.org/mobile/tinderbox-builds/mozilla-inbound-android-api-11/1427858680/fennec-40.0a1.en-US.android-arm.crashreporter-symbols.zip
21:14:30 INFO - mozcrash Saved minidump as /builds/panda-0094/test/build/blobber_upload_dir/29e4d5b3-1d60-8aa3-4725dfd5-33e5b5fc.dmp
21:14:30 INFO - mozcrash Saved app info as /builds/panda-0094/test/build/blobber_upload_dir/29e4d5b3-1d60-8aa3-4725dfd5-33e5b5fc.extra
21:14:30 WARNING - PROCESS-CRASH | testAddonManager | application crashed [None]
21:14:30 INFO - Crash dump filename: /tmp/tmpUhePxF/29e4d5b3-1d60-8aa3-4725dfd5-33e5b5fc.dmp
21:14:30 INFO - stderr from minidump_stackwalk:
21:14:30 INFO - 2015-03-31 21:14:30: minidump_processor.cc:264: INFO: Processing minidump in file /tmp/tmpUhePxF/29e4d5b3-1d60-8aa3-4725dfd5-33e5b5fc.dmp
21:14:30 INFO - 2015-03-31 21:14:30: minidump.cc:3815: INFO: Minidump opened minidump /tmp/tmpUhePxF/29e4d5b3-1d60-8aa3-4725dfd5-33e5b5fc.dmp
21:14:30 INFO - 2015-03-31 21:14:30: minidump.cc:3847: ERROR: Minidump header signature mismatch: (0x0, 0x0) != 0x504d444d
21:14:30 INFO - 2015-03-31 21:14:30: minidump_processor.cc:268: ERROR: Minidump /tmp/tmpUhePxF/29e4d5b3-1d60-8aa3-4725dfd5-33e5b5fc.dmp could not be read
21:14:30 INFO - 2015-03-31 21:14:30: minidump.cc:3787: INFO: Minidump closing minidump
21:14:30 INFO - 2015-03-31 21:14:30: minidump_stackwalk.cc:529: ERROR: MinidumpProcessor::Process failed
(In reply to Treeherder Robot from comment #19)
> log:
> https://treeherder.mozilla.org/logviewer.html#?repo=mozilla-inbound&job_id=8327061

Interesting!

21:43:08 INFO - 03-31 21:42:36.453 D/GeckoAppShell( 2088): Killing via System.exit()
21:43:08 INFO - 03-31 21:42:36.453 V/TabletStatusBar( 1500): setLightsOn(true)
21:43:08 INFO - 03-31 21:42:36.460 E/JavaBinder( 2088): Unknown binder error code. 0xfffffff7
21:43:08 INFO - 03-31 21:42:36.460 E/JavaBinder( 2088): Unknown binder error code. 0xfffffff7
21:43:08 INFO - 03-31 21:42:36.468 F/MOZ_CRASH( 2088): Hit MOZ_CRASH(Unexpected shutdown) at /builds/slave/m-in-and-api-11-d-000000000000/build/mozglue/linker/ElfLoader.cpp:514
21:43:08 INFO - 03-31 21:42:36.468 F/libc ( 2088): Fatal signal 11 (SIGSEGV) at 0x00000000 (code=1)

That's http://hg.mozilla.org/mozilla-central/annotate/cf8864126c58/mozglue/linker/ElfLoader.cpp#l514, added for bug 1127464.

:snorp -- thoughts?
Flags: needinfo?(snorp)
Geoff, do you know how GeckoAppShell.systemExit() is being called? Is it from an addon? It looks like we're calling exit() without exiting the main loop. That's when that assert gets hit.
Flags: needinfo?(snorp)
I never did get around to following up on interesting findings like comments 13 and 15. I don't recall seeing any corrupt dumps for 4.0 (or elsewhere) for months now. And now that Android 4.0 tests are practically eliminated, wontfix seems the sensible resolution here.
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → WONTFIX
Product: Firefox for Android → Firefox for Android Graveyard