Closed Bug 916923 Opened 11 years ago Closed 11 years ago

Crash dumps not working for Android x86 emulator tests

Categories

(Testing :: General, defect)

x86_64
Linux
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED
mozilla27

People

(Reporter: gbrown, Assigned: froydnj)

References

Details

Attachments

(3 files, 2 obsolete files)

From :ted in https://bugzilla.mozilla.org/show_bug.cgi?id=895186#c186: > This isn't a symbols issue, this dump looks completely broken. If you can grab one of > these minidumps off of a slave and attach it here or somewhere else we can poke at it. Dump file attached. This was taken from an M3 test run against http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/gbrown@mozilla.com-3f363a773eed.
Assignee: nobody → ted
We're not writing out a good module list in this dump: 2013-09-16 15:04:46: minidump.cc:2706: ERROR: MinidumpModuleList could not store module 151/152, libmozalloc.so, 0xa9f0e000+0x4000 2013-09-16 15:04:46: minidump.cc:4613: ERROR: GetStream could not read stream type 4 2013-09-16 15:04:46: minidump_dump.cc:115: ERROR: minidump.GetModuleList() failed Is this just elfhack screwing us up again? /proc/self/maps from the dump shows: a9f0a000-a9f0e000 rw-p a9f0a000 00:00 0 a9f0e000-a9f10000 r-xs 00000000 00:08 2977 /dev/ashmem/libmozalloc.so (del eted) a9f10000-a9f12000 rw-s 00001000 00:08 2977 /dev/ashmem/libmozalloc.so (del eted) a9f12000-a9f15000 rw-s 00000000 00:08 2977 /dev/ashmem/libmozalloc.so (del eted)
Ah, that'd be on demand decompression messing with breakpad. It should happen on arm, too :-/ I guess it's time for ted to review those breakpad patches ;)
Aha, now i remember. This is something that did happen on arm and that we worked around here: https://mxr.mozilla.org/mozilla-central/source/mozglue/linker/Mappable.cpp#185 But that only takes care of the increasing addresses case, and that doesn't work on x86 where the addresses are decreasing. The easy work around here (short of getting ted to review those breakpad patches) is to add another similar guard page at the beginning. Nathan, would you mind writing the patch for this? I can't stay the only one writing code under mozglue/linker (and you the only one reviewing).
Flags: needinfo?(nfroyd)
(In reply to Mike Hommey [:glandium] from comment #3) > Aha, now i remember. This is something that did happen on arm and that we > worked around here: > https://mxr.mozilla.org/mozilla-central/source/mozglue/linker/Mappable. > cpp#185 > > But that only takes care of the increasing addresses case, and that doesn't > work on x86 where the addresses are decreasing. The easy work around here > (short of getting ted to review those breakpad patches) is to add another > similar guard page at the beginning. > > Nathan, would you mind writing the patch for this? I can't stay the only one > writing code under mozglue/linker (and you the only one reviewing). I can try writing the patch for this, but I'll either need to get an x86 Android device or have someone willing to test potentially broken patches for me. (And I won't be able to get to it this week.)
Flags: needinfo?(nfroyd)
(In reply to Nathan Froyd (:froydnj) (AFK 16-20 September) from comment #4) > (In reply to Mike Hommey [:glandium] from comment #3) > I can try writing the patch for this, but I'll either need to get an x86 > Android device or have someone willing to test potentially broken patches > for me. (And I won't be able to get to it this week.) I can test patches for you.
Android x86 test sets hidden on {mozilla-central, mozilla-inbound, b2g-inbound, fx-team, try} due to this bug (per bug 895186 comment 196).
Blocks: 917361
No longer blocks: 895186
Blocks: 914439, 907383
(In reply to Phil Ringnalda (:philor) from comment #7) > Interestingly, this is intermittent - > https://tbpl.mozilla.org/php/getParsedLog.php?id=28007387&tree=Mozilla- > Inbound is busted, > https://tbpl.mozilla.org/php/getParsedLog.php?id=27990168&tree=Mozilla- > Inbound is what that crash should have been. This is not entirely surprising, as this depends on the memory layout. It is very possible that in some cases, memory is laid in such a way that makes the library mappings and the decompression mappings not adjacent, and that would stop confusing breakpad.
From IRC: ted: i have no plans to fix that, i think glandium or froydnj is going to Any volunteers to own the bug?
Assignee: ted → nobody
Nathan said he'd write the patch in comment 4
Assignee: nobody → nfroyd
Mike, I think this is what you had in mind in comment 3. I was going to try to add just a single page on each platform as appropriate, but two pages is easier to think about and read the code for. Geoff, could you please test this patch? I think it works ok on my local ARM device.
Attachment #808727 - Flags: review?(mh+mozilla)
Attachment #808727 - Flags: feedback?(gbrown)
Hm, this page actually fails to load libraries properly; the headers of the libraries turn up to be all zeros...not a good thing.
This patch seems to work better; I can actually browse pages on ARM now. The remapping of the area we want to use seemingly should have been done regardless; the mmap(2) man page says: MAP_FIXED Don't interpret addr as a hint: place the mapping at exactly that address. addr must be a multiple of the page size. If the memory region specified by addr and len overlaps pages of any existing mapping(s), then the overlapped part of the existing mapping(s) will be discarded. If the specified address cannot be used, mmap() will fail. Because requiring a fixed address for a mapping is less portable, the use of this option is discouraged. In particular, the remapping of the last page that we were doing previously ought to have deleted the mapping we just set up. Perhaps the kernel is only zealous about checking when regions have identical start addresses (as the remap of the first page here does) and is less worried about remapping within the region...? Geoff, if you would test this; I think it works better than the last one. :)
Attachment #808727 - Attachment is obsolete: true
Attachment #808727 - Flags: review?(mh+mozilla)
Attachment #808727 - Flags: feedback?(gbrown)
Attachment #808822 - Flags: review?(mh+mozilla)
Attachment #808822 - Flags: feedback?(gbrown)
Blocks: 919784
Comment on attachment 808822 [details] [diff] [review] work around crash reporter issues with adjacent memory mappings on x86 android Review of attachment 808822 [details] [diff] [review]: ----------------------------------------------------------------- ::: mozglue/linker/Mappable.cpp @@ +191,5 @@ > + * and on others, such mappings are placed at adjacent, decreasing > + * memory addresses (x86). It is more convenient to just do two pages > + * everywhere than to twiddle with platform-specific #defines. > + */ > + void *buf = ::mmap(NULL, length + 2 * PAGE_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0); Since you're actually (re)mapping further below, you might as well map anonymous PROT_NONE memory here, and mmap MAP_FIXED for the actual buffer as you're doing now. And that'd remove the need for the two mmaps for the first and last page, since you'd be mapping them here. @@ +196,2 @@ > if (buf != MAP_FAILED) { > + DEBUG_LOG("Decompression buffer of size %x in ashmem \"%s\", mapped @%p", 0x%x @@ +201,5 @@ > + char *last_page = map_page + ((length + PAGE_SIZE - 1) & PAGE_MASK); > + int flags = MAP_FIXED | MAP_PRIVATE | MAP_ANONYMOUS; > + ::mmap(first_page, PAGE_SIZE, PROT_NONE, flags, -1, 0); > + ::mmap(last_page, PAGE_SIZE, PROT_NONE, flags, -1, 0); > + trailing whitespaces. @@ +204,5 @@ > + ::mmap(last_page, PAGE_SIZE, PROT_NONE, flags, -1, 0); > + > + void *actual_buf = ::mmap(map_page, last_page - map_page, PROT_READ | PROT_WRITE, > + MAP_FIXED | MAP_SHARED, fd, 0); > + DEBUG_LOG("Actual usable buffer of size %x in ashmem \"%s\", mapped @%p", I'd just move the Decompression buffer... message here.
Attachment #808822 - Flags: review?(mh+mozilla) → feedback+
One large anonymous mmap with a fixed shared mmap in the middle coming right up.
Attachment #808822 - Attachment is obsolete: true
Attachment #808822 - Flags: feedback?(gbrown)
Attachment #808916 - Flags: review?(mh+mozilla)
Attachment #808916 - Flags: feedback?(gbrown)
Comment on attachment 808916 [details] [diff] [review] work around crash reporter issues with adjacent memory mappings on x86 android Review of attachment 808916 [details] [diff] [review]: ----------------------------------------------------------------- ::: mozglue/linker/Mappable.cpp @@ +201,5 @@ > + > + void *actual_buf = ::mmap(map_page, last_page - map_page, PROT_READ | PROT_WRITE, > + MAP_FIXED | MAP_SHARED, fd, 0); > + if (actual_buf == MAP_FAILED) { > + DEBUG_LOG("Fixed allocation of decompression buffer at %p failed", map_page); may want to unmap buf, here.
Attachment #808916 - Flags: review?(mh+mozilla) → review+
Comment on attachment 808916 [details] [diff] [review] work around crash reporter issues with adjacent memory mappings on x86 android Review of attachment 808916 [details] [diff] [review]: ----------------------------------------------------------------- ::: mozglue/linker/Mappable.cpp @@ +202,5 @@ > + void *actual_buf = ::mmap(map_page, last_page - map_page, PROT_READ | PROT_WRITE, > + MAP_FIXED | MAP_SHARED, fd, 0); > + if (actual_buf == MAP_FAILED) { > + DEBUG_LOG("Fixed allocation of decompression buffer at %p failed", map_page); > + retun NULL; s/retun/return/ !!
(In reply to Geoff Brown [:gbrown] from comment #17) > Comment on attachment 808916 [details] [diff] [review] > work around crash reporter issues with adjacent memory mappings on x86 > android > > Review of attachment 808916 [details] [diff] [review]: > ----------------------------------------------------------------- > > ::: mozglue/linker/Mappable.cpp > @@ +202,5 @@ > > + void *actual_buf = ::mmap(map_page, last_page - map_page, PROT_READ | PROT_WRITE, > > + MAP_FIXED | MAP_SHARED, fd, 0); > > + if (actual_buf == MAP_FAILED) { > > + DEBUG_LOG("Fixed allocation of decompression buffer at %p failed", map_page); > > + retun NULL; > > s/retun/return/ > !! I know, discovered that in my try run. :( Fixed version running at: https://tbpl.mozilla.org/?tree=Try&rev=2dbed6097358
Operating system: Android 0.0.0 Linux 2.6.29 #2 PREEMPT Thu Nov 15 13:38:07 CST 2012 i686 generic_x86/sdk_x86/generic_x86:4.2/JOP40C/eng.and roid-build.20121231.103448:eng/test-keys CPU: x86 GenuineIntel family 6 model 3 stepping 3 1 CPU Crash reason: SIGSEGV Crash address: 0x4 Thread 47 (crashed) 0 libcutils.so + 0x7449 eip = 0xb7f7e449 esp = 0x9b52ba4c ebp = 0x9b52baac ebx = 0xb7f75ff4 esi = 0x00000000 edi = 0x9b52bb24 eax = 0x00000001 ecx = 0xb8061740 edx = 0x00000004 efl = 0x00200282 Found by: given as instruction pointer in context Thread 0 0 libc.so + 0x26997 eip = 0xb7feb997 esp = 0xbfe989dc ebp = 0xb9f41ab4 ebx = 0x0000001b esi = 0xffffffff edi = 0xffffffff eax = 0xfffffffc ecx = 0xbfe98a7c edx = 0x00000010 efl = 0x00000246 Found by: given as instruction pointer in context 1 libutils.so + 0x25ba1 eip = 0xb7f69ba2 esp = 0xbfe989f0 ebp = 0xb9f41ab4 Found by: stack scanning 2 libc.so + 0x132e9 eip = 0xb7fd82ea esp = 0xbfe98a10 ebp = 0xb9f41ab4 Found by: stack scanning 3 libc.so + 0x10cf9 eip = 0xb7fd5cfa esp = 0xbfe98a20 ebp = 0xb9f41ab4 Found by: stack scanning 4 libc.so + 0x10ca7 eip = 0xb7fd5ca8 esp = 0xbfe98a30 ebp = 0xb9f41ab4 Found by: stack scanning 5 libnss3.so!pt_PostNotifies [ptsynch.c:59411322cbbb : 113 + 0x7] eip = 0xa73b3c08 esp = 0xbfe98a40 ebp = 0xb9f41ab4 Found by: stack scanning 6 libdvm.so + 0x7eac5 eip = 0xb72c9ac6 esp = 0xbfe98a48 ebp = 0xb9f41ab4 Found by: stack scanning 7 libnss3.so!pt_PostNotifies [ptsynch.c:59411322cbbb : 70 + 0xa] eip = 0xa73b3b4b esp = 0xbfe98aac ebp = 0xb9f41ab4 Found by: stack scanning 8 libnss3.so!PR_Unlock [ptsynch.c:59411322cbbb : 208 + 0xb] eip = 0xa73b3cf5 esp = 0xbfe98ac0 ebp = 0xb9f41ab4 Found by: stack scanning 9 libxul.so!nsAString_internal::SetCapacity(unsigned int, mozilla::fallible_t const&) [nsTSubstring.cpp:59411322cbbb : 546 + 0xe] eip = 0xa23de09b esp = 0xbfe98acc ebp = 0xb9f41ab4 Found by: stack scanning 10 libxul.so!mozilla::BaseAutoLock<mozilla::Mutex>::~BaseAutoLock() [Mutex.h:59411322cbbb : 79 + 0xb] eip = 0xa16586e9 esp = 0xbfe98ae0 ebp = 0xb9f41ab4 Found by: stack scanning 11 libnss3.so!PR_NotifyCondVar [ptsynch.c:59411322cbbb : 413 + 0x7] eip = 0xa73b3e72 esp = 0xbfe98af0 ebp = 0xb9f41ab4 Found by: stack scanning 12 libxul.so!mozilla::BaseAutoLock<mozilla::Mutex>::~BaseAutoLock() [Mutex.h:59411322cbbb : 176 + 0x8] eip = 0xa16586d3 esp = 0xbfe98af4 ebp = 0xb9f41ab4 Found by: stack scanning 13 libxul.so!nsAppShell::NotifyNativeEvent() [nsAppShell.cpp:59411322cbbb : 163 + 0xa] eip = 0xa20073d7 esp = 0xbfe98b00 ebp = 0xb9f41ab4 Found by: stack scanning 14 libxul.so!mozilla::BaseAutoLock<mozilla::Mutex>::~BaseAutoLock() [Mutex.h:59411322cbbb : 79 + 0xb] eip = 0xa16586e9 esp = 0xbfe98b10 ebp = 0xb9f41ab4 Found by: stack scanning 15 libxul.so!nsAppShell::PostEvent(mozilla::AndroidGeckoEvent*) [nsAppShell.cpp:59411322cbbb : 769 + 0x7] eip = 0xa2007b56 esp = 0xbfe98b30 ebp = 0xb9f41ab4 Found by: stack scanning 16 libutils.so + 0x25ad8 eip = 0xb7f69ad9 esp = 0xbfe98b5c ebp = 0xb9f41ab4 Found by: stack scanning 17 libandroid_runtime.so + 0x11aff3 eip = 0xb7ec8ff4 esp = 0xbfe98b60 ebp = 0xb9f41ab4 Found by: stack scanning 18 libutils.so + 0x2616b eip = 0xb7f6a16c esp = 0xbfe98b70 ebp = 0xb9f41ab4 Found by: stack scanning 19 libbinder.so + 0x1f6ff eip = 0xb7ef1700 esp = 0xbfe98b80 ebp = 0xb9f41ab4 Found by: stack scanning ...
Attachment #808916 - Flags: feedback?(gbrown) → feedback+
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla27
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: