Closed
Bug 705154
Opened 13 years ago
Closed 13 years ago
Crash in chromehang | mach_msg_trap | ABORT: HangMonitor triggered
Categories
(Core :: XPCOM, defect)
RESOLVED
FIXED
mozilla11
People
(Reporter: bc, Assigned: benjamin)
References
Details
(Keywords: crash, regression)
Crash Data
Attachments
(2 files)
(deleted), patch | Gavin: review+
(deleted), patch | smichaud: review+
This bug was filed from the Socorro interface and is report bp-a75ba7c2-ab2e-4bc5-8e2c-eff802111124.
=============================================================
Also bp-00403124-e021-4bfa-b7b2-0a2fd2111124
0 libmozalloc.dylib mozalloc_abort memory/mozalloc/mozalloc_abort.cpp:66
1 XUL NS_DebugBreak_P xpcom/base/nsDebugImpl.cpp:388
2 XUL mozilla::HangMonitor::ThreadMain xpcom/threads/HangMonitor.cpp:111
3 libnspr4.dylib _pt_root nsprpub/pr/src/pthreads/ptthread.c:187
4 libSystem.B.dylib _pthread_start
5 libSystem.B.dylib thread_start
I did have some memory-intensive pages loaded, but I've been doing that for several days and have not seen this abort before.
Reporter | ||
Updated•13 years ago
|
Component: General → XPCOM
Product: Firefox → Core
QA Contact: general → xpcom
Version: unspecified → Trunk
Reporter | ||
Comment 1•13 years ago
|
||
This started just after I updated to today's Nightly. I've been crashing regularly every few minutes since. I've been disabling extensions one by one to see if that helps. I looked at the push log for the last couple of days but didn't see anything that stood out as a possible cause.
Keywords: regression
Comment 2•13 years ago
|
||
Every crash that has this stack trace has chromehang in its crash signature. The second term in the crash signature is the first frame in thread 0.
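As a rough illustration of how such a signature is put together, here is a minimal Python sketch. It assumes the signature is simply the `chromehang` prefix joined to the top frame of thread 0 with ` | `. The function name `hang_signature` and the frame list are hypothetical, and Socorro's real signature generation is more involved than this.

```python
def hang_signature(thread0_frames, prefix="chromehang"):
    """Join the hang prefix with the top frame of thread 0.

    thread0_frames is a list of function names, innermost frame first.
    """
    if not thread0_frames:
        # No usable stack on thread 0: only the prefix is known.
        return prefix
    return f"{prefix} | {thread0_frames[0]}"


# Illustrative frame list, innermost frame first.
frames = ["mach_msg_trap", "mach_msg"]
print(hang_signature(frames))
```

With the frames above, the result matches the summary this bug was renamed to.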
Summary: crash chromehang | ABORT: HangMonitor triggered → Crash in chromehang | mach_msg_trap
Reporter | ||
Updated•13 years ago
|
Summary: Crash in chromehang | mach_msg_trap → Crash in chromehang | mach_msg_trap | ABORT: HangMonitor triggered
Hi,
Mainly updating to add myself to the CC-list for this bug.
FWIW, I filed Bug 705003, which might be related. I saw HangMonitor aborts right after the Bug 429592 patches were incorporated into the tinderbox builds. Most of my Nightly crash reports failed to send (about:crashes says it cannot find most of those OOIDs), but one seems to have made it through:
https://crash-stats.mozilla.com/report/index/bp-305eb917-2c82-4245-9f0d-ad1ad2111124
... which apparently now has a pointer to this very bug-report.
I also posted to the nightly discussion mailing list, hopefully to forewarn people that this would likely be seen in the next "official" Nightly builds.
BTW, starting Nightly via CLI with -safe-mode did *not* help with this bug; eventually we still got 'pop'ed. ;)
(I went back to a tinderbox build before the 429592 patches were applied.)
HTH
Reporter | ||
Comment 4•13 years ago
|
||
sci-fi, thanks. If you open the crash report from about:crashes and hit reload a few times, it will probably be submitted. That's what I have to do. Bug 705003 definitely looks like a candidate.
Blocks: hang-detector
Hi,
Thank you to :bc: for the clue how to re-re-…-submit the about:crashes reports. Seems mine are all finally recorded.
Here's a list of the 13 reports I have, sectioned according to "Signature" and "Build ID"[1] fields:
Build ID 20111123101127
@ chromehang | TSFNTFont::GetFormat() const
https://crash-stats.mozilla.com/report/index/bp-22e7fd4f-c853-4e70-ab86-3e4202111124
Build ID 20111123111426
@ chromehang | TSFNTFont::GetFormat() const
https://crash-stats.mozilla.com/report/index/bp-0055ed42-8ce4-4dbc-a4bf-760e72111124
https://crash-stats.mozilla.com/report/index/bp-9b09a418-0f0a-4893-ac4a-ccc692111124
Build ID 20111123101127
@ chromehang | mach_msg_trap
https://crash-stats.mozilla.com/report/index/bp-2a77d2f6-39ad-4374-88c7-9c1b72111124
https://crash-stats.mozilla.com/report/index/bp-21f7503b-a058-44c9-a6fc-440042111124
https://crash-stats.mozilla.com/report/index/bp-462950ca-a71b-4f1d-8417-2f2b62111124
https://crash-stats.mozilla.com/report/index/bp-75ba6b07-0c1e-4d08-b124-133472111124
https://crash-stats.mozilla.com/report/index/bp-4a252859-0489-4894-9dc8-df6492111124
https://crash-stats.mozilla.com/report/index/bp-a16632b3-80f9-4fcc-89ed-90fec2111124
Build ID 20111123111426
@ chromehang | mach_msg_trap
https://crash-stats.mozilla.com/report/index/bp-088853f0-0766-4999-a578-19ada2111124
https://crash-stats.mozilla.com/report/index/bp-521bc212-cbc6-4c54-8e01-6f05a2111124
https://crash-stats.mozilla.com/report/index/bp-15a08077-fc70-41fc-b962-9d8e52111124
https://crash-stats.mozilla.com/report/index/bp-305eb917-2c82-4245-9f0d-ad1ad2111124
[1] - fetched from <https://ftp.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-central-macosx64/?C=M;O=D>
Bug 705154 seems to be for the "mach_msg_trap" signature.
If "TSFNTFont::GetFormat" is not related, could we perhaps handle it in Bug 705003 to give it a "personality"?
(I'll post this list to both bugs.)
I only get this crash if I let Firefox sit idle for a few minutes. If I use it constantly, I don't get any crash; if I step away from the computer, it crashes.
Updated•13 years ago
|
No longer blocks: hang-detector
Depends on: hang-detector
Updated•13 years ago
|
Blocks: hang-detector
No longer depends on: hang-detector
Comment 7•13 years ago
|
||
A side note that might not be related: I also experienced some system crashes today on my 17-inch, Late 2006 iMac with 10.7.2 and an ATI Radeon X1600 128 MB. The crash was in plugin-container. Unfortunately, I didn't capture the info in time, because the system crashed again and gave no detailed info the second time.
Confirming that the crash happens with Nightly in the background.
Assignee | ||
Comment 8•13 years ago
|
||
This bug appears as if the hang detector might be malfunctioning, but none of the crash reports have a usable stack on thread 0 (which is the interesting thread). I propose to disable the hang detector on mac for this weekend's nightlies so that Ted and I can loop back around on Tuesday to figure out why we aren't getting better stacks. It may be that we need to get symbols for OS libraries.
Assignee | ||
Comment 9•13 years ago
|
||
Tagging a few possible reviewers of the temporary disablement, but if there's somebody else around who can review please feel free.
Attachment #576996 -
Flags: review?(smichaud)
Attachment #576996 -
Flags: review?(jmathies)
Attachment #576996 -
Flags: review?(gavin.sharp)
Assignee | ||
Comment 10•13 years ago
|
||
Please ignore the xpcom/ bits of this patch, they are for a different bug.
Assignee | ||
Updated•13 years ago
|
Assignee: nobody → benjamin
Comment 11•13 years ago
|
||
Comment on attachment 576996 [details] [diff] [review]
Disable the hang monitor on mac, rev. 1
(it'd be nice if the #ifndef DEBUG was an #ifdef instead, easier to read that way IMO)
Attachment #576996 -
Flags: review?(gavin.sharp) → review+
Comment 12•13 years ago
|
||
I'm also getting this quite frequently when I leave Firefox in the background. Here are some of my crash reports:
http://crash-stats.mozilla.com/report/index/bp-41e2788b-82f0-41be-bf84-6e76f2111125
http://crash-stats.mozilla.com/report/index/bp-535b250a-1a4b-4aa2-a630-4a8712111125
http://crash-stats.mozilla.com/report/index/bp-4ab0bfbc-983d-40ea-9a17-dd1792111125
They all have CoreFoundation@0x4c901 in the main thread which translates to:
> atos -l 0x0 -o /System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation 0x4c901
CFDictionaryRemoveAllValues (in CoreFoundation) + 17
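For readers unfamiliar with what `atos` does here: it maps an address within the library to the nearest symbol that starts at or before it, plus a byte offset. A minimal Python sketch of that lookup follows. The symbol table is hypothetical; only the `CFDictionaryRemoveAllValues` start address is derived from the output above (0x4c901 minus 17), and the neighbouring entries are placeholders rather than real CoreFoundation symbols.

```python
import bisect


def symbolicate(address, symbols):
    """Return 'name + offset' for the closest symbol at or below address.

    symbols is a list of (start_address, name) tuples sorted by address.
    """
    starts = [start for start, _ in symbols]
    index = bisect.bisect_right(starts, address) - 1
    if index < 0:
        return None  # address lies before the first known symbol
    start, name = symbols[index]
    return f"{name} + {address - start}"


# Hypothetical table: the neighbours are placeholders for illustration only.
symbols = [
    (0x4C000, "placeholder_before"),
    (0x4C8F0, "CFDictionaryRemoveAllValues"),
    (0x4CA00, "placeholder_after"),
]
print(symbolicate(0x4C901, symbols))
```

A lookup like this is only as good as the address fed into it, so a plausible-looking symbol does not by itself prove that the frame is real.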
Assignee | ||
Comment 13•13 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/2729a78cd35e
Once we hit unlabeled addresses we're not walking the stack correctly, and I really want to see what is "above" all this on the stack. Leaving this bug open to track the real problem and reenable the hang monitor.
Status: NEW → ASSIGNED
Comment 14•13 years ago
|
||
(In reply to Benjamin Smedberg [:bsmedberg] from comment #8)
> This bug appears as if the hang detector might be malfunctioning, but none
> of the crash reports have a usable stack on thread 0 (which is the
> interesting thread). I propose to disable the hang detector on mac for this
> weekend's nightlies so that Ted and I can loop back around on Tuesday to
> figure out why we aren't getting better stacks. It may be that we need to
> get symbols for OS libraries.
Why aren't we backing this out rather than putting in band-aids like this?
Assignee | ||
Comment 15•13 years ago
|
||
Why would we back it out when the pref was specifically designed so that we could disable it? It's still giving valuable data on Windows/Linux.
Comment 16•13 years ago
|
||
(In reply to Benjamin Smedberg [:bsmedberg] from comment #15)
> Why would we back it out when the pref was specifically designed so that we
> could disable it? It's still giving valuable data on Windows/Linux.
Valuable data == user crashes. This feature has resulted in almost an order of magnitude increase in crashes on Windows:
https://bugzilla.mozilla.org/show_bug.cgi?id=429592#c117
Like all regressions, this should be backed out or disabled by default on all platforms.
Comment 18•13 years ago
|
||
Comment on attachment 576996 [details] [diff] [review]
Disable the hang monitor on mac, rev. 1
http://hg.mozilla.org/mozilla-central/rev/2729a78cd35e
Attachment #576996 -
Flags: review?(smichaud)
Attachment #576996 -
Flags: review?(jmathies)
Comment 19•13 years ago
|
||
(In reply to John Daggett (:jtd) from comment #16)
> Valuable data == user crashes.
Nightly user crashes. This feature's goal was to turn hangs into crashes so that we could track them - that necessarily involves an increase in crash reports. Assuming the functionality is working as expected (an assumption that apparently might not hold true on Mac), there's no reason to back it out solely because the crash count increased. We do need to investigate the crash reports, of course...
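To make the "turn hangs into crashes" idea concrete, here is a minimal Python sketch of a watchdog check. It is not the code in xpcom/threads/HangMonitor.cpp; the class name, the method names, and the 30-second timeout are assumptions for illustration. The main thread records a timestamp whenever it handles an event, and a separate watchdog compares that timestamp against the clock.

```python
class HangWatchdog:
    """Tracks the last time the main thread reported activity."""

    def __init__(self, timeout_seconds):
        self.timeout = timeout_seconds
        self.last_activity = None  # None means "nothing to watch yet"

    def notify_activity(self, now):
        # Called by the main thread each time it starts handling an event.
        self.last_activity = now

    def is_hung(self, now):
        # Called periodically from the watchdog thread.
        if self.last_activity is None:
            return False
        return now - self.last_activity > self.timeout


watchdog = HangWatchdog(timeout_seconds=30)  # hypothetical timeout
watchdog.notify_activity(now=100.0)
print(watchdog.is_hung(now=110.0))  # False: active 10 s ago
print(watchdog.is_hung(now=140.0))  # True: no activity for 40 s
```

A real monitor would deliberately abort at the point where `is_hung` returns true, so that the crash reporter captures the stacks of all threads. That is why every report counted this way shows up as a crash.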
Status: ASSIGNED → NEW
Comment 20•13 years ago
|
||
(In reply to John Daggett (:jtd) from comment #16)
> https://bugzilla.mozilla.org/show_bug.cgi?id=429592#c117
Sorry, I wasn't up to date on the comments in that bug when I wrote my last reply - it's obviously not a simple tradeoff, and the discussion there is much more nuanced. Forget I said anything!
Status: NEW → ASSIGNED
Assignee | ||
Comment 21•13 years ago
|
||
The stack which I didn't account for is:
#0 0x00007fff863cad7a in mach_msg_trap ()
#1 0x00007fff863cb3ed in mach_msg ()
#2 0x00007fff8060a902 in __CFRunLoopRun ()
#3 0x00007fff80609d8f in CFRunLoopRunSpecific ()
#4 0x00007fff8587574e in RunCurrentEventLoopInMode ()
#5 0x00007fff85875553 in ReceiveNextEventCommon ()
#6 0x00007fff8587540c in BlockUntilNextEventMatchingListInMode ()
#7 0x00007fff83dd6eb2 in _DPSNextEvent ()
#8 0x00007fff83dd6801 in -[NSApplication nextEventMatchingMask:untilDate:inMode:dequeue:] ()
#9 0x00007fff83d9c68f in -[NSApplication run] ()
#10 0x00000001028b8fd4 in nsAppShell::Run (this=0x100303a20) at /builds/mozilla-central/src/widget/src/cocoa/nsAppShell.mm:780
#11 0x0000000102616df5 in nsAppStartup::Run (this=0x1177ed5b0) at /builds/mozilla-central/src/toolkit/components/startup/nsAppStartup.cpp:220
#12 0x0000000101444ff0 in XRE_main (argc=3, argv=0x7fff5fbff860, aAppData=0x1000071c0) at /builds/mozilla-central/src/toolkit/xre/nsAppRunner.cpp:3558
#13 0x0000000100001aeb in do_main (exePath=0x7fff5fbff430 "/builds/mozilla-central/ff-debug/dist/NightlyDebug.app/Contents/MacOS/", argc=3, argv=0x7fff5fbff860) at /builds/mozilla-central/src/browser/app/nsBrowserApp.cpp:201
#14 0x0000000100001d4a in main (argc=3, argv=0x7fff5fbff860) at /builds/mozilla-central/src/browser/app/nsBrowserApp.cpp:287
So the Cocoa version of nsAppShell doesn't delegate to nsBaseAppShell::Run, which means that the XPCOM event loop is not the outermost event loop at all on Mac.
I think this can be fixed by overriding -[NSApplication nextEventMatchingMask:untilDate:inMode:dequeue:] and suspending the hang monitor when appropriate, but I also need to write down how all the different native event loops work, because each one is a little bit different and together they are a nightmare.
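The failure mode and the proposed fix can be sketched as follows. This is a minimal Python model, not the actual patch: `Monitor`, `wait_for_native_event`, and the timings are hypothetical. The assumption is that time spent blocked waiting for the next native event is idle time, so the monitor should be suspended for the duration of the wait and resumed when an event arrives. Without that, an idle browser looks exactly like a hung one, which would be consistent with the reports above of crashes only when Firefox is left alone.

```python
from contextlib import contextmanager


class Monitor:
    def __init__(self, timeout):
        self.timeout = timeout
        self.last_activity = None  # None means the monitor is suspended

    def notify_activity(self, now):
        self.last_activity = now

    def suspend(self):
        self.last_activity = None

    def is_hung(self, now):
        return (self.last_activity is not None
                and now - self.last_activity > self.timeout)


@contextmanager
def wait_for_native_event(monitor, clock):
    """Model of blocking in the native event loop with the monitor suspended."""
    monitor.suspend()  # idle waiting must not count as a hang
    try:
        yield
    finally:
        monitor.notify_activity(clock())  # an event arrived: resume watching


now = [0.0]                      # fake clock so the example is deterministic
clock = lambda: now[0]
monitor = Monitor(timeout=30)    # hypothetical timeout in seconds

# Without suspension: five idle minutes look like a hang.
monitor.notify_activity(clock())
now[0] = 300.0
print(monitor.is_hung(clock()))  # True (false positive)

# With suspension around the blocking wait: no false positive.
monitor.notify_activity(clock())
with wait_for_native_event(monitor, clock):
    now[0] = 600.0
    print(monitor.is_hung(clock()))  # False
```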
Assignee | ||
Comment 22•13 years ago
|
||
Attachment #578045 -
Flags: review?(smichaud)
Comment 23•13 years ago
|
||
Comment on attachment 578045 [details] [diff] [review]
Suspend the hang monitor by overriding a method in the event loop, rev. 1
I haven't tested this. But it looks reasonable to me, and it should do no harm.
Attachment #578045 -
Flags: review?(smichaud) → review+
Comment 24•13 years ago
|
||
Mac patch checked in for mozilla11. Leaving the bug open for the remaining patch.
https://hg.mozilla.org/mozilla-central/rev/1b3f17ffa656
Target Milestone: --- → mozilla11
Assignee | ||
Comment 25•13 years ago
|
||
The other one landed already ;-)
Status: ASSIGNED → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED