Closed
Bug 608021
Opened 14 years ago
Closed 13 years ago
Crashes based in nsFrameMessageManager::ReceiveMessage [@ js::Interpret] [@ js::TraceRecorder::record_NativeCallComplete] [@ js_LookupPropertyWithFlags]
Categories
(Core :: JavaScript Engine, defect)
Tracking
()
RESOLVED
FIXED
People
(Reporter: jdm, Assigned: smaug)
References
Details
(Keywords: crash, topcrash, Whiteboard: fennec-related-jscript-crashers)
Crash Data
Attachments
(1 file, 1 obsolete file)
(deleted),
patch
|
jst
:
review+
asa
:
approval-mozilla-aurora+
|
Details | Diff | Splinter Review |
I've been seeing mysterious JS engine crashes that are all a few frames removed from nsFrameMessageManager::ReceiveMessage, which makes me suspicious.
http://crash-stats.mozilla.com/report/index/dd2e2b60-a87d-401d-bc6c-7c76d2101028
0 libxul.so js::Interpret js/src/jsinterp.cpp:2760
1 libxul.so js::Invoke js/src/jsinterp.cpp:637
2 libxul.so js::ExternalInvoke js/src/jsinterp.cpp:855
3 libxul.so JS_CallFunctionValue js/src/jsinterp.h:955
4 libxul.so nsFrameMessageManager::ReceiveMessage content/base/src/nsFrameMessageManager.cpp:468
5 libxul.so nsAsyncMessageToChild::Run content/base/src/nsFrameLoader.cpp:1812
6 libxul.so nsThread::ProcessNextEvent xpcom/threads/nsThread.cpp:547
7 libxul.so NS_ProcessNextEvent_P nsThreadUtils.cpp:250
8 libxul.so mozilla::ipc::MessagePump::Run ipc/glue/MessagePump.cpp:111
9 libxul.so MessageLoop::RunInternal ipc/chromium/src/base/message_loop.cc:220
10 libxul.so MessageLoop::Run ipc/chromium/src/base/message_loop.cc:512
11 libxul.so nsBaseAppShell::Run widget/src/xpwidgets/nsBaseAppShell.cpp:187
12 libxul.so nsAppStartup::Run toolkit/components/startup/src/nsAppStartup.cpp:192
13 libxul.so XRE_main toolkit/xre/nsAppRunner.cpp:3684
14 libxul.so GeckoStart toolkit/xre/nsAndroidStartup.cpp:131
15 libc.so libc.so@0x11453
16 libc.so libc.so@0x10f3f
http://crash-stats.mozilla.com/report/index/89f2c194-93ac-49f8-a565-f4e392101027
0 libxul.so js::TraceRecorder::record_NativeCallComplete js/src/jstracer.cpp:8093
1 libxul.so js::Interpret js/src/jstracer.h:981
2 libxul.so js::Invoke js/src/jsinterp.cpp:638
3 libxul.so js::ExternalInvoke js/src/jsinterp.cpp:856
4 libxul.so JS_CallFunctionValue js/src/jsinterp.h:956
5 libxul.so nsFrameMessageManager::ReceiveMessage content/base/src/nsFrameMessageManager.cpp:468
6 libxul.so nsAsyncMessageToChild::Run content/base/src/nsFrameLoader.cpp:1812
7 libxul.so nsThread::ProcessNextEvent xpcom/threads/nsThread.cpp:547
8 libxul.so NS_ProcessNextEvent_P nsThreadUtils.cpp:250
9 libxul.so mozilla::ipc::MessagePump::Run ipc/glue/MessagePump.cpp:111
10 libxul.so MessageLoop::RunInternal ipc/chromium/src/base/message_loop.cc:220
11 libxul.so MessageLoop::Run ipc/chromium/src/base/message_loop.cc:512
12 libxul.so nsBaseAppShell::Run widget/src/xpwidgets/nsBaseAppShell.cpp:187
13 libxul.so nsAppStartup::Run toolkit/components/startup/src/nsAppStartup.cpp:192
14 libxul.so XRE_main toolkit/xre/nsAppRunner.cpp:3684
15 libxul.so GeckoStart toolkit/xre/nsAndroidStartup.cpp:131
16 libc.so libc.so@0xfdd7
17 libc.so libc.so@0xf8a3
http://crash-stats.mozilla.com/report/index/bd647b32-12aa-4a75-9083-f5b5b2101028
0 libxul.so js::NewBuiltinClassInstance js/src/jsgc.h:629
1 libxul.so js_ConsumeJSONText js/src/json.cpp:848
2 libxul.so JS_ConsumeJSONText js/src/jsapi.cpp:5353
3 libxul.so nsFrameMessageManager::ReceiveMessage content/base/src/nsFrameMessageManager.cpp:392
4 libxul.so mozilla::dom::TabParent::ReceiveMessage jsapi.h:757
5 libxul.so mozilla::dom::TabParent::RecvAsyncMessage dom/ipc/TabParent.cpp:286
6 libxul.so mozilla::dom::PBrowserParent::OnMessageReceived PBrowserParent.cpp:573
7 libxul.so mozilla::dom::PContentParent::OnMessageReceived PContentParent.cpp:463
8 libxul.so mozilla::ipc::AsyncChannel::OnDispatchMessage ipc/glue/AsyncChannel.cpp:262
9 libxul.so mozilla::ipc::RPCChannel::OnMaybeDequeueOne ipc/glue/RPCChannel.cpp:440
10 libxul.so RunnableMethod<mozilla::ipc::RPCChannel, bool , Tuple0>::Run ipc/chromium/src/base/task.h:308
11 libxul.so mozilla::ipc::RPCChannel::DequeueTask::Run RPCChannel.h:474
12 libxul.so MessageLoop::RunTask ipc/chromium/src/base/message_loop.cc:344
13 libxul.so MessageLoop::DeferOrRunPendingTask ipc/chromium/src/base/message_loop.cc:354
14 libxul.so MessageLoop::DoWork ipc/chromium/src/base/message_loop.cc:451
15 libxul.so mozilla::ipc::DoWorkRunnable::Run ipc/glue/MessagePump.cpp:71
16 libxul.so nsThread::ProcessNextEvent xpcom/threads/nsThread.cpp:547
17 libxul.so NS_InvokeByIndex_P xpcom/reflect/xptcall/src/md/unix/xptcinvoke_arm.cpp:191
18 libxul.so XPCWrappedNative::CallMethod js/src/xpconnect/src/xpcwrappednative.cpp:3054
19 libxul.so XPC_WN_CallMethod js/src/xpconnect/src/xpcwrappednativejsops.cpp:1631
20 libxul.so js::Interpret js/src/jsinterp.cpp:4704
21 libxul.so js::Execute js/src/jsinterpinlines.h:493
22 libxul.so JS_ExecuteScript js/src/jsapi.cpp:4816
23 libxul.so mozJSComponentLoader::GlobalForLocation js/src/xpconnect/loader/mozJSComponentLoader.cpp:1240
24 libxul.so mozJSComponentLoader::ImportInto nsTHashtable.h:199
25 libxul.so mozJSComponentLoader::Import js/src/xpconnect/loader/mozJSComponentLoader.cpp:1375
26 libxul.so nsXPCComponents_Utils::Import js/src/xpconnect/src/xpccomponents.cpp:3779
27 libxul.so NS_InvokeByIndex_P xpcom/reflect/xptcall/src/md/unix/xptcinvoke_arm.cpp:191
28 libxul.so XPCWrappedNative::CallMethod js/src/xpconnect/src/xpcwrappednative.cpp:3054
29 libxul.so XPC_WN_CallMethod js/src/xpconnect/src/xpcwrappednativejsops.cpp:1631
30 libxul.so js::Interpret js/src/jsinterp.cpp:4704
31 libxul.so js::Invoke js/src/jsinterp.cpp:638
32 libxul.so js::ExternalGetOrSet js/src/jsinterp.cpp:856
33 libxul.so js_GetPropertyHelper js/src/jsobj.cpp:4856
34 libxul.so js::Interpret js/src/jsinterp.cpp:4114
35 libxul.so js::Invoke js/src/jsinterp.cpp:638
36 libxul.so js::ExternalInvoke js/src/jsinterp.cpp:856
37 libxul.so JS_CallFunctionValue js/src/jsinterp.h:956
38 libxul.so nsXPCWrappedJSClass::CallMethod js/src/xpconnect/src/xpcwrappedjsclass.cpp:1696
39 libxul.so nsXPCWrappedJS::CallMethod js/src/xpconnect/src/xpcwrappedjs.cpp:572
40 libxul.so PrepareAndDispatch xpcom/reflect/xptcall/src/md/unix/xptcstubs_arm.cpp:134
41 libxul.so libxul.so@0x8fba3c
42 libxul.so nsTimerImpl::Fire xpcom/threads/nsTimerImpl.cpp:429
Further crashes:
http://crash-stats.mozilla.com/report/list?range_value=2&range_unit=weeks&date=2010-10-28%2008%3A00%3A00&signature=js%3A%3AInterpret&version=Fennec%3A4.0b2pre
http://crash-stats.mozilla.com/report/list?range_value=2&range_unit=weeks&date=2010-10-28%2008%3A00%3A00&signature=js%3A%3ATraceRecorder%3A%3Arecord_NativeCallComplete&version=Fennec%3A4.0b2pre
http://crash-stats.mozilla.com/report/list?range_value=2&range_unit=weeks&date=2010-10-28%2007%3A00%3A00&signature=js%3A%3ANewBuiltinClassInstance&version=Fennec%3A4.0b2pre
Assignee | ||
Comment 1•14 years ago
|
||
Nothing has changed in message manager for some time.
Is it possible that something in JS engine has regressed?
Comment 2•14 years ago
|
||
Try cc'ing more JS people.
Has anyone reviewed the JS API usage in the message manager code? Such code in our experience needs guru-level code review...
/be
Assignee | ||
Comment 3•14 years ago
|
||
JS exception handling in message manager needs still work, I believe.
And yes, it would be great to get someone very familiar with JS API to
look at the code.
But if the crashes are something new, perhaps there is some real
JS engine bug, or some change how to use API.
Reporter | ||
Comment 4•14 years ago
|
||
The combined signatures of js::Interpret, js::TraceRecord::record_NativeCallComplete and js_LookupPropertyWithFlags make this one of the higher-volume crashes for Fennec 4.0b2, and we continue to see these stacks in 4.0b3pre. Seeing as this is also one of the few actionable stacks we possess in the top crashes, it would be wonderful for a knowledgeable JSAPI person to peruse the message manager's usage and make sure it's kosher.
Reporter | ||
Updated•14 years ago
|
Summary: Crashes based in nsFrameMessageManager::ReceiveMessage → Crashes based in nsFrameMessageManager::ReceiveMessage [@ js::Interpret] [@ js::TraceRecorder::record_NativeCallComplete] [@ js_LookupPropertyWithFlags]
Reporter | ||
Comment 5•14 years ago
|
||
js::Interpret crashes are the #7 topcrash for 4.0b3 at the moment. It's interesting to note that every stack I look at has this signature:
0 libxul.so js::Interpret js/src/jsinterp.cpp:2824
1 libxul.so js::Invoke js/src/jsinterp.cpp:657
2 libxul.so js::ExternalInvoke js/src/jsinterp.cpp:858
3 libxul.so JS_CallFunctionValue js/src/jsinterp.h:962
4 libxul.so nsFrameMessageManager::ReceiveMessage content/base/src/nsFrameMessageManager.cpp:459
5 libxul.so nsAsyncMessageToChild::Run content/base/src/nsFrameLoader.cpp:1831
So it appears to be limited to situations in which we're using the message manager in non-remote situations, with uptimes varying from 0 to 1329.
Updated•14 years ago
|
tracking-fennec: --- → ?
Updated•14 years ago
|
blocking2.0: --- → ?
tracking-fennec: ? → 2.0+
Updated•14 years ago
|
blocking2.0: ? → -
Updated•14 years ago
|
Whiteboard: fennec-related-jscript-crashers
Updated•14 years ago
|
tracking-fennec: 2.0+ → 2.0-
Comment 6•14 years ago
|
||
It is #13 top crasher in Fennec 4.0.
In Fennec, stack traces now look like:
0 libxul.so js::Interpret js/src/jscompartment.h:553
1 libxul.so js::Invoke js/src/jsinterp.cpp:653
2 libxul.so js::ExternalInvoke js/src/jsinterp.cpp:863
3 libxul.so JS_CallFunctionValue js/src/jsapi.cpp:5173
4 libxul.so nsFrameMessageManager::ReceiveMessage content/base/src/nsFrameMessageManager.cpp:463
5 libxul.so mozilla::dom::TabChild::RecvAsyncMessage dom/ipc/TabChild.cpp:761
6 libxul.so mozilla::dom::PBrowserChild::OnMessageReceived PBrowserChild.cpp:884
7 libxul.so mozilla::dom::PContentChild::OnMessageReceived PContentChild.cpp:1133
8 libxul.so mozilla::ipc::AsyncChannel::OnDispatchMessage ipc/glue/AsyncChannel.cpp:262
9 libxul.so mozilla::ipc::RPCChannel::OnMaybeDequeueOne ipc/glue/RPCChannel.cpp:435
10 libxul.so RunnableMethod<mozilla::ipc::RPCChannel, bool , Tuple0>::Run ipc/chromium/src/base/task.h:308
11 libxul.so mozilla::ipc::RPCChannel::DequeueTask::Run RPCChannel.h:475
OS: Mac OS X → All
Hardware: x86 → All
Version: unspecified → 2.0 Branch
Comment 7•13 years ago
|
||
Running my patch queue (which makes some SpecialPowers-related changes) on Linux opt tryserver builds, I am hitting what I think might be this bug seemingly consistently, right at the end of the test run at about the time I would expect things to be winding up for a normal process exit.
http://tbpl.mozilla.org/?tree=Try&pusher=cmccormack@mozilla.com&rev=e0065ad9ad00
http://tinderbox.mozilla.org/showlog.cgi?log=Try/1307326012.1307326810.8802.gz
Assignee | ||
Comment 8•13 years ago
|
||
Cameron, can you reproduce the crash easily?
If so, what are the steps to reproduce?
Reporter | ||
Comment 9•13 years ago
|
||
I have been able to reproduce the crash with the following method:
1) perform each build step at http://tinderbox.mozilla.org/showlog.cgi?log=Try/1307324450.1307328061.12699.gz&fulltext=1
2) simply running the failing test command |python mochitest/runtests.py --appname=firefox/firefox-bin --utility-path=bin --extra-profile-file=bin/plugins --certificate-path=certs --autorun --close-when-done --console-level=INFO --symbols-path=symbols --setpref=dom.ipc.plugins.enabled=false --ipcplugins| makes the crash reproduce for me.
3) running --debugger=gdb makes the crash non-reproducible, of course
4) attaching gdb to the process near the end of the mochitest run makes it reproducible
Now I just need to figure out how to load the symbols.
Comment 10•13 years ago
|
||
Update: #5 top crasher in Fennec 5 Betas
Comment 11•13 years ago
|
||
In case it's useful, here's another stack trace, this time on a Linux x86 debug build, and this time getting a bit further and hitting an assertion:
http://tinderbox.mozilla.org/showlog.cgi?log=Try/1307779731.1307783410.2819.gz#err0
Assertion failure: !regs_, at ../../../js/src/vm/Stack.cpp:376
0 linux-gate.so + 0x424
1 libxul.so!JS_Assert [jsutil.cpp : 89 + 0xb]
2 libxul.so!js::ContextStack::~ContextStack [Stack.cpp : 376 + 0x28]
3 libxul.so!JSContext::~JSContext [jscntxt.cpp : 1466 + 0x2d]
4 libxul.so!js::Foreground::delete_<JSContext> [jsutil.h : 493 + 0x10]
5 libxul.so!js_DestroyContext [jscntxt.cpp : 675 + 0xa]
6 libxul.so!JS_DestroyContext [jsapi.cpp : 1031 + 0x12]
7 libxul.so!XPCCallContext::~XPCCallContext [xpccallcontext.cpp : 402 + 0xd]
8 libxul.so!XPC_WN_CallMethod [xpcwrappednativejsops.cpp : 1597 + 0xd]
9 libxul.so!js::CallJSNative [jscntxtinlines.h : 277 + 0x18]
10 libxul.so!js::Interpret [jsinterp.cpp : 4676 + 0x2b]
11 libxul.so!js::RunScript [jsinterp.cpp : 613 + 0x19]
12 libxul.so!js::Invoke [jsinterp.cpp : 694 + 0x18]
13 libxul.so!js::ExternalInvoke [jsinterp.cpp : 816 + 0x19]
14 libxul.so!js::JSProxyHandler::call [jsproxy.cpp : 273 + 0x27]
15 libxul.so!JSWrapper::call [jswrapper.cpp : 250 + 0x69]
16 libxul.so!JSCrossCompartmentWrapper::call [jswrapper.cpp : 652 + 0x26]
17 libxul.so!js::JSProxy::call [jsproxy.cpp : 839 + 0x32]
18 libxul.so!js::proxy_Call [jsproxy.cpp : 1104 + 0x1f]
19 libxul.so!js::CallJSNative [jscntxtinlines.h : 277 + 0x18]
20 libxul.so!js::Invoke [jsinterp.cpp : 649 + 0x34]
21 libxul.so!js::Interpret [jsinterp.cpp : 4687 + 0x3e]
22 libxul.so!js::RunScript [jsinterp.cpp : 613 + 0x19]
23 libxul.so!js::Invoke [jsinterp.cpp : 694 + 0x18]
24 libxul.so!js::ExternalInvoke [jsinterp.cpp : 816 + 0x19]
25 libxul.so!JS_CallFunctionValue [jsapi.cpp : 5087 + 0x3c]
26 libxul.so!nsFrameMessageManager::ReceiveMessage [nsFrameMessageManager.cpp : 452 + 0x49]
27 libxul.so!nsAsyncMessageToChild::Run [nsFrameLoader.cpp : 1907 + 0x53]
28 libxul.so!nsThread::ProcessNextEvent [nsThread.cpp : 618 + 0x16]
29 libxul.so!NS_ProcessNextEvent_P [nsThreadUtils.cpp : 245 + 0x1f]
30 libxul.so!mozilla::ipc::MessagePump::Run [MessagePump.cpp : 110 + 0x15]
31 libxul.so!MessageLoop::RunInternal [message_loop.cc : 218 + 0x20]
32 libxul.so!MessageLoop::RunHandler [message_loop.cc : 202 + 0xa]
33 libxul.so!MessageLoop::Run [message_loop.cc : 176 + 0xa]
34 libxul.so!nsBaseAppShell::Run [nsBaseAppShell.cpp : 189 + 0xc]
35 libxul.so!nsAppStartup::Run [nsAppStartup.cpp : 222 + 0x19]
...
Comment 12•13 years ago
|
||
The assertion indicates that code is running on the stack which is part of the context being destroyed. Since this all involves nsFrameMessageManager, my suspicion would be that nsFrameMessageManager is passing a 'cx' to JS_CallFunctionValue (frame 25) that gets used to do all the calls between frame 25 and the XPC call in frame 8. Then, some content code (called by frame 8) calls ReleaseJSContext (there's only a few call-sites: http://mxr.mozilla.org/mozilla-central/ident?i=releaseJSContext) passing this same context used by frame 8 through 25, hence the assert. Thus, I would look for data-flow paths whereby the 'cx' passed to frame 25 makes its way to ReleaseJSContext before control returns to frame 26.
I hope that is helpful.
Comment 13•13 years ago
|
||
Thanks Luke. The patches I have applied (both for comment 7 and comment 11), modify the mochitest framework so that at the end of each test, inside TestRunner.testFinished, some extra communication is done with the chrome process:
TestRunner.testFinished:
* send an async "ping" message from content to chrome
* wait for the async "pong" message to be received from chrome
* inside the message listener for that "pong" message, do
* send a couple of sync messages to chrome asking it to delete some
crash dump files
* call TestRunner.runNextTest()
At the time TestRunner.testFinished is called by each individual test, the chrome process might have got a plugin crash notification, which I get SpecialPowersObserver.js to send on with an async message to the content's SpecialPowers.js object. The sending of the async ping/pong messages at the top of TestRunner.testFinished is done so that any crash notification message can be processed by the test harness before it moves on to the next test.
The behaviour of TestRunner.runNextTest when called after the last test has run is to immediately call nsIAppStartup.quit(eForceQuit) (when not in e10s mode).
I don't really know how shutdown works, but I am thinking that it must be within that quit() call that things are torn down while the message listener code is still on the stack. If I wrap the contents of my "pong" listener inside a setTimeout(..., 0), so that the quit() is done without the message listener on the stack, then I don't get the assertion and crash.
Assignee | ||
Comment 14•13 years ago
|
||
OK, I'll try to reproduce the bug and if I can, I'll take this.
Assignee | ||
Comment 15•13 years ago
|
||
(In reply to comment #9)
> 1) perform each build step at
> http://tinderbox.mozilla.org/showlog.cgi?log=Try/1307324450.1307328061.12699.
> gz&fulltext=1
Not sure what "each build step"
> 2) simply running the failing test command |python mochitest/runtests.py
> --appname=firefox/firefox-bin --utility-path=bin
> --extra-profile-file=bin/plugins --certificate-path=certs --autorun
> --close-when-done --console-level=INFO --symbols-path=symbols
> --setpref=dom.ipc.plugins.enabled=false --ipcplugins| makes the crash
> reproduce for me.
Where should I run this?
I tried few obvious ones, and it didn't work
Assignee | ||
Comment 16•13 years ago
|
||
And
python runtests.py --autorun --close-when-done --setpref=dom.ipc.plugins.enabled=false --ipcplugins
didn't crash.
Assignee | ||
Comment 17•13 years ago
|
||
Something like this might help, but since I don't know how to
reproduce this, I can't verify.
Reporter | ||
Comment 18•13 years ago
|
||
(In reply to comment #15)
> (In reply to comment #9)
> > 1) perform each build step at
> > http://tinderbox.mozilla.org/showlog.cgi?log=Try/1307324450.1307328061.12699.
> > gz&fulltext=1
> Not sure what "each build step"
From the log:
======== BuildStep started ========
download
=== Output ===
wget --progress=dot:mega -N http://stage.mozilla.org/pub/mozilla.org/firefox/try-builds/cmccormack@mozilla.com-e0065ad9ad00/try-linux-debug/firefox-7.0a1.en-US.linux-i686.tar.bz2
in dir /home/cltbld/talos-slave/test/build (timeout 1200 secs)
I still haven't been able to reproduce in my local build; only by downloading the try build and running the relevant steps. Of course, this is a linux crash, so we can't use the debug symbols provided since they only work on windows.
Comment 19•13 years ago
|
||
Olli, trying with your patch gives me this assertion not long after startup of a `make mochitest-ipcplugins` with my patch queue also applied.
###!!! ASSERTION: QueryInterface needed: 'query_result.get() == mRawPtr', file ../../../dist/include/nsCOMPtr.h, line 544
nsCOMPtr<nsIInProcessContentFrameMessageManager>::Assert_NoQueryNeeded() (/home/cam/mozilla/obj-linux64-dbg-gcc4.5/content/base/src/../../../dist/include/nsCOMPtr.h:543)
nsCOMPtr<nsIInProcessContentFrameMessageManager>::operator=(nsIInProcessContentFrameMessageManager*) (/home/cam/mozilla/obj-linux64-dbg-gcc4.5/content/base/src/../../../dist/include/nsCOMPtr.h:665)
nsFrameLoader::EnsureMessageManager() (/home/cam/mozilla/content/base/src/nsFrameLoader.cpp:2052)
nsFrameLoader::MaybeCreateDocShell() (/home/cam/mozilla/content/base/src/nsFrameLoader.cpp:1513)
nsFrameLoader::CheckURILoad(nsIURI*) (/home/cam/mozilla/content/base/src/nsFrameLoader.cpp:526)
nsFrameLoader::LoadURI(nsIURI*) (/home/cam/mozilla/content/base/src/nsFrameLoader.cpp:414)
...
Just to clarify, from the description in comment 13, do you think my use of the message manager is correct or not (i.e., should I not be calling nsIAppStartup.quit() inside a message listener)?
BTW, STR should be to apply the following patches on a Linux64 gcc4.5 debug build, and then to do a `make mochitest-ipcplugins`:
http://hg.mozilla.org/users/cmccormack_mozilla.com/mq/raw-file/07e9d5fa54e7/bug-642175-mochitest-cleanup
http://hg.mozilla.org/users/cmccormack_mozilla.com/mq/raw-file/07e9d5fa54e7/bug-642175-plugin-crash-cleanup-mochitests
http://hg.mozilla.org/users/cmccormack_mozilla.com/mq/raw-file/07e9d5fa54e7/bug-642175-plugin-crash-cleanup-mochitest-fixes
Assignee | ||
Comment 20•13 years ago
|
||
(In reply to comment #19)
> Olli, trying with your patch gives me this assertion not long after startup
> of a `make mochitest-ipcplugins` with my patch queue also applied.
Bah. Though the assertion may not be that bad, or easily fixable?
I assume the crash is still there?
> (i.e., should I not be calling
> nsIAppStartup.quit() inside a message listener)?
That sounds pretty evil. I wonder if we can really handle that...
> BTW, STR should be to apply the following patches on a Linux64 gcc4.5 debug
> build, and then to do a `make mochitest-ipcplugins`:
In top level objdir?
>
> http://hg.mozilla.org/users/cmccormack_mozilla.com/mq/raw-file/07e9d5fa54e7/
> bug-642175-mochitest-cleanup
> http://hg.mozilla.org/users/cmccormack_mozilla.com/mq/raw-file/07e9d5fa54e7/
> bug-642175-plugin-crash-cleanup-mochitests
> http://hg.mozilla.org/users/cmccormack_mozilla.com/mq/raw-file/07e9d5fa54e7/
> bug-642175-plugin-crash-cleanup-mochitest-fixes
Assignee | ||
Comment 21•13 years ago
|
||
(In reply to comment #20)
> > (i.e., should I not be calling
> > nsIAppStartup.quit() inside a message listener)?
> That sounds pretty evil. I wonder if we can really handle that...
I guess I need to debug when the cx is being deleted during quit().
Comment 22•13 years ago
|
||
(In reply to comment #20)
> > Olli, trying with your patch gives me this assertion not long after startup
> > of a `make mochitest-ipcplugins` with my patch queue also applied.
> Bah. Though the assertion may not be that bad, or easily fixable?
> I assume the crash is still there?
I am too much of an XPCOM noob to know how to fix it, unfortunately. The assertion happens early on and then crashes, so it doesn't get to the crash we care about.
> > (i.e., should I not be calling
> > nsIAppStartup.quit() inside a message listener)?
> That sounds pretty evil. I wonder if we can really handle that...
I am happy with my workaround of dispatching a runnable to do the quit() if it's decided that calling it inside the listener is evil and shouldn't be supported. (Though this doesn't help with the crashes for which this bug was originally filed.)
> > BTW, STR should be to apply the following patches on a Linux64 gcc4.5 debug
> > build, and then to do a `make mochitest-ipcplugins`:
> In top level objdir?
Yep.
Reporter | ||
Comment 23•13 years ago
|
||
With regards to Luke's hint, my money is on the frameloader being destroyed by content JS, which in turn calls Disconnect on the child message manager. This posts an event which calls nsFrameScriptExecutor::DestroyCx, which in turn can call ReleaseJSContext. It looks like there's machinery in nsInProcessTabChildGlobal to prevent this in the case of LoadFrameScript (see mLoadingScript) - could we extend this to work with receiving async messages as well?
Assignee | ||
Comment 24•13 years ago
|
||
(In reply to comment #19)
> BTW, STR should be to apply the following patches on a Linux64 gcc4.5 debug
> build, and then to do a `make mochitest-ipcplugins`:
>
> http://hg.mozilla.org/users/cmccormack_mozilla.com/mq/raw-file/07e9d5fa54e7/
> bug-642175-mochitest-cleanup
> http://hg.mozilla.org/users/cmccormack_mozilla.com/mq/raw-file/07e9d5fa54e7/
> bug-642175-plugin-crash-cleanup-mochitests
> http://hg.mozilla.org/users/cmccormack_mozilla.com/mq/raw-file/07e9d5fa54e7/
> bug-642175-plugin-crash-cleanup-mochitest-fixes
With the WIP patch and those patches make mochitest-ipcplugins
to crash in ###!!! ABORT: file /home/smaug/mozilla/hg/mozilla/ipc/chromium/src/base/pickle.cc, line 98
And in fact, I get that crash even without WIP.
So, I still don't know how to reproduce this :(
Assignee | ||
Comment 25•13 years ago
|
||
(In reply to comment #23)
> With regards to Luke's hint, my money is on the frameloader being destroyed
> by content JS, which in turn calls Disconnect on the child message manager.
> This posts an event which calls nsFrameScriptExecutor::DestroyCx, which in
> turn can call ReleaseJSContext. It looks like there's machinery in
> nsInProcessTabChildGlobal to prevent this in the case of LoadFrameScript
> (see mLoadingScript) - could we extend this to work with receiving async
> messages as well?
Did you look at the WIP? It is delaying DestroyCx
Reporter | ||
Comment 26•13 years ago
|
||
Cameron, I couldn't reproduce the assertion you saw with Olli's patch. Granted, I'm on a Mac, but for the output you pasted, that shouldn't make a difference. Are you sure that you did a thorough rebuild?
Comment 27•13 years ago
|
||
You're right, I must not have. I did a clean build and don't get that assertion now. Now I still get a crash on shutdown, but slightly earlier I think:
#0 0x00002aaaabddf1b2 in nsFrameMessageManager::ReceiveMessage (this=0x5a5a5a5a5a5a5a5a, aTarget=0x2aaac34313d0,
aMessage=..., aSync=0, aJSON=..., aObjectsArray=0x2aaacd0554c8, aJSONRetVal=0x0, aContext=0x5a5a5a5a5a5a5a5a)
at /home/cam/mozilla/content/base/src/nsFrameMessageManager.cpp:337
#1 0x00002aaaabddfd0d in nsFrameMessageManager::ReceiveMessage (this=0x2aaac34811d0, aTarget=0x2aaac34313d0, aMessage=...,
aSync=0, aJSON=..., aObjectsArray=0x2aaacd0554c8, aJSONRetVal=0x0, aContext=0x0)
at /home/cam/mozilla/content/base/src/nsFrameMessageManager.cpp:467
#2 0x00002aaaabd54261 in nsAsyncMessageToChild::Run (this=0x2aaac656b580)
at /home/cam/mozilla/content/base/src/nsFrameLoader.cpp:1908
#3 0x00002aaaacdd327c in nsThread::ProcessNextEvent (this=0x2aaab742dc60, mayWait=0, result=0x7fffffffd38c)
at /home/cam/mozilla/xpcom/threads/nsThread.cpp:618
...
Assignee | ||
Comment 28•13 years ago
|
||
That is interesting stack. "this=0x5a5a5a5a5a5a5a5a"
/me thinks..
Assignee | ||
Comment 29•13 years ago
|
||
Keep message managers alive long enough.
Would be great if someone could test this.
Attachment #538746 -
Attachment is obsolete: true
Comment 30•13 years ago
|
||
That seems to work for me! I just pushed this to try to make sure: http://tbpl.mozilla.org/?tree=Try&rev=683c7ee50b20
Assignee | ||
Comment 31•13 years ago
|
||
Comment on attachment 538855 [details] [diff] [review]
WIP2
So there are two parts here.
One is the keep message manager alive when calling
ReceiveMessage, the other is to make sure
cx isn't deleted when calling ReceiveMessage on tabchilds.
(In the chrome message manager the cx from chrome XUL is reused.)
Attachment #538855 -
Flags: review?(jst)
Updated•13 years ago
|
Crash Signature: [@ js::Interpret]
[@ js::TraceRecorder::record_NativeCallComplete]
[@ js_LookupPropertyWithFlags]
Reporter | ||
Comment 32•13 years ago
|
||
Johnny, do you think you'll be able to get to this review soon? I'm really hopeful that it'll squash one of the topcrashers for Fennec, and it deserves some baking time before we try to get it into Aurora.
Crash Signature: [@ js::Interpret]
[@ js::TraceRecorder::record_NativeCallComplete]
[@ js_LookupPropertyWithFlags] → [@ js::Interpret]
[@ js::TraceRecorder::record_NativeCallComplete]
[@ js_LookupPropertyWithFlags]
Updated•13 years ago
|
Attachment #538855 -
Flags: review?(jst) → review+
Assignee | ||
Comment 33•13 years ago
|
||
http://hg.mozilla.org/mozilla-central/rev/97bf954c6970
Marking fixed, but please reopen or file followups if this doesn't help
with the Fennec problems.
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Reporter | ||
Comment 34•13 years ago
|
||
Comment on attachment 538855 [details] [diff] [review]
WIP2
I have not seen any new crashes with stacks that look like this patch could be a culprit, which is a long way of saying that this patch seems to be, if not helping, at least not harming. It also has the potential to fix a top crasher for us, so I'd like to get it into aurora asap.
Attachment #538855 -
Flags: approval-mozilla-aurora?
Comment 35•13 years ago
|
||
I found a few crashes on tinderbox this morning which seem to be the one I was having problems with in bug 642175 (quitting the browser from inside a message listener) now that that has landed, but it looks intermittent now. I'm going to put back in my workaround in bug 666060. I don't know if that means anything for the Fennec-related crashes.
Assignee | ||
Comment 36•13 years ago
|
||
Do you have links to those crashes?
Could you file a followup bug, and post the links there?
Comment 37•13 years ago
|
||
Oh, sorry, I think I misread the logs. It's not a crash in the message manager at all. It is just a real, unexpected crash in a test. Sorry for the noise. (http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1308677954.1308682390.5623.gz&fulltext=1#err7 was one of the logs, btw.)
Updated•13 years ago
|
Assignee: general → Olli.Pettay
Comment 38•13 years ago
|
||
(In reply to comment #34)
> Comment on attachment 538855 [details] [diff] [review] [review]
> WIP2
>
> I have not seen any new crashes with stacks that look like this patch could
> be a culprit, which is a long way of saying that this patch seems to be, if
> not helping, at least not harming. It also has the potential to fix a top
> crasher for us, so I'd like to get it into aurora asap.
How much of this patch impacts desktop/single-process firefox?
Assignee | ||
Comment 39•13 years ago
|
||
The patch should be pretty safe. But the code is run in single-process Fx, at
least with InstallTrigger etc. which is implemented as Messagemanager script.
Updated•13 years ago
|
Attachment #538855 -
Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Reporter | ||
Comment 40•13 years ago
|
||
status-firefox6:
--- → fixed
You need to log in
before you can comment on or make changes to this bug.
Description
•