Closed Bug 575113 Opened 14 years ago Closed 14 years ago

e10s: IPDL shutdown occurs before cycle collection and breaks necko

Categories

(Core :: IPC, defect)

x86
Linux
defect
Not set
normal

Tracking

()

RESOLVED DUPLICATE of bug 572980

People

(Reporter: jdm, Unassigned)

Details

I believe I've finally figured out the cycle-collector segfaults I've been seeing on shutdown with regard to HttpChannelChild.

Program received signal SIGSEGV, Segmentation fault.
0x024c8df6 in canonicalize (in=0xb3946084) at /home/t_mattjo/src/firefox/mobilebase/xpcom/base/nsCycleCollector.cpp:1209
1209	                       getter_AddRefs(child));
(gdb) bt
#0  0x024c8df6 in canonicalize (in=0xb3946084) at /home/t_mattjo/src/firefox/mobilebase/xpcom/base/nsCycleCollector.cpp:1209
#1  0x024c9529 in GCGraphBuilder::NoteXPCOMChild (this=0xbfda06f4, child=0xb3946084) at /home/t_mattjo/src/firefox/mobilebase/xpcom/base/nsCycleCollector.cpp:1555
#2  0x0164dc37 in nsDocument::cycleCollection::Traverse (this=0x2d30364, p=0xb36d3400, cb=...) at /home/t_mattjo/src/firefox/mobilebase/content/base/src/nsDocument.cpp:1683
#3  0x0180c059 in nsHTMLDocument::cycleCollection::Traverse (this=0x2d30364, p=0xb36d3400, cb=...) at /home/t_mattjo/src/firefox/mobilebase/content/html/document/src/nsHTMLDocument.cpp:243
#4  0x024c931b in GCGraphBuilder::Traverse (this=0xbfda06f4, aPtrInfo=0xb0a02384) at /home/t_mattjo/src/firefox/mobilebase/xpcom/base/nsCycleCollector.cpp:1484
#5  0x024c98f6 in nsCycleCollector::MarkRoots (this=0xb751f000, builder=...) at /home/t_mattjo/src/firefox/mobilebase/xpcom/base/nsCycleCollector.cpp:1706
#6  0x024ca246 in nsCycleCollector::BeginCollection (this=0xb751f000) at /home/t_mattjo/src/firefox/mobilebase/xpcom/base/nsCycleCollector.cpp:2645
#7  0x024ca5ca in nsCycleCollector_beginCollection () at /home/t_mattjo/src/firefox/mobilebase/xpcom/base/nsCycleCollector.cpp:3232
#8  0x01dba7a1 in XPCCycleCollectGCCallback (cx=0xb67e5400, status=JSGC_MARK_END) at /home/t_mattjo/src/firefox/mobilebase/js/src/xpconnect/src/nsXPConnect.cpp:361
#9  0x0655347d in GC (cx=0xb67e5400) at /home/t_mattjo/src/firefox/mobilebase/js/src/jsgc.cpp:2797
#10 0x06553cbc in GCUntilDone (cx=0xb67e5400, gckind=GC_NORMAL) at /home/t_mattjo/src/firefox/mobilebase/js/src/jsgc.cpp:3156
#11 0x06553df4 in js_GC (cx=0xb67e5400, gckind=GC_NORMAL) at /home/t_mattjo/src/firefox/mobilebase/js/src/jsgc.cpp:3207
#12 0x064f59bd in JS_GC (cx=0xb67e5400) at /home/t_mattjo/src/firefox/mobilebase/js/src/jsapi.cpp:2317
#13 0x01dba912 in nsXPConnect::Collect (this=0xb75148d0) at /home/t_mattjo/src/firefox/mobilebase/js/src/xpconnect/src/nsXPConnect.cpp:448
#14 0x024ca0a9 in nsCycleCollector::Collect (this=0xb751f000, aTryCollections=1) at /home/t_mattjo/src/firefox/mobilebase/xpcom/base/nsCycleCollector.cpp:2523
#15 0x024ca55c in nsCycleCollector_collect () at /home/t_mattjo/src/firefox/mobilebase/xpcom/base/nsCycleCollector.cpp:3220
#16 0x019159f5 in nsJSContext::CC () at /home/t_mattjo/src/firefox/mobilebase/dom/base/nsJSEnvironment.cpp:3589
#17 0x01915be9 in nsJSContext::IntervalCC () at /home/t_mattjo/src/firefox/mobilebase/dom/base/nsJSEnvironment.cpp:3677
#18 0x01915b97 in nsJSContext::CCIfUserInactive () at /home/t_mattjo/src/firefox/mobilebase/dom/base/nsJSEnvironment.cpp:3667
#19 0x01915c5e in GCTimerFired (aTimer=0xb319ff40, aClosure=0x0) at /home/t_mattjo/src/firefox/mobilebase/dom/base/nsJSEnvironment.cpp:3703
#20 0x024b9ed0 in nsTimerImpl::Fire (this=0xb319ff40) at /home/t_mattjo/src/firefox/mobilebase/xpcom/threads/nsTimerImpl.cpp:427
#21 0x024ba107 in nsTimerEvent::Run (this=0xb0c502a0) at /home/t_mattjo/src/firefox/mobilebase/xpcom/threads/nsTimerImpl.cpp:519
#22 0x024b34b8 in nsThread::ProcessNextEvent (this=0xb753b1a0, mayWait=0, result=0xbfda49dc) at /home/t_mattjo/src/firefox/mobilebase/xpcom/threads/nsThread.cpp:547
#23 0x0244dbd9 in NS_ProcessNextEvent_P (thread=0xb753b1a0, mayWait=0) at nsThreadUtils.cpp:250
#24 0x0233495e in mozilla::ipc::MessagePump::Run (this=0xb7511130, aDelegate=0xbfda55e0) at /home/t_mattjo/src/firefox/mobilebase/ipc/glue/MessagePump.cpp:118
#25 0x02334e9c in mozilla::ipc::MessagePumpForChildProcess::Run (this=0xb7511130, aDelegate=0xbfda55e0) at /home/t_mattjo/src/firefox/mobilebase/ipc/glue/MessagePump.cpp:232
#26 0x025197e9 in MessageLoop::RunInternal (this=0xbfda55e0) at /home/t_mattjo/src/firefox/mobilebase/ipc/chromium/src/base/message_loop.cc:219
#27 0x02519769 in MessageLoop::RunHandler (this=0xbfda55e0) at /home/t_mattjo/src/firefox/mobilebase/ipc/chromium/src/base/message_loop.cc:202
#28 0x0251970d in MessageLoop::Run (this=0xbfda55e0) at /home/t_mattjo/src/firefox/mobilebase/ipc/chromium/src/base/message_loop.cc:176
#29 0x021f0962 in nsBaseAppShell::Run (this=0xb347c1f0) at /home/t_mattjo/src/firefox/mobilebase/widget/src/xpwidgets/nsBaseAppShell.cpp:175
#30 0x0110fcb3 in XRE_RunAppShell () at /home/t_mattjo/src/firefox/mobilebase/toolkit/xre/nsEmbedFunctions.cpp:566
#31 0x02334dbc in mozilla::ipc::MessagePumpForChildProcess::Run (this=0xb7511130, aDelegate=0xbfda55e0) at /home/t_mattjo/src/firefox/mobilebase/ipc/glue/MessagePump.cpp:218
#32 0x025197e9 in MessageLoop::RunInternal (this=0xbfda55e0) at /home/t_mattjo/src/firefox/mobilebase/ipc/chromium/src/base/message_loop.cc:219
#33 0x02519769 in MessageLoop::RunHandler (this=0xbfda55e0) at /home/t_mattjo/src/firefox/mobilebase/ipc/chromium/src/base/message_loop.cc:202
#34 0x0251970d in MessageLoop::Run (this=0xbfda55e0) at /home/t_mattjo/src/firefox/mobilebase/ipc/chromium/src/base/message_loop.cc:176
#35 0x0110f730 in XRE_InitChildProcess (aArgc=1, aArgv=0xbfda5834, aProcess=GeckoProcessType_Content) at /home/t_mattjo/src/firefox/mobilebase/toolkit/xre/nsEmbedFunctions.cpp:447
#36 0x08049080 in main (argc=3, argv=0xbfda5834) at /home/t_mattjo/src/firefox/mobilebase/ipc/app/MozillaRuntimeMain.cpp:87
(gdb) fr 2
#2  0x0164dc37 in nsDocument::cycleCollection::Traverse (this=0x2d30364, p=0xb36d3400, cb=...) at /home/t_mattjo/src/firefox/mobilebase/content/base/src/nsDocument.cpp:1683
1683	  NS_IMPL_CYCLE_COLLECTION_TRAVERSE_NSCOMPTR(mChannel)

The document retains a strong reference to the channel. However, since the channel is actually an IPDL actor masquerading as a channel, when the IPDL tree cleans up any remaining actors, it calls DeallocPHttpChannel on the channel that the document still holds. This is bad for the same reason as in bug 572980: we're trying to make sure that DeallocPHttpChannel is only ever called at very specific times. The channel has already hit OnStopRequest, i.e. IPDL has already relinquished its hold over it, so IPDL ends up causing the channel to delete itself while the document still points at it. When the cycle collector comes along later and traverses mChannel, it's game over.
I believe I remember bz advocating that we just remove the cycle collector entry from nsDocument, but at the time we suspected that would just be papering over the problem.  Is this still a viable suggestion?  It would sidestep the problem very nicely.
Never mind, removing the cycle collector traversal isn't a solution. It does fix that crash, but when the nsDocument is destroyed later, the mChannel member is still garbage and we get a different crash.
Blocks: 572980
I've done some more thinking and determined that this is really just a symptom of the incorrect refcounting model I was employing in bug 572980.
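For anyone landing here later, the ownership conflict described above can be sketched in isolation. This is only an illustration with hypothetical names (ChannelActor, AddIPCReference, ReleaseIPCReference, Document); it is not the real HttpChannelChild code, and it is not necessarily what bug 572980 ended up doing. It assumes the IPC layer holds an ordinary reference of its own, so the dealloc hook drops that reference instead of deleting an object the document still points at.

```cpp
#include <cassert>

// Hypothetical stand-in for an object that is both reference counted and
// managed by an IPC actor tree. Not the real HttpChannelChild.
class ChannelActor {
public:
    explicit ChannelActor(bool* destroyedFlag) : mDestroyed(destroyedFlag) {}

    void AddRef() { ++mRefCnt; }
    void Release() {
        assert(mRefCnt > 0);
        if (--mRefCnt == 0) {
            delete this;
        }
    }

    // Assumption: the IPC layer takes its own reference when it starts
    // managing the actor.
    void AddIPCReference() {
        assert(!mIPCOpen);
        mIPCOpen = true;
        AddRef();
    }

    // Assumption: the dealloc hook calls this rather than deleting the
    // object outright, so other strong references stay valid.
    void ReleaseIPCReference() {
        assert(mIPCOpen);
        mIPCOpen = false;
        Release();
    }

    bool IPCOpen() const { return mIPCOpen; }

private:
    // Private so that the object can only die through Release().
    ~ChannelActor() { *mDestroyed = true; }

    int mRefCnt = 0;
    bool mIPCOpen = false;
    bool* mDestroyed;  // lets the caller observe destruction safely
};

// Hypothetical document holding a strong reference to its channel.
struct Document {
    ChannelActor* mChannel = nullptr;

    void SetChannel(ChannelActor* channel) {
        channel->AddRef();
        mChannel = channel;
    }

    ~Document() {
        if (mChannel) {
            mChannel->Release();
        }
    }
};

// Returns true if the channel survives IPC teardown while the document
// holds it, and is destroyed only once the document lets go.
bool ChannelOutlivesIPCShutdown() {
    bool destroyed = false;
    ChannelActor* channel = new ChannelActor(&destroyed);
    bool aliveAfterShutdown = false;
    {
        Document doc;
        doc.SetChannel(channel);
        channel->AddIPCReference();

        // IPC shutdown: the actor tree drops its reference.
        channel->ReleaseIPCReference();
        aliveAfterShutdown = !destroyed;

        // The document can still safely touch mChannel at this point,
        // which is what a cycle collector traversal would need.
        assert(!doc.mChannel->IPCOpen());
    }
    return aliveAfterShutdown && destroyed;
}
```

In this sketch the order of teardown stops mattering: whichever owner releases last triggers the delete, so neither the traversal nor the document destructor sees a dangling mChannel.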
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → DUPLICATE