Closed Bug 620647 Opened 14 years ago Closed 14 years ago

plugin-container crashes after channel error

Categories

(Core :: IPC, defect)

x86
Linux
defect
Not set
normal

Tracking

()

RESOLVED FIXED

People

(Reporter: tatiana, Assigned: tatiana)

References

Details

(Keywords: crash)

Attachments

(2 files, 2 obsolete files)

Attached file [@cairo_surface_reference].bt (deleted) —
STEPS LEADING TO PROBLEM: 1. Open Fennec and navigate to www.qq.com. 2. Wait until page is loaded. 3. Close Fennec. EXPECTED OUTCOME: no crash observed ACTUAL OUTCOME: ###!!! [Child][SyncChannel] Error: Channel error: cannot send/recv plugin-container: cairo-surface.c:614: cairo_surface_reference: Assertion `((*&(&surface->ref_count)->ref_count) > 0)' failed.
Blocks: 619056
same happens sometimes in youtube when close Fennec while playing flash video
the situation is following, we are getting channel error and destroying PluginInstanceChild: #0 gfxSharedImageSurface::~gfxSharedImageSurface (this=0x97448, __in_chrg=<value optimized out>) at mozilla-central/gfx/thebes/gfxSharedImageSurface.cpp:102 #1 0x3c040e8c in gfxASurface::SurfaceDestroyFunc (data=0x97448) at mozilla-central/gfx/thebes/gfxASurface.cpp:131 #2 0x3df4cf58 in _cairo_user_data_array_fini (array=0xec0c0) at cairo-array.c:390 #3 0x3df8b2a8 in cairo_surface_destroy (surface=0xec0a0) at cairo-surface.c:650 #4 0x3c041594 in gfxASurface::Release (this=0x97448) at mozilla-central/gfx/thebes/gfxASurface.cpp:112 #5 0x3bcff5cc in ~nsRefPtr (this=0x78248, __in_chrg=<value optimized out>) at ../../dist/include/nsAutoPtr.h:969 #6 mozilla::plugins::PluginInstanceChild::~PluginInstanceChild (this=0x78248, __in_chrg=<value optimized out>) at mozilla-central/dom/plugins/PluginInstanceChild.cpp:196 #7 0x3bd03674 in mozilla::plugins::PluginModuleChild::DeallocPPluginInstance ( this=<value optimized out>, aActor=0x78248) at mozilla-central/dom/plugins/PluginModuleChild.cpp:1836 #8 0x3be185ec in mozilla::plugins::PPluginModuleChild::DeallocSubtree (this=0x28818) at PPluginModuleChild.cpp:1043 #9 0x3be1976c in mozilla::plugins::PPluginModuleChild::OnChannelError (this=0x28818) at PPluginModuleChild.cpp:792 #10 0x3bd1d284 in NotifyMaybeChannelError (this=0x28820) at mozilla-central/ipc/glue/AsyncChannel.cpp:346 #11 mozilla::ipc::AsyncChannel::OnNotifyMaybeChannelError (this=0x28820) at mozilla-central/ipc/glue/AsyncChannel.cpp:311 #12 0x3bd1d9e4 in DispatchToMethod<mozilla::ipc::AsyncChannel, void (mozilla::ipc::AsyncChannel::*)()> (this=<value optimized out>) at mozilla-central/ipc/chromium/src/base/tuple.h:383 #13 RunnableMethod<mozilla::ipc::AsyncChannel, void (mozilla::ipc::AsyncChannel::*)(), Tuple0>::Run ( this=<value optimized out>) at mozilla-central/ipc/chromium/src/base/task.h:307 #14 0x3bfd7000 in RunTask (this=0xaec6a7f0) at mozilla-central/ipc/chromium/src/base/message_loop.cc:344 #15 DeferOrRunPendingTask (this=0xaec6a7f0) at mozilla-central/ipc/chromium/src/base/message_loop.cc:352 #16 MessageLoop::DoWork (this=0xaec6a7f0) at mozilla-central/ipc/chromium/src/base/message_loop.cc:452 ... but we have pending InvalidateRectDelayed, which we didn't cancel, so we have crash when it's called for already destroyed PluginInstanceChild: TaskSwitcherWindow::closeEvent TaskSwitcherWindow::closeTab TaskSwitcherWindow::notifyObservers aBrowser: 0xa5eda8, aTopic: fennec-taskswitcher-window-closed TaskSwitcherWindow::notifyObservers Notifying observers with topic 'fennec-taskswitcher-window-closed' [time:1278066257604588][PContentParent] Sending Msg_PreferenceUpdate([TODO]) [time:1278066257815190][PBrowserParent] Sending Msg_Destroy([TODO]) [time:1278066258052312][PHttpChannelParent] Sending Msg_OnStatus([TODO]) [time:1278066258055119][PHttpChannelParent] Sending Msg_OnProgress([TODO]) [time:1278066258056493][PHttpChannelParent] Sending Msg_OnDataAvailable([TODO]) [time:1278066258074162][PContentParent] Sending Msg_SetOffline([TODO]) [time:1278066258080876][PHttpChannelParent] Sending Msg_OnStopRequest([TODO]) TaskSwitcherWindow::~TaskSwitcherWindow [time:1278066258270665][PContentChild] Received Msg_PreferenceUpdate([TODO]) [time:1278066258270970][PBrowserChild] Received Msg_Destroy([TODO]) [time:1278066258283971][PBrowserChild] Sending Msg_AsyncMessage([TODO]) [time:1278066258284581][PBrowserChild] Sending Msg_GetIMEEnabled([TODO]) [time:1278066258477757][PPluginModuleChild] Sending Msg_NPN_UserAgent([TODO]) [time:1278066258779210][PBrowserParent] Received Msg_AsyncMessage([TODO]) [time:1278066258779393][PBrowserParent] Received Msg_GetIMEEnabled([TODO]) [time:1278066258779424][PBrowserParent] Sending reply Reply_GetIMEEnabled([TODO]) [time:1278066258779759][PBrowserChild] Received reply Reply_GetIMEEnabled([TODO]) [time:1278066258779851][PBrowserChild] Sending Msg_SetIMEEnabled([TODO]) NOTE: child process received `Goodbye', closing down [time:1278066258810063][PBrowserChild] Sending Msg_AsyncMessage([TODO]) ###!!! [Child][AsyncChannel] Error: Channel closing: too late to send/recv, messages will be lost ###!!! [Child][RPCChannel] Error: Channel error: cannot send/recv [time:1278066258889867][PPluginInstanceChild] Sending Msg_PStreamNotifyConstructor([TODO]) ###!!! [Child][RPCChannel] Error: Channel error: cannot send/recv [time:1278066258932927][PPluginModuleChild] Sending Msg_NPN_UserAgent([TODO]) ###!!! [Child][RPCChannel] Error: Channel error: cannot send/recv [time:1278066258933324][PPluginInstanceChild] Sending Msg_PStreamNotifyConstructor([TODO]) ###!!! [Child][RPCChannel] Error: Channel error: cannot send/recv void mozilla::plugins::PluginInstanceChild::AsyncShowPluginFrame():2839 Plugin: 0xee460, Current: 0x6bab0, Back: 0x18d190 virtual mozilla::plugins::PluginInstanceChild::~PluginInstanceChild():196 Plugin: 0xd7fb0, Current: 0x103000, Back: 0x1b7b80 virtual gfxSharedImageSurface::~gfxSharedImageSurface():102 Surface: 0x1b7b80 virtual gfxSharedImageSurface::~gfxSharedImageSurface():102 Surface: 0x103000 virtual mozilla::plugins::PluginInstanceChild::~PluginInstanceChild():196 Plugin: 0xee460, Current: 0x6bab0, Back: 0x18d190 virtual gfxSharedImageSurface::~gfxSharedImageSurface():102 Surface: 0x18d190 virtual gfxSharedImageSurface::~gfxSharedImageSurface():102 Surface: 0x6bab0 void mozilla::plugins::PluginInstanceChild::PaintRectToSurface(const nsIntRect&, gfxASurface*, const gfxRGBA&):2602 Plugin: 0xee460, Surface: 0x6bab0, Current: 0x6bab0, Back: 0x18d190 plugin-container: cairo-surface.c:614: cairo_surface_reference: Assertion `((*&(&surface->ref_count)->ref_count) > 0)' failed.
Attached patch patch (obsolete) (deleted) — Splinter Review
Attachment #498986 - Flags: review?(jones.chris.g)
Attachment #498983 - Attachment mime type: application/octet-stream → text/plain
I just found out, it's not enough to cancel InvalidateRectDelayed, we still can get InvalidateRect updates from plugin after PluginInstanceChild is destroyed. Reproduced when closing qq.com: ###!!! [Child][AsyncChannel] Error: Channel closing: too late to send/recv, messages will be lost void mozilla::plugins::PluginInstanceChild::AsyncShowPluginFrame():2848 PluginInstanceChild: 0x770c8 void mozilla::plugins::PluginInstanceChild::InvalidateRectDelayed():2822 PluginInstanceChild: 0x770c8 void mozilla::plugins::PluginInstanceChild::InvalidateRectDelayed():2832 [time:1277996518647132][PPluginInstanceChild] Sending Msg_Show([TODO]) WARNING: pipe error (3): Connection reset by peer: file mozilla-central/ipc/chromium/src/chrome/common/ipc_channel_posix.cc, line 404 ###!!! [Child][SyncChannel] Error: Channel error: cannot send/recv void mozilla::plugins::PluginInstanceChild::AsyncShowPluginFrame():2848 PluginInstanceChild: 0x770c8 virtual mozilla::plugins::PluginInstanceChild::~PluginInstanceChild():199 virtual mozilla::plugins::PluginInstanceChild::~PluginInstanceChild():203 PluginInstanceChild: 0x770c8 virtual mozilla::plugins::PluginInstanceChild::~PluginInstanceChild():203 PluginInstanceChild: 0x100d38 virtual mozilla::plugins::PluginInstanceChild::~PluginInstanceChild():203 PluginInstanceChild: 0x119120 virtual mozilla::plugins::PluginInstanceChild::~PluginInstanceChild():203 PluginInstanceChild: 0x11ae88 virtual mozilla::plugins::PluginInstanceChild::~PluginInstanceChild():203 PluginInstanceChild: 0x12e0a0 virtual mozilla::plugins::PluginInstanceChild::~PluginInstanceChild():203 PluginInstanceChild: 0x137e98 virtual mozilla::plugins::PluginInstanceChild::~PluginInstanceChild():203 PluginInstanceChild: 0x155a10 void mozilla::plugins::PluginInstanceChild::AsyncShowPluginFrame():2848 PluginInstanceChild: 0x770c8 void mozilla::plugins::PluginInstanceChild::InvalidateRectDelayed():2822 PluginInstanceChild: 0x770c8 void mozilla::plugins::PluginInstanceChild::InvalidateRectDelayed():2832 plugin-container: cairo-surface.c:614: cairo_surface_reference: Assertion `((*&(&surface->ref_count)->ref_count) > 0)' failed.
Attachment #498986 - Attachment is obsolete: true
Attachment #498986 - Flags: review?(jones.chris.g)
Attachment #499262 - Flags: review?(jones.chris.g)
Thanks Tatiana. I'm pretty sure that what's going on here is that, first the chrome process shuts down and ends up triggering an _exit() in the content process. The _exit() is exactly what we want there, because the content process can't keep persistent state. But, this _exit() is triggering an abnormal channel close in the plugin subprocess, and that's not something we've cared much about yet, because previously it was only caused by firefox-bin crashing. (Async plugin painting will have made these symptoms more probable, but nothing fundamental has changed.) We don't _exit() here in plugin subprocesses because plugins can keep persistent state. The long-term plan, required for multiple content processes, is to have plugins as direct children of the chrome process. It's unclear how shutdown will work then. There are two mostly orthogonal issues raised (1) Do we care enough about firefox crashing out from under plugins (that have persistent state) to test those cases? (2) What's the minimum we should do to band-aid this problem while we have a single content process? (Multiple content processes will change all this.) If "yes" to (1), we should spin off a new bug, add tests to tickle the problems seen by Tatiana, and consider her patch there. For (2), I think we should just go with a temporary _exit(), like we do in content processes. We can use that temporary impl to see if any issues arise with plugin state in the wild, which would inform (1).
Comment on attachment 499262 [details] [diff] [review] patch v2: cancel InvalidateRectDelayed and NPP_Destroy on ~PluginInstanceChild See comment 7. Adding code like http://mxr.mozilla.org/mozilla-central/source/dom/ipc/ContentChild.cpp#424 to PluginModuleChild::ActorDestroy is simpler and a bit safer, wrt use-after-free bugs.
Attachment #499262 - Flags: review?(jones.chris.g) → review-
Attached patch patch v3: _exit() (deleted) — Splinter Review
Assignee: nobody → tanya.meshkova
Attachment #499262 - Attachment is obsolete: true
Attachment #499824 - Flags: review?(jones.chris.g)
Comment on attachment 499824 [details] [diff] [review] patch v3: _exit() Looks good to me. Requesting sr on the logic of comment 7. (Sorry roc, bsmedberg is on PTO ;) .)
Attachment #499824 - Flags: superreview?(roc)
Attachment #499824 - Flags: review?(jones.chris.g)
Attachment #499824 - Flags: review+
Attachment #499824 - Flags: superreview?(roc) → superreview+
Comment on attachment 499824 [details] [diff] [review] patch v3: _exit() I think we can safely push it in.
Attachment #499824 - Flags: approval2.0?
Turns out we even have tests that would have exposed this bug!
Blocks: 615386
Keywords: crash
Attachment #499824 - Flags: approval2.0? → approval2.0+
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: