Closed
Bug 532208
Opened 15 years ago
Closed 15 years ago
Plugins: Make PBrowserStream/PPluginStream destructors uni-directional
Categories
(Core :: IPC, defect)
Core
IPC
Tracking
()
RESOLVED
FIXED
People
(Reporter: cjones, Assigned: benjamin)
References
(Depends on 1 open bug)
Details
Attachments
(7 files, 2 obsolete files)
(deleted),
text/plain
|
Details | |
(deleted),
patch
|
bent.mozilla
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
cjones
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
bent.mozilla
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
cjones
:
review-
|
Details | Diff | Splinter Review |
(deleted),
patch
|
Details | Diff | Splinter Review | |
(deleted),
patch
|
benjamin
:
review+
|
Details | Diff | Splinter Review |
According to bsmedberg: "PBrowserStream/PPluginStream may not need a bi-directional destructor, I misunderstood how NPP_Write behaves." I suspect that they're only sent in one direction currently, and if so this would be an IPDL-only change.
Reporter | ||
Comment 1•15 years ago
|
||
(In reply to comment #0) > I suspect that they're only sent in one direction currently, and if so this > would be an IPDL-only change. Nope :(. https://developer.mozilla.org/en/NPN_DestroyStream
Reporter | ||
Comment 2•15 years ago
|
||
From reading the docs a little closer, I think that NPP_DestroyStream() really means "NPP_NotifyDestroyStream()", and that the browser always owns the underlying "real" NPStream. So I think all dtors can go plugin-->browser. jgriffin/bsmedberg, does that sound correct?
Reporter | ||
Comment 3•15 years ago
|
||
From a third read of the docs it seemed to me that for BrowserStream, NPN_Destroy() ~= "done, close, delete" and NPP_Destroy() ~= "cancel" (and similarly for PluginStream). So that's what I've implemented, though there are remaining kinks to work out.
Assignee | ||
Comment 4•15 years ago
|
||
That's definitely incorrect. If anything, NPN_Destroy means cancel and NPP_Destroy means "done close delete" for browser streams. It could be reversed for plugin streams, I'm not sure.
Reporter | ||
Comment 5•15 years ago
|
||
Oops, I actually did mean to say NPP_Destroy = "done close delete" and NPN_Destroy = "cancel". Basically the NP*_Destroy matched with the NP*_New is the "done close delete", the other is "cancel".
Assignee | ||
Comment 6•15 years ago
|
||
This is how I think we can make PBrowserStream stateful. In this protocol, nothing races with __delete__ (even discard messages, I think).
Assignee: nobody → benjamin
Attachment #416173 -
Flags: review?(jones.chris.g)
Reporter | ||
Comment 8•15 years ago
|
||
Comment on attachment 421754 [details] valgrind log showing stream errors > (processing deferred in-call) >==25042== Invalid read of size 4 >==25042== at 0x6A7EED6: mozilla::plugins::PBrowserStreamChild::OnCallReceived(IPC::Message const&, IPC::Message*&) (PBrowserStreamChild.cpp:294) >==25042== by 0x6A6DCE4: mozilla::plugins::PPluginModuleChild::OnCallReceived(IPC::Message const&, IPC::Message*&) (PPluginModuleChild.cpp:375) >==25042== by 0x6A18B2B: mozilla::ipc::RPCChannel::DispatchIncall(IPC::Message const&) (RPCChannel.cpp:347) >==25042== by 0x6A18A44: mozilla::ipc::RPCChannel::Incall(IPC::Message const&, unsigned long) (RPCChannel.cpp:332) >==25042== by 0x6A18374: mozilla::ipc::RPCChannel::MaybeProcessDeferredIncall() (RPCChannel.cpp:216) >==25042== by 0x6A18662: mozilla::ipc::RPCChannel::OnMaybeDequeueOne() (RPCChannel.cpp:257) >==25042== by 0x6A1B351: void DispatchToMethod<mozilla::ipc::RPCChannel, void (mozilla::ipc::RPCChannel::*)()>(mozilla::ipc::RPCChannel*, void (mozilla::ipc::RPCChannel::*)(), Tuple0 const&) (tuple.h:383) >==25042== by 0x6A1B1A7: RunnableMethod<mozilla::ipc::RPCChannel, void (mozilla::ipc::RPCChannel::*)(), Tuple0>::Run() (task.h:307) >==25042== by 0x6ACE0A1: MessageLoop::RunTask(Task*) (message_loop.cc:326) >==25042== by 0x6ACE111: MessageLoop::DeferOrRunPendingTask(MessageLoop::PendingTask const&) (message_loop.cc:334) >==25042== by 0x6ACE50F: MessageLoop::DoWork() (message_loop.cc:434) >==25042== by 0x6A169F0: mozilla::ipc::DoWorkRunnable::Run() (MessagePump.cpp:75) >==25042== Address 0x13c7f7d8 is 24 bytes inside a block of size 144 free'd >==25042== at 0x4C24A7A: operator delete(void*) (vg_replace_malloc.c:346) >==25042== by 0x6A0F974: mozilla::plugins::BrowserStreamChild::~BrowserStreamChild() (BrowserStreamChild.h:63) >==25042== by 0x69FAD34: mozilla::plugins::PluginInstanceChild::DeallocPBrowserStream(mozilla::plugins::PBrowserStreamChild*) (PluginInstanceChild.cpp:914) >==25042== by 0x6A7EEA6: mozilla::plugins::PBrowserStreamChild::OnCallReceived(IPC::Message const&, IPC::Message*&) (PBrowserStreamChild.cpp:292) >==25042== by 0x6A6DCE4: mozilla::plugins::PPluginModuleChild::OnCallReceived(IPC::Message const&, IPC::Message*&) (PPluginModuleChild.cpp:375) >==25042== by 0x6A18B2B: mozilla::ipc::RPCChannel::DispatchIncall(IPC::Message const&) (RPCChannel.cpp:347) >==25042== by 0x6A18A44: mozilla::ipc::RPCChannel::Incall(IPC::Message const&, unsigned long) (RPCChannel.cpp:332) >==25042== by 0x6A18374: mozilla::ipc::RPCChannel::MaybeProcessDeferredIncall() (RPCChannel.cpp:216) >==25042== by 0x6A18662: mozilla::ipc::RPCChannel::OnMaybeDequeueOne() (RPCChannel.cpp:257) >==25042== by 0x6A1B351: void DispatchToMethod<mozilla::ipc::RPCChannel, void (mozilla::ipc::RPCChannel::*)()>(mozilla::ipc::RPCChannel*, void (mozilla::ipc::RPCChannel::*)(), Tuple0 const&) (tuple.h:383) >==25042== by 0x6A1B1A7: RunnableMethod<mozilla::ipc::RPCChannel, void (mozilla::ipc::RPCChannel::*)(), Tuple0>::Run() (task.h:307) >==25042== by 0x6ACE0A1: MessageLoop::RunTask(Task*) (message_loop.cc:326) >==25042== > I don't have the NPAPI log from before this error, but it appears to me that parent and child calls raced, the child won, and the child's out-call eventually led to the deletion of this BrowserStream. Then the parent's deferred call was processed, and was directed at this dead stream. This is a problem with RPC race resolution that I nebulously hinted at in bug 533002 comment 4. I've got an in-progress newsgroup post describing some possible fixes, but I haven't had time to sit down for a good think on it yet. >==25042== Invalid read of size 4 >==25042== at 0x6A80DE3: mozilla::plugins::PStreamNotifyChild::OnCallReceived(IPC::Message const&, IPC::Message*&) (PStreamNotifyChild.cpp:117) >==25042== by 0x6A6DCE4: mozilla::plugins::PPluginModuleChild::OnCallReceived(IPC::Message const&, IPC::Message*&) (PPluginModuleChild.cpp:375) >==25042== by 0x6A18B2B: mozilla::ipc::RPCChannel::DispatchIncall(IPC::Message const&) (RPCChannel.cpp:347) >==25042== by 0x6A18A44: mozilla::ipc::RPCChannel::Incall(IPC::Message const&, unsigned long) (RPCChannel.cpp:332) >==25042== by 0x6A186FA: mozilla::ipc::RPCChannel::OnMaybeDequeueOne() (RPCChannel.cpp:267) >==25042== by 0x6A1B351: void DispatchToMethod<mozilla::ipc::RPCChannel, void (mozilla::ipc::RPCChannel::*)()>(mozilla::ipc::RPCChannel*, void (mozilla::ipc::RPCChannel::*)(), Tuple0 const&) (tuple.h:383) >==25042== by 0x6A1B1A7: RunnableMethod<mozilla::ipc::RPCChannel, void (mozilla::ipc::RPCChannel::*)(), Tuple0>::Run() (task.h:307) >==25042== by 0x6ACE0A1: MessageLoop::RunTask(Task*) (message_loop.cc:326) >==25042== by 0x6ACE111: MessageLoop::DeferOrRunPendingTask(MessageLoop::PendingTask const&) (message_loop.cc:334) >==25042== by 0x6ACE50F: MessageLoop::DoWork() (message_loop.cc:434) >==25042== by 0x6A169F0: mozilla::ipc::DoWorkRunnable::Run() (MessagePump.cpp:75) >==25042== by 0x6BD1718: nsThread::ProcessNextEvent(int, int*) (nsThread.cpp:527) >==25042== Address 0x130cf9a8 is 24 bytes inside a block of size 64 free'd >==25042== at 0x4C24A7A: operator delete(void*) (vg_replace_malloc.c:346) >==25042== by 0x69FB5CB: mozilla::plugins::StreamNotifyChild::~StreamNotifyChild() (StreamNotifyChild.h:48) >==25042== by 0x69FAEF2: mozilla::plugins::PluginInstanceChild::DeallocPStreamNotify(mozilla::plugins::PStreamNotifyChild*) (PluginInstanceChild.cpp:970) >==25042== by 0x6A80DB9: mozilla::plugins::PStreamNotifyChild::OnCallReceived(IPC::Message const&, IPC::Message*&) (PStreamNotifyChild.cpp:115) >==25042== by 0x6A6DCE4: mozilla::plugins::PPluginModuleChild::OnCallReceived(IPC::Message const&, IPC::Message*&) (PPluginModuleChild.cpp:375) >==25042== by 0x6A18B2B: mozilla::ipc::RPCChannel::DispatchIncall(IPC::Message const&) (RPCChannel.cpp:347) >==25042== by 0x6A18A44: mozilla::ipc::RPCChannel::Incall(IPC::Message const&, unsigned long) (RPCChannel.cpp:332) >==25042== by 0x6A186FA: mozilla::ipc::RPCChannel::OnMaybeDequeueOne() (RPCChannel.cpp:267) >==25042== by 0x6A1B351: void DispatchToMethod<mozilla::ipc::RPCChannel, void (mozilla::ipc::RPCChannel::*)()>(mozilla::ipc::RPCChannel*, void (mozilla::ipc::RPCChannel::*)(), Tuple0 const&) (tuple.h:383) >==25042== by 0x6A1B1A7: RunnableMethod<mozilla::ipc::RPCChannel, void (mozilla::ipc::RPCChannel::*)(), Tuple0>::Run() (task.h:307) >==25042== by 0x6ACE0A1: MessageLoop::RunTask(Task*) (message_loop.cc:326) >==25042== by 0x6ACE111: MessageLoop::DeferOrRunPendingTask(MessageLoop::PendingTask const&) (message_loop.cc:334) >==25042== > This appears to be a bug in IPDL-generated code; apparently the invalid read is of an |mId| of the stream that was just deleted. I'm surprised I haven't seen this in IPDL testing; I think it's because none of the tests use synchronous destructors. I'll cook up a test case to confirm. Spun off into bug 539856.
Reporter | ||
Updated•15 years ago
|
Blocks: LorentzBeta1
Updated•15 years ago
|
Blocks: LorentzAlpha
Assignee | ||
Comment 9•15 years ago
|
||
IPDL doesn't have the state machine completely hooked up, but this is a stateful protocol where the states are currently implemented by hand. Browser stream data is now delivered completely asynchronously. Apart from some horrible bugs in the plugin host and an IPDL message delivery ordering bug (both marked as dependencies), this should be good to go. It passes mochitests with the plugnhost patch.
Attachment #416173 -
Attachment is obsolete: true
Attachment #428662 -
Flags: review?(bent.mozilla)
Attachment #416173 -
Flags: review?(jones.chris.g)
Assignee | ||
Updated•15 years ago
|
Comment on attachment 428662 [details] [diff] [review] Make browser stream data delivery asynchronous, rev. 1 >+BrowserStreamChild::RecvWrite(const int32_t& offset, > ... >+ NS_ASSERTION(ALIVE == mState, "Received data after NPP_DestroyStream"); Let's abort on child-side state inconsistencies? >+ mInstance->mPluginIface->destroystream(&mInstance->mData, &mStream, reason); Do we not care about plugin errors here? What does pluginhost do here? >+ if (!r) { > ... >+ if (r == 0) { Nit: pick a style ;) >+static const int32_t kSendDataChunk = 0x1000; Nit: Can this hang out at the top of the file? I feel like that's where most people look for constants like this. > BrowserStreamParent::Write(int32_t offset, > int32_t len, > void* buffer) Is it worth asserting that len > 0?
Attachment #428662 -
Flags: review?(bent.mozilla) → review+
Reporter | ||
Comment 12•15 years ago
|
||
Comment on attachment 428886 [details] [diff] [review] DeliverData must not re-enter, rev. 1 Some of these invariants want comments wanted, but shipping alpha >> comments.
Attachment #428886 -
Flags: review?(jones.chris.g) → review+
Reporter | ||
Comment 13•15 years ago
|
||
The invariants as I understand them are - when NPP_Write() returns, a DeliverData task is in the event queue - when DeliverData returns, !mPending.empty() => there is a DeliverData task in the event queue I don't see those violated.
Assignee | ||
Comment 14•15 years ago
|
||
Assignee | ||
Comment 15•15 years ago
|
||
Comment on attachment 428982 [details] [diff] [review] DeliverData must not reenter, rev. 2 This patch suspends not just DeliverData, but also NPP_DestroyStream until data delivery is complete (or the plugin cancels it). jgriffin's stream tests were a lifesaver! For the record, this got landed and backed out last night because of async delivery and re-entrancy issues which reprented at least three different bugs in the first patch: DeliverData could re-enter (bad) event when DeliverData couldn't re-enter, merely appending to mPendingData could invalidate the reference we held to mPendingData[0] RecvNPP_DestroyStream could nest inside of DeliverData while there was pending data, which is both unexpected NPAPI behavior and also caused us to delete the actor while it was still on the stack
Attachment #428982 -
Attachment description: De → DeliverData must not reenter, rev. 2
Attachment #428982 -
Attachment is patch: true
Attachment #428982 -
Attachment mime type: application/octet-stream → text/plain
Attachment #428982 -
Flags: review?(bent.mozilla)
Comment on attachment 428982 [details] [diff] [review] DeliverData must not reenter, rev. 2 >+BrowserStreamChild::MaybeDeliverNPP_DestroyStream() > ... >+ NPReason reason = mDestroyPending; You don't seem to use this. >+ * stack frame. It instead posts a runnable using this tracker to cancel >+ * in case we are destroyed. >+ */ >+ ScopedRunnableMethodFactory<BrowserStreamChild> mDeliverDataTracker; I don't see where you would cancel this... Maybe you can avoid the tracker?
Updated•15 years ago
|
Attachment #428982 -
Flags: review?(bent.mozilla) → review+
Reporter | ||
Updated•15 years ago
|
Assignee | ||
Comment 17•15 years ago
|
||
This patch makes the destructor unidirectional without giving the protocol async semantics. Async semantics rev. 2 still has problems with NPN_URLNotify being delivered before NPP_DestroyStream.
Attachment #429040 -
Flags: review?(jones.chris.g)
Assignee | ||
Comment 18•15 years ago
|
||
If you get a chance, please land this along with the test I will attach and re-land bug 548217.
Assignee | ||
Comment 19•15 years ago
|
||
Reporter | ||
Comment 20•15 years ago
|
||
Comment on attachment 429040 [details] [diff] [review] RPC but unidirectional, rev. 1 This patch is still susceptible to a use-after-__delete__ hazard, I think. Namely, Parent Child --------- ------- CallWrite CallDestroy AnswerDestroy [defer Write] <- NPN_Destroy -> NPP_Destroy (?) Call__dealloc__ delete stream delete stream ... ... [process deferred] AnswerWrite BOOM This might be better in that the crash would likely happen in the plugin process, and we would error out of the C++ stack frame on which the deleted BrowserStreamParent actor made the out-call. I think we can work around this by - adding an NPP_Destroy() message - having that enqueue the stream for __delete__ before CallNPP_Destroy() - having AnswerNPP_Destroy call iface->NPP_Destroy - guarding re-entry of Write et al. with an mClosed flag - override OnExitedCxxStack() to flush the streamDeleteQueue, calling __delete__ The patch is pretty straightforward, but I don't think I'll be able to test it locally due to remote-X latency running firefox. If that doesn't work, I'll push it pending-r? to the e10s "tryserver" and if it doesn't flame, maybe we can transplant over tomorrow or Saturday.
Attachment #429040 -
Flags: review?(jones.chris.g) → review-
Reporter | ||
Comment 21•15 years ago
|
||
This is the basic idea, but it's failing the new mochitest-ipcplugins test, apparently because streams aren't being removed from the to-delete array as I expect them to be. In the hotel, my debugging cycle is around half an hour, so I'll go ahead and put this up for review and fix the bug tomorrow.
Attachment #429081 -
Flags: review?(benjamin)
Assignee | ||
Comment 22•15 years ago
|
||
Chris, the latest patch wasn't meant to be purely safe, just good enough for beta... the actual likelihood of race conditions happening is pretty small in this case... I'll look over this patch, but in the interest of starting alpha builds today I'd like to take the not-perfect patch if there continue to be problems with yours.
Assignee | ||
Comment 23•15 years ago
|
||
Comment on attachment 429081 [details] [diff] [review] Delete BrowserStreams off of ExitedCxxStack() callbacks The DeleteOnStackExit/DontDelete machinery seems really strange to me. I would expect that problem to normally be solved by posting a cancelable event and canceling the event in the destructor.
Attachment #429081 -
Flags: review?(benjamin) → review-
Reporter | ||
Comment 24•15 years ago
|
||
I don't feel comfortable with "RPC but unidirectional"; I think it's still susceptible to several bugs already on file. Can we give a "delete event" version of the latest patch a chance? I leave for home around noonish, I'm willing to "call it" then.
Reporter | ||
Comment 25•15 years ago
|
||
Passes mochitest-ipcplugins (along with the new test). Additionally "fixes" bug 543917 and presumably its likely dups.
Attachment #429081 -
Attachment is obsolete: true
Attachment #429148 -
Flags: review?(benjamin)
Assignee | ||
Updated•15 years ago
|
Attachment #429148 -
Flags: review?(benjamin) → review+
Reporter | ||
Comment 26•15 years ago
|
||
http://hg.mozilla.org/mozilla-central/rev/f829f942873d http://hg.mozilla.org/mozilla-central/rev/62af68860e7f Leaving open in case we can get async BrowserStream under the wire.
Assignee | ||
Comment 28•15 years ago
|
||
I'll mark this FIXED and do the async delivery in a followup.
Status: NEW → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
You need to log in
before you can comment on or make changes to this bug.
Description
•