Closed
Bug 1134252
Opened 9 years ago
Closed 9 years ago
[e10s] PBrowserChild::SendGetRenderFrameInfo fails with 'message was deserialized, but contained an illegal value'
Categories
(Core :: Graphics, defect)
Tracking
()
RESOLVED
FIXED
mozilla42
People
(Reporter: jimm, Assigned: mconley)
References
Details
(Whiteboard: gfx-noted)
Attachments
(2 files)
(deleted),
text/x-review-board-request
|
billm
:
review+
ritu
:
approval-mozilla-aurora+
|
Details |
(deleted),
application/zip
|
Details |
http://mxr.mozilla.org/mozilla-central/source/dom/ipc/TabParent.cpp#2327 KillHard child signature breakdown: ------------------------------------ 21 14.9% mozilla::dom::PBrowserChild::SendGetRenderFrameInfo(mozilla::layout::PRenderFrameChild *,mozilla::layout::ScrollingBehavior IPC error breakdown per KillHard signature: -------------------------------------------- mozilla::dom::PBrowserChild::SendGetRenderFrameInfo(mozilla::layout::PRenderFrameChild *,mozilla::layout::ScrollingBehavior '(msgtype=0x200002,name=PBrowser::Msg_PRenderFrameConstructor) Value error: message was deserialized, but contained an illegal value'
Updated•9 years ago
|
tracking-e10s:
--- → m7+
Reporter | ||
Comment 1•9 years ago
|
||
http://www.mathies.com/mozilla/bug1116884.txt This is pretty low volume, but it represents an abort in the content process when it happens, and the error message indicates we should be able to avoid this.
Blocks: 1111396
Updated•9 years ago
|
Whiteboard: gfx-noted
Reporter | ||
Updated•9 years ago
|
Assignee: nobody → jmathies
Updated•9 years ago
|
Reporter | ||
Comment 2•9 years ago
|
||
top killhard crasher on aurora. This was already m8, reflaging so we can prioritize it.
Updated•9 years ago
|
Assignee: jmathies → mconley
Assignee | ||
Comment 3•9 years ago
|
||
So I think the problem here is that we're returning nullptr when attempting to allocate the RenderFrameParent in ContentParent: PRenderFrameParent* TabParent::AllocPRenderFrameParent() { MOZ_ASSERT(ManagedPRenderFrameParent().IsEmpty()); nsRefPtr<nsFrameLoader> frameLoader = GetFrameLoader(); TextureFactoryIdentifier textureFactoryIdentifier; uint64_t layersId = 0; bool success = false; if(frameLoader) { PRenderFrameParent* renderFrame = new RenderFrameParent(frameLoader, &textureFactoryIdentifier, &layersId, &success); MOZ_ASSERT(success); AddTabParentToTable(layersId, this); return renderFrame; } else { return nullptr; // <-- I think we're hitting this } } And the only reason we'd hit that is if frameLoader is nullptr: already_AddRefed<nsFrameLoader> TabParent::GetFrameLoader(bool aUseCachedFrameLoaderAfterDestroy) const { if (mIsDestroyed && !aUseCachedFrameLoaderAfterDestroy) { return nullptr; } if (mFrameLoader) { nsRefPtr<nsFrameLoader> fl = mFrameLoader; return fl.forget(); } nsCOMPtr<nsIFrameLoaderOwner> frameLoaderOwner = do_QueryInterface(mFrameElement); return frameLoaderOwner ? frameLoaderOwner->GetFrameLoader() : nullptr; } ^-- either the TabParent has already been destroyed, or we can't resolve the frame element to a frameLoaderOwner. So that's my current working theory. Not 100% sure what to do about it just yet.
Assignee | ||
Comment 4•9 years ago
|
||
The best I can come up with is that in some cases, the TabParent has started to destroy before the content process has a chance to allocate the RenderFrameParent. I _think_ there's a small window of time where that's possible, after the top-level window has been created, and before the parent responds to AllocPRenderFrameParent. Does that sound plausible? Any suggestions on what to do if that's the case, billm?
Flags: needinfo?(wmccloskey)
Your theory sounds correct. I remember hitting a situation like this a while ago but I don't remember the testcase. Perhaps you could try removing the branch here: http://mxr.mozilla.org/mozilla-central/source/dom/ipc/TabParent.cpp#2615 It looks like the RenderFrameParent might handle the case where the frameloader is null. The alternative is to change IPDL so it somehow allows null results from AllocPFoo functions, but that will be a lot harder.
Flags: needinfo?(wmccloskey)
Assignee | ||
Comment 6•9 years ago
|
||
Bug 1134252 - Don't crash the content process if RenderFrameParent is not constructed successfully. r?kats
Assignee | ||
Comment 7•9 years ago
|
||
Comment on attachment 8636808 [details] MozReview Request: Bug 1134252 - Don't crash the content process if RenderFrameParent is not constructed successfully. r?kats I don't really know how to feel about this patch. We can't return nullptr for AllocPRenderFrameParent (or any AllocPFoo, really) without the content process crashing, so we have to return a real pointer... But it looks like there are cases where the RenderFrameParent might fail to properly construct - like, the mFrameLoader might be nullptr or something. This patch makes it so that, in those cases, the child can detect it via the follow-up call to SendGetRenderFrameInfo, which, after receiving 0 as the layer ID, will tear down the RenderFrameParent/Child pair. The good news is that we no longer crash the content process. The bad news is that in this case, instead of crashing, we just fail to draw anything in the content area. I guess that's a bit better, but it's really not much better. I don't know what else to do in this case... is there a better way to recover from the possibility that mFrameLoader is nullptr?
Attachment #8636808 -
Flags: feedback?(wmccloskey)
Attachment #8636808 -
Flags: feedback?(bugmail.mozilla)
Comment 8•9 years ago
|
||
Comment on attachment 8636808 [details] MozReview Request: Bug 1134252 - Don't crash the content process if RenderFrameParent is not constructed successfully. r?kats Unfortunately I don't think I'm qualified to comment on this patch. My knowledge of RenderFrameParent is pretty much limited to the parts that deal with APZ, and this is not one of those parts :/
Attachment #8636808 -
Flags: feedback?(bugmail.mozilla)
Assignee | ||
Comment 9•9 years ago
|
||
Comment on attachment 8636808 [details] MozReview Request: Bug 1134252 - Don't crash the content process if RenderFrameParent is not constructed successfully. r?kats Hey fabrice, kats said that you might be a good person to ask my RenderFrameParent question to. Got any opinion on the above?
Attachment #8636808 -
Flags: feedback?(fabrice)
Comment on attachment 8636808 [details] MozReview Request: Bug 1134252 - Don't crash the content process if RenderFrameParent is not constructed successfully. r?kats https://reviewboard.mozilla.org/r/13717/#review12477 This is what I was thinking. No idea if it will work though :-). ::: dom/ipc/TabParent.cpp:2635 (Diff revision 1) > - MOZ_ASSERT(success); > + MOZ_ASSERT(success); Let's remove this line.
Attachment #8636808 -
Flags: review+
Assignee | ||
Comment 11•9 years ago
|
||
Comment on attachment 8636808 [details] MozReview Request: Bug 1134252 - Don't crash the content process if RenderFrameParent is not constructed successfully. r?kats Thanks! billm assuaged my fears over Vidyo, so feedback requests dropped.
Attachment #8636808 -
Flags: feedback?(wmccloskey)
Attachment #8636808 -
Flags: feedback?(fabrice)
Assignee | ||
Comment 12•9 years ago
|
||
url: https://hg.mozilla.org/integration/mozilla-inbound/rev/7aeee9f7969d87dafd3ebdec94a0af914c90fee6 changeset: 7aeee9f7969d87dafd3ebdec94a0af914c90fee6 user: Mike Conley <mconley@mozilla.com> date: Tue Jul 21 17:34:36 2015 -0400 description: Bug 1134252 - Don't crash the content process if RenderFrameParent is not constructed successfully. r=billm. We were returning a nullptr from AllocPRenderFrameParent in TabParent, which causes a killhard abort in the child. We suspect this is occurring because the TabParent is attempting to kick off drawing in a tab that's already closed (so there is no frame loader, which means we can't create a PRenderFrameParent). So now, we return a PRenderFrameParent* even if constructing it was unsuccessful, and the child destroys it once it confirms that there is an invalid layer ID associated with the RenderFrame.
Comment 13•9 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/7aeee9f7969d
Status: NEW → RESOLVED
Closed: 9 years ago
status-firefox42:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla42
Assignee | ||
Comment 14•9 years ago
|
||
Needinfo'ing myself to get this uplifted to Aurora after the weekend.
Flags: needinfo?(mconley)
Assignee | ||
Comment 15•9 years ago
|
||
[Tracking Requested - why for this release]: This is another content process crasher that we think we've found the solution for. This is e10s only, but as there are a good number of users opting into using e10s on Dev Edition, I figure it might be worth tracking and uplifting.
Assignee | ||
Comment 16•9 years ago
|
||
Comment on attachment 8636808 [details] MozReview Request: Bug 1134252 - Don't crash the content process if RenderFrameParent is not constructed successfully. r?kats Approval Request Comment [Feature/regressing bug #]: None. [User impact if declined]: More chance of the content process crashing (thus crashing all tabs) for users with e10s enabled. [Describe test coverage new/current, TreeHerder]: This code is exercised quite a bit by anything that creates or destroys remote browsers, which we do quite a bit in our testing suite. We've seen no issues in those tests. This also had a chance to bake over the weekend. [Risks and why]: Very low. In the event that we would have normally killed the content process, what we do instead is pass back a PRenderFrame to the content process with a layer ID of 0, which it understands as "invalid". It then destroys that PRenderFrame. The end result is that there is no ability to render anything in the content area for the associated TabParent/TabChild, but that's alright - we should only get into this case if the tab/browser was already being removed anyway. [String/UUID change made/needed]: None.
Attachment #8636808 -
Flags: approval-mozilla-aurora?
Comment on attachment 8636808 [details] MozReview Request: Bug 1134252 - Don't crash the content process if RenderFrameParent is not constructed successfully. r?kats Appears safe as it has been in m-c for a few days. Let's uplift to Aurora.
Attachment #8636808 -
Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Assignee | ||
Comment 19•8 years ago
|
||
bugnotes |
http://people.mozilla.org/~mconley2/bugnotes/bug-1134252.html
You need to log in
before you can comment on or make changes to this bug.
Description
•