Closed
Bug 1322633
Opened 8 years ago
Closed 8 years ago
Intermittent various tests | application crashed [@ mozilla::layers::CrossProcessCompositorBridgeParent::AllocPAPZCTreeManagerParent]
Categories
(Core :: Panning and Zooming, defect, P3)
Tracking
()
RESOLVED
FIXED
mozilla54
People
(Reporter: intermittent-bug-filer, Assigned: kats)
References
Details
(Keywords: crash, intermittent-failure, Whiteboard: [gfx-noted])
Crash Data
Attachments
(1 file)
(deleted),
patch
|
dvander
:
review+
gchang
:
approval-mozilla-aurora+
jcristau
:
approval-mozilla-beta+
|
Details | Diff | Splinter Review |
Filed by: philringnalda [at] gmail.com
https://treeherder.mozilla.org/logviewer.html#?job_id=7745087&repo=autoland
https://archive.mozilla.org/pub/firefox/tinderbox-builds/autoland-linux-debug/1481249930/autoland_ubuntu32_vm-debug_test-mochitest-e10s-browser-chrome-6-bm07-tests1-linux32-build264.txt.gz
Assignee | ||
Comment 1•8 years ago
|
||
Crash appears to be at https://hg.mozilla.org/integration/autoland/file/3be54c62e0011dca5367be2a04e54bc6882f1908/gfx/layers/ipc/CrossProcessCompositorBridgeParent.cpp#l138
OS: Unspecified → Linux
Priority: -- → P3
Hardware: Unspecified → x86
Whiteboard: [gfx-noted]
Version: unspecified → 53 Branch
Assignee | ||
Comment 2•8 years ago
|
||
Seeing this on other tests too, like test_navigation.py TestNavigate.test_about_blank_for_new_docshell (Linux debug Mn job)
Summary: Intermittent browser/components/sessionstore/test/browser_windowStateContainer.js | application crashed [@ mozilla::layers::CrossProcessCompositorBridgeParent::AllocPAPZCTreeManagerParent] → Intermittent various tests | application crashed [@ mozilla::layers::CrossProcessCompositorBridgeParent::AllocPAPZCTreeManagerParent]
Comment hidden (Intermittent Failures Robot) |
Comment 4•8 years ago
|
||
Affects 52 as well:
https://treeherder.mozilla.org/logviewer.html#?job_id=71637938&repo=mozilla-beta
status-firefox52:
--- → affected
status-firefox53:
--- → affected
status-firefox54:
--- → affected
Keywords: crash
Assignee | ||
Updated•8 years ago
|
Assignee | ||
Updated•8 years ago
|
Comment 7•8 years ago
|
||
Firefox UI tests are also affected:
https://treeherder.mozilla.org/#/jobs?repo=autoland&filter-searchStr=Firefox%20UI&bugfiler&selectedJob=71836529
The crash there happens when opening and closing browser windows.
Severity: normal → critical
Assignee | ||
Comment 9•8 years ago
|
||
(In reply to Henrik Skupin (:whimboo) from comment #7)
> Firefox UI tests are also affected:
> https://treeherder.mozilla.org/#/jobs?repo=autoland&filter-
> searchStr=Firefox%20UI&bugfiler&selectedJob=71836529
https://treeherder.mozilla.org/#/jobs?repo=autoland&revision=28765570b2933291117323689668a4ce99c03da8&selectedJob=71836529 is a better url.
It seems like all the crashes are happening on python-based tests. I'm not sure why. I've been trying to reproduce this locally but haven't had any success. If I can't get it today I'll try to just land some logging and wait for it to hit in automation.
Assignee | ||
Updated•8 years ago
|
Assignee: nobody → bugmail
Assignee | ||
Comment 10•8 years ago
|
||
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #9)
> It seems like all the crashes are happening on python-based tests.
Scratch that, it happens on browser-chrome as well, which I missed.
Assignee | ||
Comment 11•8 years ago
|
||
I spent some time looking at the code. Because the crash is happening in CrossProcessCompositorBridgeParent::AllocPAPZCTreeManagerParent, the content process must be at [1]. That function in turn is called at [2], just a little bit after the PLayerTransaction constructor is sent across the same channel. If we look at the parent side of that PLayerTransaction constructor, it does a similar kind of check that AllocPAPZCTreeManagerParent does, but has a fallback path [3] to handle the case where the layers id is not found in sIndirectLayerTrees. This fallback path looks like a bogus thing based on the comment there and bug 900745. If we remove that bogus thing then presumably the crash will move up to that location instead of crashing in AllocPAPZCTreeManagerParent. I don't know if that's helpful but at least it's a more honest crash in that the right code will get blamed. :)
[1] http://searchfox.org/mozilla-central/rev/8fa84ca6444e2e01fb405e078f6d2c8da0e55723/dom/ipc/TabChild.cpp#2638
[2] http://searchfox.org/mozilla-central/rev/8fa84ca6444e2e01fb405e078f6d2c8da0e55723/dom/ipc/TabChild.cpp#2617
[3] http://searchfox.org/mozilla-central/rev/8fa84ca6444e2e01fb405e078f6d2c8da0e55723/gfx/layers/ipc/CrossProcessCompositorBridgeParent.cpp#97
Depends on: 900745
Assignee | ||
Comment 13•8 years ago
|
||
I have patches for bug 900745 and this one that I think will fix this crash. Try push is at https://treeherder.mozilla.org/#/jobs?repo=try&revision=a2d662e65982b2fa92bffd3ccbdfe0bc12b54b81.
Assignee | ||
Comment 14•8 years ago
|
||
The main change in this patch is moving the InitAPZState() call inside the if (success) guard, so that we don't call it if the PLayerTransaction construction wasn't successful (see bug 900745, which this patch depends on). The rest of the patch is just moving a hunk of code so that mLayersId is still set before the call to InitAPZState.
I'll file a follow-up bug for the graphics branch which will need a call to InitAPZState() inserted at https://hg.mozilla.org/projects/graphics/file/d52552ae7dcf/dom/ipc/TabChild.cpp#l2608 for the WR layermanager case.
Attachment #8831570 -
Flags: review?(dvander)
Comment hidden (Intermittent Failures Robot) |
bug 900745 looks super ancient. Why exactly do we fail to create a LayerTransactionParent?
Comment on attachment 8831570 [details] [diff] [review]
Don't call InitAPZState if PLayerTransaction creation failed
Review of attachment 8831570 [details] [diff] [review]:
-----------------------------------------------------------------
Actually if we're opening and closing a window I guess maybe we could deallocate the layer tree ID before the tab is shown?
Attachment #8831570 -
Flags: review?(dvander) → review+
Assignee | ||
Comment 18•8 years ago
|
||
(In reply to David Anderson [:dvander] from comment #17)
> Actually if we're opening and closing a window I guess maybe we could
> deallocate the layer tree ID before the tab is shown?
I guess I never said that anywhere, but yes - that seems to be the only way in which this situation might arise. The PLayerTransaction constructor (inadvertently) avoided the crash because of the dummy LayerTransactionParent that it creates on the parent side but the PAPZCTreeManager constructor just crashes. Instead of creating a dummy APZCTreeManagerParent as well I figured we could resurrect that "success" flag and use it as a guard to solve this problem.
Comment 19•8 years ago
|
||
Pushed by kgupta@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/d500ad0713bb
Don't initialize APZ protocols unless PLayerTransaction was successfully set up. r=dvander
Comment 20•8 years ago
|
||
bugherder |
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla54
Assignee | ||
Comment 21•8 years ago
|
||
Adding crash signatures. The crash volume on Nightly that matches this bug seems low enough that it won't be useful in letting us know if this is really fixed. I'm going to wait a few days and see if there are still matching intermittent failures - if not, then I'll request aurora uplift. I'm not sure it's worth uplifting to beta given it's relatively low frequency. I'll leave 52 as "affected" for now and decide later.
Crash Signature: [@ RefPtr<mozilla::layers::APZCTreeManager>::RefPtr| mozilla::layers::CrossProcessCompositorBridgeParent::AllocPAPZCTreeManagerParent]
[@ RefPtr<T>::RefPtr<T> | mozilla::layers::CrossProcessCompositorBridgeParent::AllocPAPZCTreeManagerParent]
[@ mozilla…
Assignee | ||
Comment 23•8 years ago
|
||
Comment on attachment 8831570 [details] [diff] [review]
Don't call InitAPZState if PLayerTransaction creation failed
Approval Request Comment
[Feature/Bug causing the regression]: gpu process separation, most likely
[User impact if declined]: intermittent failures/crashes when rapidly opening and closing windows
[Is this code covered by automated tests?]: yes
[Has the fix been verified in Nightly?]: we don't have clear STR so it's hard to say it's verified. the intermittent failures/crashes are also very low volume, but they haven't occurred since this landed.
[Needs manual test from QE? If yes, steps to reproduce]: no, since we don't have STR
[List of other uplifts needed for the feature/fix]: bug 900745
[Is the change risky?]: shouldn't be
[Why is the change risky/not risky?]: it's a pretty small change, only affects the "degenerate" code path where the window is closed so quickly that we're still in the process of initializing it. that's what causes the crash.
[String changes made/needed]: none
Attachment #8831570 -
Flags: approval-mozilla-beta?
Attachment #8831570 -
Flags: approval-mozilla-aurora?
Comment 24•8 years ago
|
||
Comment on attachment 8831570 [details] [diff] [review]
Don't call InitAPZState if PLayerTransaction creation failed
Fix a crash. Aurora53+.
Attachment #8831570 -
Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Comment 25•8 years ago
|
||
bugherder uplift |
Comment 26•8 years ago
|
||
Comment on attachment 8831570 [details] [diff] [review]
Don't call InitAPZState if PLayerTransaction creation failed
fix apz init crash, for beta52
Attachment #8831570 -
Flags: approval-mozilla-beta? → approval-mozilla-beta+
Assignee | ||
Comment 27•8 years ago
|
||
So it looks like the code on beta for this is quite different. In particular, the InitRenderingState function has some early-returns and cleanup paths that aren't there in 53+ and the code movement hunk in the patch might have different effects on 52. I'll see if I can write an equivalent patch for 52 that's less risky.
Assignee | ||
Comment 28•8 years ago
|
||
Oh, actually - bug 900745 by itself should fix this on 52, because the TabChild.cpp code has an early return codepath that would avoid calling the AllocPAPZCTreeManager function. So we can just uplift bug 900745 to beta and don't need to bother with this patch.
Assignee | ||
Comment 29•8 years ago
|
||
I uplifted bug 900745 to beta, so marking this fixed on 52. For posterity, I'll just note that the patch on bug 900745 should cause the child process to go down the branch at [1] which should avoid the crash even without the patch on this bug.
[1] https://hg.mozilla.org/releases/mozilla-beta/file/dcdc85ecb2f3/dom/ipc/TabChild.cpp#l2628
Assignee | ||
Comment 30•8 years ago
|
||
Seems like this may not have fixed the issue, since bug 1338422 was just filed which looks similar and is from a build that contains the fix :( I'll use that bug to track.
Assignee | ||
Updated•8 years ago
|
You need to log in
before you can comment on or make changes to this bug.
Description
•