Closed
Bug 1187466
Opened 9 years ago
Closed 9 years ago
[Windows] crash in mozilla::layers::BasicCompositor::DrawQuad(mozilla::gfx::RectTyped<T> const&, mozilla::gfx::RectTyped<T> const&, mozilla::layers::EffectChain const&, float, mozilla::gfx::Matrix4x4 const&)
Categories
(Core :: Graphics: Layers, defect)
Tracking
()
People
(Reporter: u279076, Assigned: jerry)
References
Details
(Keywords: crash, topcrash, topcrash-win, Whiteboard: [gfx-noted])
Crash Data
Attachments
(1 file, 2 obsolete files)
(deleted),
patch
|
nical
:
review+
dvander
:
review+
Sylvestre
:
approval-mozilla-aurora+
|
Details | Diff | Splinter Review |
This bug was filed from the Socorro interface and is
report bp-7b27d517-8e78-482b-a580-7b8da2150724.
=============================================================
0 xul.dll mozilla::layers::BasicCompositor::DrawQuad(mozilla::gfx::RectTyped<mozilla::gfx::UnknownUnits> const&, mozilla::gfx::RectTyped<mozilla::gfx::UnknownUnits> const&, mozilla::layers::EffectChain const&, float, mozilla::gfx::Matrix4x4 const&) gfx/layers/basic/BasicCompositor.cpp
1 xul.dll mozilla::layers::ImageHost::Composite(mozilla::layers::EffectChain&, float, mozilla::gfx::Matrix4x4 const&, mozilla::gfx::Filter const&, mozilla::gfx::RectTyped<mozilla::gfx::UnknownUnits> const&, nsIntRegion const*) gfx/layers/composite/ImageHost.cpp
2 xul.dll mozilla::layers::ImageLayerComposite::RenderLayer(nsIntRect const&) gfx/layers/composite/ImageLayerComposite.cpp
3 xul.dll mozilla::layers::RenderLayers<mozilla::layers::ContainerLayerComposite>(mozilla::layers::ContainerLayerComposite*, mozilla::layers::LayerManagerComposite*, mozilla::gfx::IntRectTyped<mozilla::RenderTargetPixel> const&) gfx/layers/composite/ContainerLayerComposite.cpp
4 xul.dll mozilla::layers::ContainerRender<mozilla::layers::ContainerLayerComposite>(mozilla::layers::ContainerLayerComposite*, mozilla::layers::LayerManagerComposite*, nsIntRect const&) gfx/layers/composite/ContainerLayerComposite.cpp
5 xul.dll mozilla::layers::ContainerLayerComposite::RenderLayer(nsIntRect const&) gfx/layers/composite/ContainerLayerComposite.cpp
6 xul.dll mozilla::layers::RenderLayers<mozilla::layers::ContainerLayerComposite>(mozilla::layers::ContainerLayerComposite*, mozilla::layers::LayerManagerComposite*, mozilla::gfx::IntRectTyped<mozilla::RenderTargetPixel> const&) gfx/layers/composite/ContainerLayerComposite.cpp
7 xul.dll mozilla::layers::ContainerRender<mozilla::layers::ContainerLayerComposite>(mozilla::layers::ContainerLayerComposite*, mozilla::layers::LayerManagerComposite*, nsIntRect const&) gfx/layers/composite
=============================================================
More reports: https://crash-stats.mozilla.com/report/list?product=Firefox&signature=mozilla%3A%3Alayers%3A%3ABasicCompositor%3A%3ADrawQuad%28mozilla%3A%3Agfx%3A%3ARectTyped%3CT%3E+const%26%2C+mozilla%3A%3Agfx%3A%3ARectTyped%3CT%3E+const%26%2C+mozilla%3A%3Alayers%3A%3AEffectChain+const%26%2C+float%2C+mozilla%3A%3Agfx%3A%3AMatrix4x4+const%26%29
Note, I've filed this as a Windows bug but there are reports of this on Mac OS X. I suspect the Mac OS X crashes deserve a different bug for the following reasons:
1. The crash reasons are different.
* Windows shows EXCEPTION_ACCESS_VIOLATION_READ
* Mac shows EXC_BAD_ACCESS/KERN_INVALID_ADDRESS
2. The crashes start with different Firefox versions
* Windows crashes started with Firefox 30
* Mac crashes started with Firefox 39
3. Volume is higher on Mac OS X than it is on Windows, especially when you consider user population.
* Windows has 43% of the crashes with 87% of the users
* Mac OS X has 57% of the crashes with 7% of the users
4. The URLs reported are quite different on Windows compared to Mac
* Top URL on Windows is Facebook.com with 83% whereas only 2% of the crashes on Mac OS X are with Facebook.com
* Top URL on Mac is Google (Maps and Chrome) with 70% whereas these URLs don't show up at all for Windows users
5. The stacks are slightly different.
* Frame 2 on Windows is in mozilla::layers::ImageLayerComposite::RenderLayer(nsIntRect const&)
* Frame 2 on Mac is in mozilla::layers::CanvasLayerComposite::RenderLayer(nsIntRect const&)
The Mac OS X case is covered by bug 1187464. Feel free to merge these bugs if you think they're the same.
Note that this is the #125 topcrash for Windows-specific crashes. It's #48 overall which I suspect is driven by the Mac-specific case in bug 1187464 (ranked #6 for Mac crashes).
Summary: crash in mozilla::layers::BasicCompositor::DrawQuad(mozilla::gfx::RectTyped<T> const&, mozilla::gfx::RectTyped<T> const&, mozilla::layers::EffectChain const&, float, mozilla::gfx::Matrix4x4 const&) → [Windows] crash in mozilla::layers::BasicCompositor::DrawQuad(mozilla::gfx::RectTyped<T> const&, mozilla::gfx::RectTyped<T> const&, mozilla::layers::EffectChain const&, float, mozilla::gfx::Matrix4x4 const&)
Updated•9 years ago
|
Crash Signature: [@ mozilla::layers::BasicCompositor::DrawQuad(mozilla::gfx::RectTyped<T> const&, mozilla::gfx::RectTyped<T> const&, mozilla::layers::EffectChain const&, float, mozilla::gfx::Matrix4x4 const&)] → [@ mozilla::layers::BasicCompositor::DrawQuad(mozilla::gfx::RectTyped<T> const&, mozilla::gfx::RectTyped<T> const&, mozilla::layers::EffectChain const&, float, mozilla::gfx::Matrix4x4 const&)]
[@ mozilla::layers::BasicCompositor::DrawQuad]
Updated•9 years ago
|
Updated•9 years ago
|
Flags: needinfo?(sotaro.ikeda.g)
Whiteboard: [gfx-noted]
Peter, this is just outside the Top 20 on Beta, but almost in top 10 in Aurora; anybody you have that could take a look?
Flags: needinfo?(howareyou322)
Updated•9 years ago
|
Comment 3•9 years ago
|
||
Jerry, please help to check it.
Assignee: nobody → hshih
Flags: needinfo?(howareyou322)
Assignee | ||
Comment 4•9 years ago
|
||
I will check the crash report first.
Now in top 10 for Aurora. Note that we wouldn't see this as a crash in beta and release, because we turn it into telemetry instead. Example crash: https://crash-stats.mozilla.com/report/index/b8be9da7-86dc-4011-8f32-9d33d2160325
Assignee | ||
Comment 6•9 years ago
|
||
From comment 5, it shows an interesting log:
|[172][GFX1 21]: Bad for basic with DataTextureSourceD3D11 and 00000000 (t=1.31793e+006)
The log comes from this:
https://hg.mozilla.org/releases/mozilla-aurora/annotate/bf6d2a4b1031/gfx/layers/basic/BasicCompositor.cpp#l514
The texture type is "DataTextureSourceD3D11", but we don't implement the AsSourceBasic() in the class DataTextureSourceD3D11. So we hit the checking.
I would like to check why we pass a DataTextureSourceD3D11 to the BasicCompositor.
Assignee | ||
Comment 7•9 years ago
|
||
I'm trying to force resetting the device while video playing and check whether we hit this assert.
Status: NEW → ASSIGNED
Assignee | ||
Comment 8•9 years ago
|
||
Assignee | ||
Comment 9•9 years ago
|
||
It's my assumption. If the device changes from d3d11 to software while playing a video, we might still receive a DataTextureSourceD3D11 and pass it to BasicCompositor.
Gecko will invalidate all frames when driver reset, but maybe the ImageBridge still the old configuration.
Is it possible?
Flags: needinfo?(nical.bugzilla)
Assignee | ||
Updated•9 years ago
|
Flags: needinfo?(sotaro.ikeda.g)
Assignee | ||
Updated•9 years ago
|
Flags: needinfo?(sotaro.ikeda.g)
Comment 10•9 years ago
|
||
(In reply to Jerry Shih[:jerry] (UTC+8) from comment #9)
> Is it possible?
Yea, it is possible, I think. Compositor reset seems not directly affect to ImageBridge's TextureClient.
And media framework's implementation seems isolated from all frames invalidation. Compositor backend type is cached in MediaFormatReader::InitLayersBackendType().
https://dxr.mozilla.org/mozilla-central/source/dom/media/MediaFormatReader.cpp#145
Flags: needinfo?(sotaro.ikeda.g)
Comment 12•9 years ago
|
||
Bug 1254874 might be used to address the problem. But I am not sure about it.
Assignee | ||
Comment 13•9 years ago
|
||
I will turn to check compatibility between the texture type and the current compositor. And just have a early return before drawQuad.
Assignee | ||
Comment 14•9 years ago
|
||
Try to check the TextureSource type before calling the DrawQuad().
In the crash report: https://crash-stats.mozilla.com/report/index/b8be9da7-86dc-4011-8f32-9d33d2160325
We can see a gfx message like:
|[172][GFX1 21]: Bad for basic with DataTextureSourceD3D11 and 00000000 (t=1.31793e+006)
We hit the assert when the TextureSource->AsSourceBasic() return nullptr.
https://hg.mozilla.org/releases/mozilla-aurora/annotate/bf6d2a4b1031/gfx/layers/basic/BasicCompositor.cpp#l514
I have an assumption that the backend might change during the video playback. So the TextureSource type might not be compatible with current compositor.
So I add the type checking before the draw call.
Assignee | ||
Updated•9 years ago
|
Attachment #8741673 -
Attachment is obsolete: true
Assignee | ||
Comment 15•9 years ago
|
||
Comment on attachment 8742305 [details] [diff] [review]
check TextureSource before draw quad. v1
Please check comment 14
Attachment #8742305 -
Flags: review?(mstange)
Comment 16•9 years ago
|
||
It seems like we should prevent this from happening at a point that's closer to the source. Unfortunately I don't know this code well enough to say where that might be.
Matt, can you help?
Flags: needinfo?(matt.woodrow)
Updated•9 years ago
|
Attachment #8742305 -
Flags: review?(mstange)
Comment 17•9 years ago
|
||
CC'ing dvander, since recovering from a Compositor change is something he's worked on a lot.
It seems like we should be calling SetCompositor() on all TextureHosts when we switch Compositors, and that should destroy any existing TextureSources that don't match.
We also need to make sure that media is being notified on this compositor change so that we can start sending the right texture types for video playback to resume.
Flags: needinfo?(matt.woodrow) → needinfo?(dvander)
(In reply to Matt Woodrow (:mattwoodrow) from comment #17)
> CC'ing dvander, since recovering from a Compositor change is something he's
> worked on a lot.
>
> It seems like we should be calling SetCompositor() on all TextureHosts when
> we switch Compositors, and that should destroy any existing TextureSources
> that don't match.
>
> We also need to make sure that media is being notified on this compositor
> change so that we can start sending the right texture types for video
> playback to resume.
Aurora is missing some of the follow-up patches for device resets, namely bug 1256517 and bug 1255711 which I will request uplift on ASAP. That could reduce a lot of these crashes.
That said,
(1) I didn't address the media problem you mentioned. That's still a big TODO, potentially a little trickier than normal content since there is one ImageBridge for multiple compositors.
(2) OMTC currently takes a lazy approach to ensuring Compositables are bound to the right compositor. It would be much cleaner to atomically switch everything attached to all layers, and SetCompositor on new stuff as soon as it comes in. The new could eliminate tons of checks. Layer transactions could also forward along the last known compositor ID, which would make it a lot easier to reject old transactions. Currently we rely on IPDL ordering which might make solving (1) harder.
Maybe nical knows if there's a reason it's lazy right now, making it up-front seems like it would simplify things quite a bit.
Flags: needinfo?(dvander) → needinfo?(nical.bugzilla)
Comment 19•9 years ago
|
||
(In reply to David Anderson [:dvander] from comment #18)
>
> Maybe nical knows if there's a reason it's lazy right now, making it
> up-front seems like it would simplify things quite a bit.
I think that it organically grew into beeing lazy. We'd run into cases where a texture didn't have a compositor in various places in the code and we'd add SetCompositor calls there.
It wasn't a design choice as far as I know, and I have no objections if you want to make it more up-front.
Flags: needinfo?(nical.bugzilla)
Assignee | ||
Comment 20•9 years ago
|
||
Comment on attachment 8742305 [details] [diff] [review]
check TextureSource before draw quad. v1
There is no notification from parent to child side decoder for compositor change. So I just do the simply way that just skip the frame which is not usable by current compositor.
Maybe we could do the notification task later.
Attachment #8742305 -
Flags: feedback?(nical.bugzilla)
Comment 21•9 years ago
|
||
Comment on attachment 8742305 [details] [diff] [review]
check TextureSource before draw quad. v1
Review of attachment 8742305 [details] [diff] [review]:
-----------------------------------------------------------------
This should not be needed because calling SetCompositor with an incompatible compositor/texture pair should put the TextureHost in a state where it fails to Lock() and we never end up calling DrawQuad with the incompatible TextureSource. I would prefer that we make sure that this holds true for all texture types, rather than add another layer of checks in in the compositor.
Attachment #8742305 -
Flags: feedback?(nical.bugzilla) → feedback-
If we can do that (prevent the bad thing from happening), I would want to see gfxDevCrash() on the bad compositor/texture source mixtures. So, prevent it from happening, but report if it does happen, and make it fatal on nightly and aurora by using gfxDevCrash.
Updated•9 years ago
|
Keywords: topcrash,
topcrash-win
Reporter | ||
Comment 23•9 years ago
|
||
This is currently #27 in Release [2629 crashes], #15 in Aurora [67 crashes], #22 in Nightly [22 crashes] (it doesn't show up in the topcrash report for Beta).
status-firefox46:
--- → affected
status-firefox47:
--- → affected
status-firefox48:
--- → affected
status-firefox49:
--- → affected
status-firefox-esr45:
--- → affected
OS: Windows NT → Windows
Assignee | ||
Comment 24•9 years ago
|
||
Attachment #8757642 -
Flags: review?(nical.bugzilla)
Attachment #8757642 -
Flags: review?(dvander)
Assignee | ||
Updated•9 years ago
|
Attachment #8742305 -
Attachment is obsolete: true
Assignee | ||
Comment 25•9 years ago
|
||
Comment on attachment 8757642 [details] [diff] [review]
check compositor status in TextureHost::lock(). v1
Review of attachment 8757642 [details] [diff] [review]:
-----------------------------------------------------------------
With bug 1256517, we have some checking for compositor type in CompositableParentManager::ReceiveCompositableUpdate().
If the driver reset is occurred during scheduling, we will use the incompatible compositor for painting.
There are a lot of checking in SetCompositor()[1][2]. If we set an incompatible compositor(e.g. a BasicCompositor) to a textureHost(e.g. a DXGITextureHostD3D9), we will clear all textureSource in SetCompositor().
But in TextureHost::lock(), gecko still create the textureSource(e.g. a DataTextureSourceD3D9) back[3]. Then this textureSource is still incompatible with current compositor.
So, my solution is that we check the compositor in lock(). If the compositor is nullptr, make the lock() fail.
[1]
https://hg.mozilla.org/mozilla-central/annotate/4d63dde701b47b8661ab7990f197b6b60e543839/gfx/layers/d3d9/TextureD3D9.cpp#l950
[2]
https://hg.mozilla.org/mozilla-central/annotate/4d63dde701b47b8661ab7990f197b6b60e543839/gfx/layers/d3d11/TextureD3D11.cpp#l673
[3]
https://hg.mozilla.org/mozilla-central/annotate/4d63dde701b47b8661ab7990f197b6b60e543839/gfx/layers/d3d11/TextureD3D11.cpp#l700
ReceiveCompositableUpdate
|
-------------------> |
|
Schedule composition |
+-----------+
| |
| |
| | The driver reset is
| | happened during scheduling
| | <----------
| |
| |
+---------> |We still use incompatible
|compositor during painting
|
|
v
Assignee | ||
Updated•9 years ago
|
Assignee | ||
Comment 26•9 years ago
|
||
And I think the attachment 8757642 [details] [diff] [review] could also solve the problem in bug 1275179.
Comment 27•9 years ago
|
||
(In reply to Jerry Shih[:jerry] (UTC+8) from comment #25)
> Comment on attachment 8757642 [details] [diff] [review]
...
> There are a lot of checking in SetCompositor()[1][2]. If we set an
> incompatible compositor(e.g. a BasicCompositor) to a textureHost(e.g. a
> DXGITextureHostD3D9), we will clear all textureSource in SetCompositor().
> But in TextureHost::lock(), gecko still create the textureSource(e.g. a
> DataTextureSourceD3D9) back[3]. Then this textureSource is still
> incompatible with current compositor.
...
Good point, Jerry.
Updated•9 years ago
|
Attachment #8757642 -
Flags: review?(nical.bugzilla) → review+
Assignee | ||
Comment 28•9 years ago
|
||
This is a top 20 on Windows, I'd like to see it uplifted to 48 Aurora (probably too late for 47 beta at this point, but it would be a good ride along.) Please nominate once it lands on nightly. Thanks!
Version: Trunk → 30 Branch
Attachment #8757642 -
Flags: review?(dvander) → review+
Assignee | ||
Comment 30•9 years ago
|
||
Please land the attachment 8757642 [details] [diff] [review] to m-c.
try result:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=83da2a15425d
Keywords: checkin-needed
Comment 32•9 years ago
|
||
Keywords: checkin-needed
Comment 33•9 years ago
|
||
bugherder |
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla49
Assignee | ||
Comment 34•9 years ago
|
||
Comment on attachment 8757642 [details] [diff] [review]
check compositor status in TextureHost::lock(). v1
Approval Request Comment
[Feature/regressing bug #]:
Bug 1187466, bug 1275179.
[User impact if declined]:
This is a top 20 crash on Windows. It has chance to crash when we have driver reset during video playback.
[Describe test coverage new/current, TreeHerder]:
land at mozilla49.
https://treeherder.mozilla.org/#/jobs?repo=try&revision=83da2a15425d
[Risks and why]:
low.
Just have early return for incompatible compositor case.
[String/UUID change made/needed]:
none
Attachment #8757642 -
Flags: approval-mozilla-aurora?
Updated•9 years ago
|
Comment 35•9 years ago
|
||
Comment on attachment 8757642 [details] [diff] [review]
check compositor status in TextureHost::lock(). v1
Fix a top crash, taking it
Attachment #8757642 -
Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Comment 36•9 years ago
|
||
bugherder uplift |
(In reply to Milan Sreckovic [:milan] from comment #29)
> This is a top 20 on Windows, I'd like to see it uplifted to 48 Aurora
> (probably too late for 47 beta at this point, but it would be a good ride
> along.) Please nominate once it lands on nightly. Thanks!
Hi Milan, 47 has been on the release channel for a week. I looked at the # of instances of this crash on release channel and there is only ~30 instances. Given that, I would prefer to wontfix this for Fx47. Please let me know if you have any concerns. Thanks!
Flags: needinfo?(milan)
It's good on 48.
Flags: needinfo?(milan)
Comment 39•8 years ago
|
||
Crash volume for signature 'mozilla::layers::BasicCompositor::DrawQuad':
- nightly (version 50): 2 crashes from 2016-06-06.
- aurora (version 49): 19 crashes from 2016-06-07.
- beta (version 48): 54 crashes from 2016-06-06.
- release (version 47): 173 crashes from 2016-05-31.
- esr (version 45): 1202 crashes from 2016-04-07.
Crash volume on the last weeks:
Week N-1 Week N-2 Week N-3 Week N-4 Week N-5 Week N-6 Week N-7
- nightly 0 0 0 0 2 0 0
- aurora 3 3 2 1 4 3 0
- beta 6 8 4 10 9 11 4
- release 31 21 28 29 21 26 9
- esr 172 121 147 144 130 103 83
Affected platforms: Windows, Linux
status-firefox50:
--- → affected
You need to log in
before you can comment on or make changes to this bug.
Description
•