Closed
Bug 1232042
Opened 9 years ago
Closed 9 years ago
repainting problems with combinations of maximizing, attaching/detaching (docking) Microsoft Surface Book, and restarting Firefox
Categories
(Core :: Graphics, defect)
Tracking
()
People
(Reporter: dbaron, Assigned: bas.schouten)
References
(Blocks 1 open bug)
Details
(Whiteboard: [gfx-noted])
Attachments
(4 files, 2 obsolete files)
This is a bug that I've observed a few times on a new Microsoft Surface Book running Windows 10.
However, I haven't figured out steps to reproduce the bug. However, I've determined that the minimum steps that are needed after rebooting the machine are some combination of:
1. maximizing and unmaximizing Firefox
2. starting and exiting Firefox (closing the window with the [X]), possibly while maximized
3. detaching and re-attaching the tablet half of the machine from the laptop
After doing some of these things, we get into a situation where the first repaint after the size change resulting from maximizing / unmaximizing doesn't actually repaint, but then later repaints work.
(That said, I've also seen some graphics card failures (reset by Windows or whatever it is) and one kernel crash during attempts to reproduce this bug.
I showed it to :Bas in action, and he's reasonably convinced that there's something failing in Graphics code.
The machine has two graphics cards:
1. Intel(R) HD Graphics 520 used when the tablet is detached (or sometimes, for some reasons, while it's attached too)
2. an NVIDIA card (can't get the details right now since it's not currently being used for some reason), which is used much of the time when the tablet is attached to the keyboard half of the machine
Reporter | ||
Comment 1•9 years ago
|
||
Reporter | ||
Comment 2•9 years ago
|
||
(not sure if rotation was involved here)
Reporter | ||
Comment 3•9 years ago
|
||
This one definitely didn't involve rotation of the tablet (since the most recent reboot), and also didn't involve going to about:support.
It seems we're running into this with SurfaceBook often enough that we need to figure out what it is. Bas, do you have access to one of those?
David, any error messages in graphics section of about:support when this happens?
Assignee: nobody → bas
Flags: needinfo?(bas)
Updated•9 years ago
|
Whiteboard: [gfx-noted]
Andrew, this about matches what you see?
Flags: needinfo?(overholt)
Comment 6•9 years ago
|
||
Yes, but i haven't pinned down an str yet. I see graphics crashes reported by the OS each time it happens and they often (always?) note the Intel driver.
Flags: needinfo?(overholt)
Reporter | ||
Comment 7•9 years ago
|
||
I've seen the graphics crash popup from the OS once or twice, but far from every time.
Assignee | ||
Comment 8•9 years ago
|
||
(In reply to Milan Sreckovic [:milan] from comment #4)
> It seems we're running into this with SurfaceBook often enough that we need
> to figure out what it is. Bas, do you have access to one of those?
> David, any error messages in graphics section of about:support when this
> happens?
I do not have access to one of these. But I have heard the stories, and yes, we need to fix this.
I've also seen a lot of instability in the intel driver recently fwiw, but I'm not convinced there's a strong relation between the two. I get the intel driver instability with Edge as well.
Flags: needinfo?(bas)
Reporter | ||
Comment 9•9 years ago
|
||
(In reply to Milan Sreckovic [:milan] from comment #4)
> David, any error messages in graphics section of about:support when this
> happens?
Not sure what I should be looking for there (or where), but I don't think so.
I just saw this without any docking/undocking. I'd been using flightradar24 in fullscreen, and tried to exit fullscreen, but hit Ctrl+PgDn first (which switched tabs to google calendar), and then F11 to un-fullscreen (the difference between the two is just Ctrl vs. Fn), and then the UI above the tab strip failed to paint on that one repaint (but then repainted later as I took further actions).
[Tracking Requested - why for this release]: We now have a standalone application that reproduces this problem, and we've sent the bug report to Microsoft. In the meantime, Bas is testing a workaround. We may want to consider it for full uplifts, especially since last weeks Windows update seems to have made this somewhat worse (although the reports are sketchy.)
tracking-firefox44:
--- → ?
tracking-firefox45:
--- → ?
tracking-firefox46:
--- → ?
tracking-firefox47:
--- → ?
Assignee | ||
Comment 11•9 years ago
|
||
Review commit: https://reviewboard.mozilla.org/r/32589/diff/#index_header
See other reviews: https://reviewboard.mozilla.org/r/32589/
Attachment #8712616 -
Flags: review?(jmuizelaar)
Updated•9 years ago
|
Attachment #8712616 -
Flags: review?(jmuizelaar)
Comment 12•9 years ago
|
||
Comment on attachment 8712616 [details]
MozReview Request: Bug 1232042: Workaround Windows presentation bug by executing a present call on the main thread during a WM_PAINT event. r=jrmuizel
https://reviewboard.mozilla.org/r/32589/#review29451
::: gfx/ipc/GfxMessageUtils.h:791
(Diff revision 1)
> + WriteParam(aMsg, (void*)aParam.mSwapChain);
Can you specialize ParamTraits for mSwapChain instead of casting to void*?
It also seems like this could cause a use after free because a reference count is not being sent over IPC.
Can we explicitly hold a reference in both places instead of relying on some implicit ownership scheme?
status-firefox44:
--- → affected
status-firefox45:
--- → affected
status-firefox46:
--- → affected
status-firefox47:
--- → affected
Tracking for 45+, since this sounds like a bad issue for Windows users and we may want to uplift a fix.
Assignee | ||
Comment 14•9 years ago
|
||
Comment on attachment 8712616 [details]
MozReview Request: Bug 1232042: Workaround Windows presentation bug by executing a present call on the main thread during a WM_PAINT event. r=jrmuizel
Review request updated; see interdiff: https://reviewboard.mozilla.org/r/32589/diff/1-2/
Attachment #8712616 -
Flags: review?(jmuizelaar)
Comment 15•9 years ago
|
||
Comment on attachment 8712616 [details]
MozReview Request: Bug 1232042: Workaround Windows presentation bug by executing a present call on the main thread during a WM_PAINT event. r=jrmuizel
https://reviewboard.mozilla.org/r/32589/#review29781
::: gfx/ipc/GfxMessageUtils.h:790
(Diff revisions 1 - 2)
> + MOZ_ASSERT(XRE_IsParentProcess());
I believe these should take a reference to aParam and then move it into aResult.
::: gfx/ipc/GfxMessageUtils.h:796
(Diff revisions 1 - 2)
> + uint32_t supportedBlendModes = 0;
I don't think supportedBlendModes is supposed to be here and result should probably default to false.
::: gfx/layers/Compositor.cpp:30
(Diff revision 2)
> +TextureFactoryIdentifier::operator=(const TextureFactoryIdentifier& aOther)
This should be:
TextureFactoryIdentifier::operator=(const TextureFactoryIdentifier& aOther) = default;
same for the copy constructor.
Attachment #8712616 -
Flags: review?(jmuizelaar)
Assignee | ||
Comment 16•9 years ago
|
||
(In reply to Jeff Muizelaar [:jrmuizel] from comment #15)
> Comment on attachment 8712616 [details]
> MozReview Request: Bug 1232042: Workaround Windows presentation bug by
> executing a present call on the main thread during a WM_PAINT event.
> r=jrmuizel
>
> https://reviewboard.mozilla.org/r/32589/#review29781
>
> ::: gfx/ipc/GfxMessageUtils.h:790
> (Diff revisions 1 - 2)
> > + MOZ_ASSERT(XRE_IsParentProcess());
>
> I believe these should take a reference to aParam and then move it into
> aResult.
They can't, this is also sent to the child process, which then can't release the reference. There's no way to check from here whether we're sending to another thread or another process as far as I could tell. I could add a check to only send this if we're in a sync call, which is something we -can- check from here.
> ::: gfx/ipc/GfxMessageUtils.h:796
> (Diff revisions 1 - 2)
> > + uint32_t supportedBlendModes = 0;
>
> I don't think supportedBlendModes is supposed to be here and result should
> probably default to false.
Copy paste fail. No, it shouldn't, if we're not in the main process, i.e. a content process that's perfectly fine, we just don't want the content process to have a dangling pointer here. (A content process main thread will never have an nsWindow so it will never need this pointer)
> ::: gfx/layers/Compositor.cpp:30
> (Diff revision 2)
> > +TextureFactoryIdentifier::operator=(const TextureFactoryIdentifier& aOther)
>
> This should be:
> TextureFactoryIdentifier::operator=(const TextureFactoryIdentifier& aOther)
> = default;
>
> same for the copy constructor.
Hrm, doesn't seem to make any difference to the compiled code in VS2015. But sure :).
Flags: needinfo?(jmuizelaar)
Assignee | ||
Comment 17•9 years ago
|
||
Comment on attachment 8712616 [details]
MozReview Request: Bug 1232042: Workaround Windows presentation bug by executing a present call on the main thread during a WM_PAINT event. r=jrmuizel
Review request updated; see interdiff: https://reviewboard.mozilla.org/r/32589/diff/2-3/
Attachment #8712616 -
Flags: review?(jmuizelaar)
Updated•9 years ago
|
Attachment #8712616 -
Flags: review?(jmuizelaar) → review+
Comment 18•9 years ago
|
||
Comment on attachment 8712616 [details]
MozReview Request: Bug 1232042: Workaround Windows presentation bug by executing a present call on the main thread during a WM_PAINT event. r=jrmuizel
https://reviewboard.mozilla.org/r/32589/#review29987
:(
::: gfx/layers/Compositor.cpp:42
(Diff revision 2)
> +}
::: gfx/layers/CompositorTypes.h:175
(Diff revision 3)
> +#endif
Add a comment here about this member being needed for passing the swap chain to the main thread so that we can work around bugXXXXX
Comment 19•9 years ago
|
||
Comment 20•9 years ago
|
||
Curiouser and curiouser. Backed out in https://hg.mozilla.org/integration/mozilla-inbound/rev/31373f8aaa98 for giving Win8 around a 10% chance of crashing during the addon manager browser-chrome tests:
https://treeherder.mozilla.org/logviewer.html#?job_id=21209334&repo=mozilla-inbound
https://treeherder.mozilla.org/logviewer.html#?job_id=21199549&repo=mozilla-inbound
https://treeherder.mozilla.org/logviewer.html#?job_id=21204017&repo=mozilla-inbound
https://treeherder.mozilla.org/logviewer.html#?job_id=21204528&repo=mozilla-inbound
Comment 21•9 years ago
|
||
And one assert about "D3D11 swap chain preset failed 0x887a0005" in https://treeherder.mozilla.org/logviewer.html#?job_id=21212984&repo=mozilla-inbound
Updated•9 years ago
|
Flags: needinfo?(jmuizelaar)
Assignee | ||
Comment 23•9 years ago
|
||
Comment on attachment 8712616 [details]
MozReview Request: Bug 1232042: Workaround Windows presentation bug by executing a present call on the main thread during a WM_PAINT event. r=jrmuizel
Review request updated; see interdiff: https://reviewboard.mozilla.org/r/32589/diff/3-4/
Assignee | ||
Comment 24•9 years ago
|
||
Let's try this. This is the only idea I've got.
Comment 25•9 years ago
|
||
Valid but not critical enough to warrant (inclusion in) a dot release. Wontfix for Fx44.
Comment 27•9 years ago
|
||
bugherder |
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla47
Bas, seems like this is something that we might just let ride the train with 47. Wontfixing for 45 and 46.
I thought I'd check with you and dbaron though; please let me know if it's something you would like to try uplifting.
Flags: needinfo?(dbaron)
Flags: needinfo?(bas)
Assignee | ||
Comment 29•9 years ago
|
||
(In reply to Liz Henry (:lizzard) (needinfo? me) from comment #28)
> Bas, seems like this is something that we might just let ride the train with
> 47. Wontfixing for 45 and 46.
>
> I thought I'd check with you and dbaron though; please let me know if it's
> something you would like to try uplifting.
Tricky. It seems this might fix some other issues people have been having, it might be worth uplifting it to Aurora at least.
Flags: needinfo?(bas)
Comment 30•9 years ago
|
||
Missed seeing it the first time, but both times this has apparently caused... something, like a driver bluescreen maybe, which causes the Win7 slaves to disconnect (that being the only visible symptom) and stop working until it is rebooted - the blue things in https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&group_state=expanded&filter-searchStr=ccd5c97a295aafb6dd47e1855ac4b66d587c26ca&tochange=70b70f6fc6a7&fromchange=78fe845914c0 are all slaves that I then had to reboot to get working again.
This apparently is causing Windows 7 machines that run it to break and become unusable.
Backed out in https://hg.mozilla.org/mozilla-central/rev/0629918a09ae
Status: RESOLVED → REOPENED
Flags: needinfo?(bas)
Resolution: FIXED → ---
Reporter | ||
Comment 32•9 years ago
|
||
I don't think I have anything to add here (or any relevant expertise).
Flags: needinfo?(dbaron)
Assignee | ||
Comment 33•9 years ago
|
||
(In reply to Phil Ringnalda (:philor) from comment #30)
> Missed seeing it the first time, but both times this has apparently
> caused... something, like a driver bluescreen maybe, which causes the Win7
> slaves to disconnect (that being the only visible symptom) and stop working
> until it is rebooted - the blue things in
> https://treeherder.mozilla.org/#/jobs?repo=mozilla-
> inbound&group_state=expanded&filter-
> searchStr=ccd5c97a295aafb6dd47e1855ac4b66d587c26ca&tochange=70b70f6fc6a7&from
> change=78fe845914c0 are all slaves that I then had to reboot to get working
> again.
Do we have any report of this occurring outside of our test machines?
Flags: needinfo?(bas)
Comment 34•9 years ago
|
||
This regressed tresize on windows:
https://treeherder.allizom.org/perf.html#/graphs?series=[mozilla-inbound,4ac681a39a4caefb56468c5bc86fa23b8cee4c4f,1]&zoom=1455204874546.0342,1455252525044.0476,11.619847807106572,12.372231838139383&selected=[mozilla-inbound,4ac681a39a4caefb56468c5bc86fa23b8cee4c4f,19235,18942494]
We didn't get an alert when it happened because there was some noise when it was committed, but we did get notified of the improvement when it was backed out:
https://treeherder.allizom.org/perf.html#/graphs?series=[mozilla-inbound,4ac681a39a4caefb56468c5bc86fa23b8cee4c4f,1]&zoom=1455624591937.6392,1455765760801.7817,11.567843809446881,12.506505519484055&selected=[mozilla-inbound,4ac681a39a4caefb56468c5bc86fa23b8cee4c4f,19468,19244849]
It's about a 2% regression overall. Not huge but it would be good to mitigate this in any updated patch if possible.
Comment 35•9 years ago
|
||
My backout of this patch apparently caused bug 1249512.
Blocks: 1249512
Comment 37•9 years ago
|
||
Assignee | ||
Comment 38•9 years ago
|
||
Attachment #8712616 -
Attachment is obsolete: true
Attachment #8721482 -
Flags: review?(milan)
Comment on attachment 8721482 [details] [diff] [review]
Bug 1232042: Workaround Windows presentation bug by executing a present call on the main thread during a WM_PAINT event.
Review of attachment 8721482 [details] [diff] [review]:
-----------------------------------------------------------------
::: gfx/layers/d3d11/CompositorD3D11.cpp
@@ +1145,5 @@
> + if (::GetTickCount() > waitStart + 10000) {
> + gfxDevCrash(LogReason::PresentNeverOccured);
> + break;
> + }
> + }
Change this to avoid calling ::GetTickCount() in the first place if mSyncState is ::Free?
::: gfx/thebes/gfxWindowsPlatform.cpp
@@ +2409,5 @@
>
> + RefPtr<ID3D10Multithread> multi;
> + HRESULT hr = mD3D11Device->QueryInterface(__uuidof(ID3D10Multithread), getter_AddRefs(multi));
> + if (SUCCEEDED(hr) && multi) {
> + multi->SetMultithreadProtected(TRUE);
I'm assuming this is on purpose and not part of a debugging session - can we put it behind a pref, just in case?
::: widget/windows/nsWindowGfx.cpp
@@ +518,5 @@
> case LayersBackend::LAYERS_CLIENT:
> + {
> + ClientLayerManager* clientLM =
> + static_cast<ClientLayerManager*>(GetLayerManager());
> + clientLM->PrepareForPresentedComposite();
ClientLayerManager is guaranteed to exist?
Attachment #8721482 -
Flags: review?(milan)
Attachment #8721482 -
Flags: review?(jmuizelaar)
Attachment #8721482 -
Flags: feedback+
Assignee | ||
Comment 40•9 years ago
|
||
Attachment #8721482 -
Attachment is obsolete: true
Attachment #8721482 -
Flags: review?(jmuizelaar)
Attachment #8722620 -
Flags: review?(jmuizelaar)
Comment 41•9 years ago
|
||
Comment on attachment 8722620 [details] [diff] [review]
Execute an additional present on the compositor thread when a WM_PAINT event returns
Review of attachment 8722620 [details] [diff] [review]:
-----------------------------------------------------------------
Please add some more comments about why we need this. CompositorD3D11.h and nsWindowGfx are probably the best places to put this.
Attachment #8722620 -
Flags: review?(jmuizelaar) → review+
Comment 42•9 years ago
|
||
Comment 43•9 years ago
|
||
Comment 44•9 years ago
|
||
Comment 45•9 years ago
|
||
Comment 46•9 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/246e0ff966fc
https://hg.mozilla.org/mozilla-central/rev/68295c584858
Status: REOPENED → RESOLVED
Closed: 9 years ago → 9 years ago
Resolution: --- → FIXED
You need to log in
before you can comment on or make changes to this bug.
Description
•