compositor window blocks direct manipulation from working
Categories
(Core :: Graphics, defect, P3)
Tracking
()
Tracking | Status | |
---|---|---|
firefox77 | --- | fixed |
People
(Reporter: tnikkel, Assigned: tnikkel)
References
(Blocks 1 open bug)
Details
Attachments
(3 files)
I found that disabling the compositor window allows my direct manipulation code to work. If I add WS_EX_LAYERED here
then my dmanip code works. We used to have WS_EX_LAYERED there but we removed it in bug 1544074 due to a blank window on startup reported in at least on configuration.
Let's rewind back a bit to why we had WS_EX_LAYERED there in the first place. In https://bugs.chromium.org/p/chromium/issues/detail?id=871257 Chrome had problems with laggy scrolling (late 2018). From comment 55 in that bug https://bugs.chromium.org/p/chromium/issues/detail?id=871257#c55 a contact from Microsoft provided this explanation:
"the speed hit-test is finding a window in one chrome process [A] (class “Intermediate D3D Window”), but the secondary hit-test once inside that process is finding a different one in a different chrome process ... We fixed one bug on the OS side that was causing issues here, but it seems like there’s another. ... So my recommendation is to figure out why this window is getting hit-tested, and stop it."
And from comment 63 https://bugs.chromium.org/p/chromium/issues/detail?id=871257#c63 the same Microsoft contact:
"Unfortunately, that means that neither of these is a Chrome issue."
Comment 77 https://bugs.chromium.org/p/chromium/issues/detail?id=871257#c77 from the Microsoft contact:
"Ross suggest us using 'WS_EX_LAYERED | WS_EX_TRANSPARENT | WS_EX_NOREDIRECTIONBITMAP' to fix this issue.
The first two should make the window transparent for input, and the third avoids allocating a bitmap that would otherwise be allocated with WS_EX_LAYERED (and is only necessary if using Gdi objects with the window)."
Comment 79 https://bugs.chromium.org/p/chromium/issues/detail?id=871257#c79 explains this further (quoting from msdn):
'"Hit Testing
Hit testing of a layered window is based on the shape and transparency of the window. This means that the areas of the window that are color-keyed or whose alpha value is zero will let the mouse messages through.
If the layered window has the WS_EX_TRANSPARENT extended window style, the shape of the layered window will be ignored and the mouse events will be passed to the other windows underneath the layered window."'
So Chrome lands that workaround and then they here back from Microsoft
"So, in conclusion, we have a fix for 19H1 but in the existing RS4 and RS5 OS, we recommend the workaround from Chrome’s end."
No fast forward to Feb 2019 and in bug 1525183 we are seeing laggy scrolling (I think because we started using Direct Composition which was also the reason in Chrome) and so we import the same 3 flags to our compositor window and it seems to fix the bugs we were seeing.
In April 2019 in bug 1544074 a blank window on startup was reported on some hardware and on a hunch Sotaro suspects WS_EX_LAYERED | WS_EX_TRANSPARENT and removing those fixes the problem. The 19H1 release of Windows 10 wasn't until May, so either the flags didn't fix any problems when they were added or Microsofts fix was backported to earlier versions of Windows 10 by then because there are no linked regressions from removing those flags.
(Chrome also does one thing we don't: put the compositor window on the bottom using HWND_BOTTOM. I tried doing that when we SetParent the window and it doesn't fix damnip.)
So my guess is that "speed hit-testing" that Windows does (mentioned above) is used when determining if a dmanip session should be started, and the "secondary hit-test" is skipped either intentionally or by a bug in Windows. So we'll probably need to use this workaround even if we report it to MS and MS agrees that it is a bug (because it will take a while for that and older versions of Win 10 might not get that fix).
Our options:
-
add back WS_EX_LAYERED and test on the hardware from bug 1544074 if that flag alone causes the problem. If it regresses bug 1544074 then blocklist the hardware from bug 1544074 from some feature so we don't hit this (the hardware is from 2011 so seems quite reasonable to do this). This assumes WS_EX_LAYERED is enough and we don't need WS_EX_TRANSPARENT to fully fix this.
-
Same as 1) but add back both WS_EX_LAYERED and WS_EX_TRANSPARENT
-
Some other solution?
Any other thoughts, anything I missed?
Assignee | ||
Updated•5 years ago
|
Updated•5 years ago
|
Comment 1•5 years ago
|
||
:tnikkel, thank you for detailed explanation! All necessary information is there. It seems that option(1) is reasonable. If it has a problem, we could try (2). By the way, how can we test the hardware from bug 1544074?
Assignee | ||
Comment 2•5 years ago
|
||
(In reply to Sotaro Ikeda [:sotaro] from comment #1)
By the way, how can we test the hardware from bug 1544074?
We could try to reproduce the original problem on similar hardware in our inventory, or ask the original reporter from that bug to test for us if they are willing, as a last resort we could buy the hardware if we deem it necessary and worthwhile.
Assignee | ||
Comment 3•5 years ago
|
||
Updated•5 years ago
|
Assignee | ||
Comment 4•5 years ago
|
||
Michal, we need to partially revert the fix from bug 1544074. Would you be willing to try a couple of builds to see if the problem from bug 1544074 returns or not?
Assignee | ||
Comment 5•5 years ago
|
||
One more question Sotaro. From msdn:
"Hit testing of a layered window is based on the shape and transparency of the window. This means that the areas of the window that are color-keyed or whose alpha value is zero will let the mouse messages through."
Is the compositor window always going to satisfy that?
Comment 6•5 years ago
|
||
I am actually not sure it. But the following says that the layered window will not become visible until the SetLayeredWindowAttributes or UpdateLayeredWindow function has been called for this window.
And they are not called for compositor windows. Then it seems like same as transparent.
Assignee | ||
Updated•5 years ago
|
Comment 8•5 years ago
|
||
bugherder |
Comment 9•5 years ago
|
||
(In reply to Timothy Nikkel (:tnikkel) from comment #4)
Michal, we need to partially revert the fix from bug 1544074. Would you be willing to try a couple of builds to see if the problem from bug 1544074 returns or not?
Yes I can try. Pls ping me when needed.
Assignee | ||
Comment 10•5 years ago
|
||
Thank you!
Could you test these 4 builds and report back which has the blank window and which don't, if any?
(The builds come from these try pushes https://treeherder.mozilla.org/#/jobs?repo=try&group_state=expanded&author=tnikkel%40gmail.com&fromchange=964a6ee7303e13a2fde3f1c3c8d7fcf359200754&tochange=8c5cc170b91ef9d556b9c4d2c544d54b8ebdf37d )
Assignee | ||
Comment 11•5 years ago
|
||
Also, since that machine has nvidia, try to avoid the latest nvidia drivers, they have problems (bug 1625857). So verify in about:support that you are still getting webrender and your driver is not blocked so it's the same as in bug 1544074. Thanks!
Comment 12•5 years ago
|
||
OK - https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/CCgM3MFpRWeAuA-q3aXohg/runs/0/artifacts/public/build/target.zip
blank window - https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/Hs-8z1W_S_eF8Jv6Ic3yPA/runs/0/artifacts/public/build/target.zip
OK - https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/JdDA2aBdTYW79ZuUu8w0sQ/runs/0/artifacts/public/build/target.zip
blank window - https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/f3cGM2agSRqeIZoi0cNVWQ/runs/0/artifacts/public/build/target.zip
However even those that worked crashed almost immediately after opening arstechnica.com. I have submitted few crash reports.
bp-b1b6ff78-d1c7-421d-a9d3-6ff790200408
bp-c4efb96f-51d0-4bff-9fec-b5b4b0200408
bp-9fee7d80-49a8-4567-a1e4-ae7020200408
bp-8fc11c29-be7f-4685-9814-6ba020200408
bp-15a4f115-c51d-4279-97d4-45b710200408
I am not sure if this is not caused by some problems with my hardware. I have disabled nvidia card some time ago, because it was causing random BSODs. I will try to play with different versions of nvidia drivers and if I see any differences I will report back.
Comment 13•5 years ago
|
||
Just to add to comment 12 :
- after updating chipset drivers your builds will no longer crash
- current nightly 77.0a1 (20200408033650) will only show blank window
- when running on nvidia card the performance is poor. UI is lagging, scrolling is terrible...
Assignee | ||
Comment 14•5 years ago
|
||
(In reply to Michal Kluka from comment #12)
OK - https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/CCgM3MFpRWeAuA-q3aXohg/runs/0/artifacts/public/build/target.zip
blank window - https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/Hs-8z1W_S_eF8Jv6Ic3yPA/runs/0/artifacts/public/build/target.zip
OK - https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/JdDA2aBdTYW79ZuUu8w0sQ/runs/0/artifacts/public/build/target.zip
blank window - https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/f3cGM2agSRqeIZoi0cNVWQ/runs/0/artifacts/public/build/target.zip
Thanks! The two OK builds correspond to a unmodified build and a build with WS_EX_TRANSPARENT. The two blank builds corresponds to adding WS_ES_LAYERED and WS_ES_LAYERED | WS_EX_TRANSPARENT respectively.
(In reply to Michal Kluka from comment #13)
Just to add to comment 12 :
- after updating chipset drivers your builds will no longer crash
Good!
- current nightly 77.0a1 (20200408033650) will only show blank window
current nightly has WS_ES_LAYERED so that is expected.
- when running on nvidia card the performance is poor. UI is lagging, scrolling is terrible...
In which builds is the performance poor? All of them?
Assignee | ||
Comment 15•5 years ago
|
||
Sotaro, the issue returns with WS_ES_LAYERED. So should we blocklist something on that hardware? I don't know much about the compositor window and blocklist stuff. Can you suggest what to do here?
Comment 16•5 years ago
|
||
If WS_ES_LAYERED caused the problem, it seems better to block on the hardware. The following could be one idea for it.
It is a bit tricky to block compositor window usage. Current gecko does not have one gfxVars config just for controlling compositor window creation. But there are some gfxVars configs that enable compositor window if gpu process exists.
- gfxVars::UseDoubleBufferingWithCompositor()
- gfxVars::UseWebRenderDCompWin()
- gfxVars::UseWebRenderFlipSequentialWin()
Then it seems that we could do the following.
- Add FEATURE for blocking compositor window
- Disable the above gfxVars configs when compositor window feature is blocked and gpu process usage is enabled.
- GPU process usage could be checked by gfxConfig::IsEnabled(Feature::GPU_PROCESS).
WinCompositorWidget::EnsureCompositorWindow() creates compositor window.It is called in the following places.
https://searchfox.org/mozilla-central/rev/9120151ddb35f2d4c37bfe613a54a4f10a9a3dc5/gfx/layers/ipc/CompositorBridgeParent.cpp#1602
https://searchfox.org/mozilla-central/rev/9120151ddb35f2d4c37bfe613a54a4f10a9a3dc5/gfx/layers/ipc/CompositorBridgeParent.cpp#1602
Comment 17•5 years ago
|
||
- when running on nvidia card the performance is poor. UI is lagging, scrolling is terrible...
In which builds is the performance poor? All of them?
All of them. But that piece of hardware in not officially supported with Win10 and windows update will bork drivers regularly.
I myself stopped using discrete NVIDIA graphics card because it was causing a lot of troubles (not just with firefox). Firefox works fine with integrated one.
Assignee | ||
Comment 18•5 years ago
|
||
(In reply to Michal Kluka from comment #17)
- when running on nvidia card the performance is poor. UI is lagging, scrolling is terrible...
In which builds is the performance poor? All of them?
All of them. But that piece of hardware in not officially supported with Win10 and windows update will bork drivers regularly.
I myself stopped using discrete NVIDIA graphics card because it was causing a lot of troubles (not just with firefox). Firefox works fine with integrated one.
Oh, so maybe we don't even need to bother blocklisting this because it's already broken? So no one would realistically be using that combination?
Andrew, does that make sense? To bring you up to speed, the machine from https://bugzilla.mozilla.org/show_bug.cgi?id=1544074#c0 when running on the nvidia card will get a blank window after the changes from this bug. But that machine is quite old, should we bother doing the plan outlined in comment 16 or just let this be?
cut/paste of about:support from the machine in question
GPU #1
Active: Yes
Description: Intel(R) HD Graphics 3000
Vendor ID: 0x8086
Device ID: 0x0126
Driver Version: 9.17.10.4459
Driver Date: 5-19-2016
Drivers: igdumd64 igd10umd64 igd10umd64 igdumd32 igd10umd32 igd10umd32
Subsys ID: 21d117aa
RAM: Unknown
GPU #2
Active: No
Description: NVIDIA Quadro 2000M
Vendor ID: 0x10de
Device ID: 0x0dda
Driver Version: 21.21.13.7748
Driver Date: 6-8-2017
Assignee | ||
Comment 19•5 years ago
|
||
Do you also get a blank window with Chrome? They use a similar setup to us after adding the problematic flag.
Assignee | ||
Comment 20•5 years ago
|
||
We added the pref apz.windows.force_disable_direct_manipulation, you can set it to true and that should avoid this problem for you.
Comment 21•4 years ago
|
||
Comment 22•4 years ago
|
||
Comment 23•4 years ago
|
||
(In reply to Timothy Nikkel (:tnikkel) from comment #19)
Do you also get a blank window with Chrome? They use a similar setup to us after adding the problematic flag.
No I do not. It seems that GPU is blacklisted.
Below is from chrome://gpu/
Gpu compositing has been disabled, either via blacklist, about:flags or the command line. The browser will fall back to software compositing and hardware acceleration will be unavailable.
Disabled Features: gpu_compositing
I have attached chrome://gpu/ output for both integrated on NVIDIA graphics.
Seems that integrated is actually able to accelerate something while NVIDIA almost nothing.
Assignee | ||
Comment 24•4 years ago
|
||
We have a possible fix from this try push
https://treeherder.mozilla.org/#/jobs?repo=try&revision=38304ada606c995c553dfb329eea452bd407e136
could you try this build
Thanks!
Assignee | ||
Updated•4 years ago
|
Comment 25•4 years ago
|
||
(In reply to Timothy Nikkel (:tnikkel) from comment #24)
We have a possible fix from this try push
https://treeherder.mozilla.org/#/jobs?repo=try&revision=38304ada606c995c553dfb329eea452bd407e136
could you try this build
Thanks!
It works. Actually performance is pretty good.
Description
•