Closed Bug 1672139 Opened 4 years ago Closed 4 years ago

Addons overlays not painted on webrender

Categories

(Core :: Graphics: WebRender, defect)

x86_64
Linux
defect

Tracking

()

RESOLVED FIXED
84 Branch
Tracking Status
firefox-esr78 --- unaffected
firefox81 --- unaffected
firefox82 --- unaffected
firefox83 --- disabled
firefox84 --- fixed

People

(Reporter: kubrick, Assigned: sotaro)

References

(Blocks 1 open bug, Regression)

Details

(Keywords: nightly-community, regression)

Attachments

(5 files, 1 obsolete file)

Since rev 604e2561311f2aa2ba70fbccdfb54127506ad8b5, with webrender enabled, on wayland (Gnome Arch), addon windows/overlays don't get painted and remain white.

reverting D92445 and D93855 fixes the issue.

Gnome Wayland, Debian Testing, Intel/Mesa
Confirmed. Thanks for the report!
$ MOZ_ENABLE_WAYLAND=1 mozregression --launch 2020-10-19 --pref gfx.webrender.all:true -a https://addons.mozilla.org/firefox/addon/ublock-origin/

(I also noticed that with MOZ_X11_EGL=1 and GLX, uBlock panel content is often shown with a delay of 1.5 seconds, sometimes it's instant. IIRC this occurs with Xwayland, but not with X11.)

https://hg.mozilla.org/mozilla-central/shortlog/604e2561311f2aa2ba70fbccdfb54127506ad8b5

last good (panel content always visible):
$ MOZ_ENABLE_WAYLAND=1 mozregression --repo autoland --launch b5017b0a0b5d10f00957b15fa9081d3f755dad76 --pref gfx.webrender.all:true -a https://addons.mozilla.org/firefox/addon/ublock-origin/

first bad (panel content is intermittently white/empty or shown. If its white/empty, hovering does not help.):
$ MOZ_ENABLE_WAYLAND=1 mozregression --repo autoland --launch 604e2561311f2aa2ba70fbccdfb54127506ad8b5 --pref gfx.webrender.all:true -a https://addons.mozilla.org/firefox/addon/ublock-origin/

Regressed by:

604e2561311f2aa2ba70fbccdfb54127506ad8b5 Sean Feng - Bug 1669239 - Call Tick to paint in PuppetWidget::Paint instead of using WillPaintWindow r=emilio,mattwoodrow

Status: UNCONFIRMED → NEW
Has Regression Range: --- → yes
Has STR: --- → yes
Ever confirmed: true
Regressed by: 1669239
Flags: needinfo?(sefeng)
Severity: -- → S2

Requesting Emilio's help.

Flags: needinfo?(sefeng) → needinfo?(emilio)

Can you confirm this only happens with uBlock? What I see is that uBlock somehow has an opacity: 0 rule applying to the <body>.

Flags: needinfo?(kubrick)

No it happens with all extensions (tested Bitwarden, µBlock, Feedbro, Video Download Helper, etc.)

Flags: needinfo?(kubrick)

Ok, so I think I found out what's going on, but it's a bit sad and I think we just found a pre-existing bug that was getting papered over.

So when we're on WR+Wayland and open a popup like uBlock's (or others for that matter), we hit this chunk of code, which causes the first frame to fail to draw.

Now the main issue is that before the regressing patch Invalidate ended up doing a forceful transaction eventually, but now we just keep waiting for the reply from the compositor (which will never arrive) and keep early-returning here for eternity...

So I suspect this is a pre-existing bug, but I'm not familiar enough with the WebRender compositor code to know what the right solution is.
Should the transaction that the refresh driver is waiting for be acknowledged even though it failed? (that seems like the right fix to me).

Not for landing, probably...

Sotaro, I'm told that you know the relevant code much better, can you take a look at comment 6 and maybe say what's supposed to happen when BeginFrame fails?

We also hit this line of code, which seems problematic.

I thought a patch like this would fix it and be the correct fix per that:

diff --git a/gfx/webrender_bindings/RenderThread.cpp b/gfx/webrender_bindings/RenderThread.cpp
index 7e3922054737..92fb1c4b1248 100644
--- a/gfx/webrender_bindings/RenderThread.cpp
+++ b/gfx/webrender_bindings/RenderThread.cpp
@@ -518,7 +518,7 @@ void RenderThread::UpdateAndRender(
     renderer->WaitForGPU();
   }
 
-  if (!aRender) {
+  if (!aRender || !latestFrameId.IsValid()) {
     // Update frame id for NotifyPipelinesUpdated() when rendering does not
     // happen.
     latestFrameId = renderer->UpdateFrameId();

But it doesn't seem to be (it fixes the assertion failure though). Any advice? Alternatively we could possibly land my wallpaper, but I'd prefer not having to.

Flags: needinfo?(emilio) → needinfo?(sotaro.ikeda.g)

(In reply to Emilio Cobos Álvarez (:emilio) from comment #8)

Sotaro, I'm told that you know the relevant code much better, can you take a look at comment 6 and maybe say what's supposed to happen when BeginFrame fails?

I could reproduce the problem. I am going to look into it.

(In reply to Emilio Cobos Álvarez (:emilio) from comment #6)

Ok, so I think I found out what's going on, but it's a bit sad and I think we just found a pre-existing bug that was getting papered over.

So when we're on WR+Wayland and open a popup like uBlock's (or others for that matter), we hit this chunk of code, which causes the first frame to fail to draw.

It seems like a different problem. I created Bug 1672565.

It seems that PuppetWidget's WebRenderLayerManager of the add-on seemed not do rendering. WebRenderLayerManager::EndTransactionWithoutLayer() was not called when the problem happened.

When I re-added WillPaintWindow(). The problem was addressed for me. Then it seems not like a graphics problem.

Flags: needinfo?(sotaro.ikeda.g)

How not? WillPaintWindow effectively wallpapers it in the same way as comment 7, but that just means that the content process got stuck waiting for a transaction to finish that didn't finish.

Flags: needinfo?(sotaro.ikeda.g)

(In reply to Emilio Cobos Álvarez (:emilio) from comment #13)

How not?

Sorry, it seems to related to graphics area.

The patch seems to address the problem, though the initial pause is necessary for wayland egl.

Then the initial pause seems to trigger the problem.

This patch also addressed the problem for me.

When the problem happened, pop-up widget's content process's WebRenderBridgeParent(non-root) was attached to different CompositorBridgeParent. Then the WebRenderBridgeParent was moved to correct CompositorBridgeParent by CompositorBridgeParent::RecvAdoptChild() call. It triggered NotifyPipelineRendered().

In this case, SendDidComposite() have to be called, since WebRenderBridgeParent(non-root) is moving to different WebRender instance. But it is not called when CompositorBridgeParent. is paused.

Assignee: nobody → sotaro.ikeda.g
Status: NEW → ASSIGNED

D94552 fixes the issue for me, but then hovering on links doesn't trigger events any more (no change of mouse pointer, etc.)

(In reply to Francois Guerraz from comment #20)

D94552 fixes the issue for me, but then hovering on links doesn't trigger events any more (no change of mouse pointer, etc.)

Thank you for checking! Can you create a bug for it?

Flags: needinfo?(sotaro.ikeda.g)
Pushed by sikeda.birchill@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/0cb613d53e83 Remove pause handlig from CompositorBridgeParent::NotifyPipelineRendered() r=nical
Regressions: 1672965
Status: ASSIGNED → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → 84 Branch
No longer regressions: 1672965

Version 83 is affected by this issue. Are there any plans to uplift the fix to beta?

(In reply to Danilo from comment #24)

Version 83 is affected by this issue. Are there any plans to uplift the fix to beta?

Sotaro may be able to assess how risky is that better than me.

Flags: needinfo?(sotaro.ikeda.g)

Comment on attachment 9183382 [details]
Bug 1672139 - Remove pause handlig from CompositorBridgeParent::NotifyPipelineRendered()

The change only to

Beta/Release Uplift Approval Request

  • User impact if declined: Addon popup might becomes white with Linux EGL WebRender.
  • Is this code covered by automated tests?: Yes
  • Has the fix been verified in Nightly?: Yes
  • Needs manual test from QE?: No
  • If yes, steps to reproduce:
  • List of other uplifts needed: none
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): The change affects only to WebRender on Android and EGL WebRender on linux. And on android, such use case seems not happen. It changed as to send DidComposite even when WebRender is pause state and CompositorBridgeParent::RecvAdoptChild() called. It does not happen on normal use case. And send DidComposite at correct timing seems a low risk.
  • String changes made/needed: none
Flags: needinfo?(sotaro.ikeda.g)
Attachment #9183382 - Flags: approval-mozilla-beta?

Comment on attachment 9183382 [details]
Bug 1672139 - Remove pause handlig from CompositorBridgeParent::NotifyPipelineRendered()

Webrender is disabled by default on Linux for Mozilla builds on the release channel, I think this can ride the 84 train, thanks.

Attachment #9183382 - Flags: approval-mozilla-beta? → approval-mozilla-beta-

It would be great if this could make it to ff83. Firefox is the only usable native Wayland browser.

As a reminder, this is the configuration everyone is using to get VAAPI hardware video decoding. This broke WebRender for all Wayland users.

Uplifting this would have been a lot easier than dealing with those duplicates and bug reports on reddit etc.

Attachment #9182987 - Attachment is obsolete: true
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: