Open Bug 1572625 Opened 5 years ago Updated 2 years ago

X11 Expose event not propagated with GPU process

Categories

(Core :: Graphics: WebRender, defect, P3)

70 Branch
Desktop
Linux
defect

Tracking

()

Tracking Status
firefox-esr60 --- disabled
firefox-esr68 --- disabled
firefox68 --- disabled
firefox69 --- disabled
firefox70 --- disabled

People

(Reporter: streetwalkermc, Assigned: aosmond)

References

(Blocks 1 open bug, Regression)

Details

(Keywords: nightly-community, regression)

Attachments

(2 files)

User Agent: Mozilla/5.0 (X11; Linux x86_64; rv:70.0) Gecko/20100101 Firefox/70.0

Steps to reproduce:

  • Enable WebRender (default on my hardware, or gfx.webrender.all) or OpenGL compositing (layers.acceleration.force-enabled=true, gfx.webrender.all=false, gfx.webrender.force-disabled=true)
  • Make sure no window repaints are necessary (throbber animations, blinking cursors, etc.)
  • Spawn/drag windows over Firefox or switch workspaces back and forth (Firefox must be unfocused, and the mouse cursor should be outside of its window)

Actual results:

Window contents are damaged and no repaint is attempted.

Investigation in bug 1514148 shows that expose_event_cb isn't being called when it should be.

Bisection:

$ mozregression --good 2018-10-14 --bad 2019-08-01 --pref gfx.webrender.force-disabled:true layers.acceleration.force-enabled:true 
...
 9:50.55 INFO: Last good revision: a4daa44cdb9cd0ab8a1870a4105ff8f9103c193e
 9:50.55 INFO: First bad revision: 284dca344fcc2736acc3c2d8bc54befea0a8ce73
 9:50.55 INFO: Pushlog:
https://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=a4daa44cdb9cd0ab8a1870a4105ff8f9103c193e&tochange=284dca344fcc2736acc3c2d8bc54befea0a8ce73

Faulty commit. Setting layers.gpu-process.enabled=false works around the bug on newer nightlies.

Expected results:

The expose event should be propagated to the GPU process so that the window can be redrawn.

I can reproduce this on my desktop with an AMD RX 580 and my laptop with Kaby Lake R integrated graphics, both running Arch Linux, i3wm and up to date nightlies.

A couple more notes that I forgot to include:

  • I do not use a compositor, though I believe that shouldn't matter in the workspace switching case, as the contents of a window are still lost when it is unmapped
  • this bug apparently can't be reproduced on KDE with compositing disabled
  • inspection with xev shows that expose events are correctly being sent to the window
Blocks: wr-linux
Component: Untriaged → Graphics
Product: Firefox → Core
Attached video 2019-08-11_00-57-28.mp4 (deleted)

i3 (default config), Debian Testing, Macbook Pro A1502
Just posting my findings to give this more context.

== Workspace switching glitch ==
For me, this issue seems to depend on AlphaVisual usage for top level windows.

When using WebRender with GPU process, switching workspaces on i3 causes the window to become transparent until it's focused. (bug 1514148 comment 46, attachment 9083749 [details])
It was regressed by bug 1498092, but only KDE (without compositor) was fixed by bug 1571331.
(My understanding is that WebRender might have unexpectedly generated too many frames before, so this bug was not visible at first?)

Screencast: OpenGL + GPU process + AlphaVisual, first bad revision.
mozregression --good 2017-06-02 --bad 2019-02-01 --pref layers.gpu-process.force-enabled:true layers.acceleration.force-enabled:true mozilla.widget.use-argb-visuals:true

23:37.42 INFO: Last good revision: 81c3640eaebc47516247f546b2203ec550fdd37a
23:37.42 INFO: First bad revision: ba0c3051a9ed3b8e3120eb25d770ea459d3f719d
23:37.42 INFO: Pushlog:
https://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=81c3640eaebc47516247f546b2203ec550fdd37a&tochange=ba0c3051a9ed3b8e3120eb25d770ea459d3f719d

ba0c3051a9ed3b8e3120eb25d770ea459d3f719d sotaro — Bug 1481694 - Use GLContextGLX::FindVisual() when webrender is not enabled if possible r=stransky

After this patch you could enforce AlphaVisual for OpenGL compositing by setting mozilla.widget.use-argb-visuals to true.
When switching workspaces, window content often moved pixel by pixel to the right on each switch and left a black area on the left, or the window became corrupted. Focusing Firefox fixed it.

But after the following patch AlphaVisual can only be used when gdk_screen_is_composited or when WebRender is active:
https://hg.mozilla.org/mozilla-central/rev/7d768598fee79819797f560acc4cff7f5b0a18a8
For me, OpenGL compositing seemed unaffected after this patch.
If WebRender is disabled, but gdk_screen_is_composited true, CSD support level matters:
CSD_SUPPORT_NONE is configured for i3. If i3 was not correctly detected, i3/OpenGL could be affected on Nightly, but not on RELEASE_OR_BETA. (I thought i3 is not composited at all so this entry might be unnecessary?)
Basic always seemed unaffected, or I just haven't found the right steps yet.

Attached video 2019-08-10_20-38-05.mp4 (deleted)

== Glitch when dragging a window over Firefox ==
The GPU process crashed on 2016-11-02. On 2016-11-03 it worked, but showed the window dragging glitch with OpenGL (Screencast) and even with Basic (the restriction of layers.gpu-process.allow-software did not exist at that time). It seems it was broken/not implemented from the beginning of the GPU process.
(Switching workspaces with Basic and OpenGL didn't cause issues for me at this point. WebRender did not exist yet.)

mozregression --good 2016-11-01 --bad 2016-12-01 --pref gfx.webrender.force-disabled:true layers.acceleration.force-enabled:true layers.gpu-process.dev.force-enabled:true

7:36.17 INFO: Last good revision: 3e73fd638e687a4d7f46613586e5156b8e2af846 (2016-11-02)
7:36.17 INFO: First bad revision: ade8d4a63e57560410de106450f37b50ed71cca5 (2016-11-03)
7:36.17 INFO: Pushlog:
https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=3e73fd638e687a4d7f46613586e5156b8e2af846&tochange=ade8d4a63e57560410de106450f37b50ed71cca5

8:10.50 INFO: There are no build artifacts on inbound for these changesets (they are probably too old).

(Sorry for my uncertain noob language.) bug 1567791 could be related: Shaped Basic popups (main menu, identity panel) are painted with a delay of one frame when using WebRender and GPU process with a non-compositing window manager. OpenGL and Basic compositing are unaffected.

(Jan Andre Ikenmeyer [:darkspirit] from bug 1567791 comment 24)

As long as the Sync icon [inside the main menu] is spinning there is no bug. Otherwise it's like an event is only processed when the next event occurs.
If you hover menu entries A, B, C, B, A, then A is shown as hovered when hovering B, B when hovering C, C when hovering B, B when hovering A.
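
The one-event lag described above can be illustrated with a minimal Python sketch. It is only a model of the observed symptom, not Firefox or GTK code; the function name `shown_hover_states` and the `pending` variable are hypothetical.

```python
def shown_hover_states(hover_events):
    """Model a painter that only shows an event once the next one arrives."""
    shown = []
    pending = None  # event received but not yet painted
    for event in hover_events:
        # The arrival of a new event flushes the previous one to the screen.
        shown.append(pending)
        pending = event
    return shown


# Hover sequence from the comment: A, B, C, B, A.
states = shown_hover_states(["A", "B", "C", "B", "A"])
```

Each entry in `states` is what is displayed while the corresponding entry is being hovered, so the displayed state always trails the real one by a single event.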

Status: UNCONFIRMED → NEW
Ever confirmed: true
OS: Unspecified → Linux
Regressed by: 1549965
Hardware: Unspecified → Desktop

I don't know what in this comment is off-topic and what is not, but independent of this, there are some similar-looking bugs:

A transparent window glitch (like when switching workspaces with WebRender & GPU process on i3) can also occur when snapping windows on Wayland (no GPU process: bug 1569745) or when taking a screenshot on GNOME with Wayland or X11 (bug 1494520).

Shown in the screencast in comment 2, when it was briefly possible to enable top level window AlphaVisual for OpenGL on non-compositing i3:

window content often moved pixel by pixel to the right on each switch and left a black area on the left

Other bizarre Linux window sizing bugs are bug 1489463 (X11+Wayland, with and without GPU process, I see it every day) and bug 1502519.

The priority flag is not set for this bug.
:jbonisteel, could you have a look please?

For more information, please visit auto_nag documentation.

Flags: needinfo?(jbonisteel)
Priority: -- → P3
Priority: P3 → --
Component: Graphics → Graphics: WebRender
Flags: needinfo?(jbonisteel)
Priority: -- → P3

I somehow completely missed the expose event bug getting refiled. Regression shows this is my fault.

Flags: needinfo?(aosmond)

I confirmed that I could reproduce Bug 1572625 with Ubuntu 18.04 + awesome window manager.

Depends on: 1611372
Flags: needinfo?(aosmond)
Assignee: nobody → aosmond

This is supposed to be bailing us out:

https://searchfox.org/mozilla-central/rev/a87a1c3b543475276e6d57a7a80cb02f3e42b6ed/widget/gtk/nsWindow.cpp#3795-3799,3807

But we only get GDK_VISIBILITY_UNOBSCURED events when the window appears, and nothing when the window disappears. Thus mIsFullyObscured is always false.

This is true without WebRender or the GPU process actually, so there is another explicit invalidation coming from somewhere else.

_gdk_x11_window_process_expose is getting called, but it bails rather than calling _gdk_window_invalidate_for_expose, which would indirectly trigger the redraw eventually.

https://git.launchpad.net/ubuntu/+source/gtk+3.0/tree/gdk/x11/gdkgeometry-x11.c?h=applied/ubuntu/bionic-proposed#n245

With the GPU process, the serial number of the item in the queue exceeds the serial number passed into _gdk_x11_window_process_expose. This causes it to subtract the window area from the invalidate region. This results in an empty cairo region so the invalidation is suppressed.

Without the GPU process, the serial number of the item in the queue is less than the serial number passed in. This removes the item from the queue and invalidates as expected.
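
The serial comparison described in the last two paragraphs can be sketched as a simplified Python model. This is not GDK's implementation: regions are modelled as sets of pixel coordinates rather than cairo regions, the names `QueueItem`, `rect`, and `process_expose` are hypothetical, the serial values are made up, and serial wraparound is ignored.

```python
from dataclasses import dataclass


def rect(x, y, width, height):
    """Model a region as the set of pixel coordinates it covers."""
    return {(px, py) for px in range(x, x + width) for py in range(y, y + height)}


@dataclass
class QueueItem:
    serial: int       # request serial recorded when the item was queued
    window: str       # hypothetical window identifier
    antiexpose: set   # area to ignore for exposes older than `serial`


def process_expose(queue, window, serial, area):
    """Return the region that would be invalidated for one expose event."""
    invalidate = set(area)
    kept = []
    for item in queue:
        if serial < item.serial:
            # The expose predates the queued item: mask out its area.
            if item.window == window:
                invalidate -= item.antiexpose
            kept.append(item)
        # Otherwise the item is stale and is dropped from the queue.
    queue[:] = kept
    return invalidate


window_area = rect(0, 0, 4, 3)

# With the GPU process: the queued serial exceeds the expose serial.
gpu_queue = [QueueItem(serial=10, window="toplevel", antiexpose=rect(0, 0, 4, 3))]
gpu_result = process_expose(gpu_queue, "toplevel", 5, window_area)

# Without the GPU process: the queued serial is lower, so the item is removed.
plain_queue = [QueueItem(serial=10, window="toplevel", antiexpose=rect(0, 0, 4, 3))]
plain_result = process_expose(plain_queue, "toplevel", 15, window_area)
```

In this model `gpu_result` ends up empty, which corresponds to the suppressed invalidation, while `plain_result` covers the whole window and the queue is emptied.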

Blocks: gpu-process-linux-x11
No longer blocks: wr-linux-mvp
No longer blocks: wr-linux
Has Regression Range: --- → yes
Severity: normal → S3