Open Bug 1743051 Opened 3 years ago Updated 3 years ago

MOZ_X11_EGL/Nvidia: Partly broken partial present after suspend&resume

Categories

(Core :: Graphics: WebRender, defect)

x86_64
Linux
defect

Tracking

()

Tracking Status
firefox95 --- disabled
firefox96 --- disabled
firefox97 --- disabled
firefox98 --- verified disabled

People

(Reporter: jan, Unassigned)

References

(Blocks 2 open bugs)

Details

(Keywords: correctness, nightly-community)

Attachments

(4 files)

(Darkspirit from bug 1731172 comment 35)

Some tiles in the vertical middle are not updated correctly. This is best seen when hovering over lines on about:config.
gfx.webrender.allow-partial-present-buffer-age=false doesn't help, only gfx.webrender.max-partial-present-rects=0 helps.
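A rough sketch of why the second pref would act as a workaround, assuming that a limit of 0 disables partial present so the whole frame is presented. This is not WebRender's code: `Rect`, `Present` and `choose_present` are hypothetical names made up for illustration.

```rust
/// Hypothetical axis-aligned rect in device pixels (not WebRender's type).
#[derive(Clone, Copy, Debug, PartialEq)]
struct Rect {
    x0: i32,
    y0: i32,
    x1: i32,
    y1: i32,
}

impl Rect {
    /// Smallest rect that covers both `self` and `other`.
    fn union(self, other: Rect) -> Rect {
        Rect {
            x0: self.x0.min(other.x0),
            y0: self.y0.min(other.y0),
            x1: self.x1.max(other.x1),
            y1: self.y1.max(other.y1),
        }
    }
}

/// What gets handed to the swap call in this sketch.
#[derive(Debug, PartialEq)]
enum Present {
    /// Present the whole frame; no damage rects are reported.
    Full,
    /// Present only the listed damage rects.
    Partial(Vec<Rect>),
}

/// Assumption: a limit of 0 means "partial present off".
/// If there are more dirty rects than the limit allows, they are
/// merged into one bounding rect.
fn choose_present(dirty: &[Rect], max_partial_present_rects: usize) -> Present {
    if max_partial_present_rects == 0 {
        return Present::Full;
    }
    if dirty.len() <= max_partial_present_rects {
        return Present::Partial(dirty.to_vec());
    }
    // dirty.len() > limit >= 1, so the slice is non-empty here.
    let merged = dirty
        .iter()
        .copied()
        .reduce(Rect::union)
        .expect("non-empty dirty list");
    Present::Partial(vec![merged])
}
```

With the limit at 0 a wrong damage rect can no longer hide an updated tile, because no damage rect is reported at all. That would match the observation that only this pref helps.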

Attached video 2021-11-26 02-29-32.mp4 (deleted) —
Attached video 2021-11-26 02-41-20.mp4 (deleted) —

The same with Gnome partial present debug view enabled (steps to enable: bug 1640858 comment 12):
Everything is displayed correctly in this mode, but the red area shows what would be changed with this mode disabled.

Those lines on about:config that didn't get a blue background on hovering are not within the red box.

This bug seems to occur only after suspend&resume on EGL/Nvidia and persists until Firefox restart.

bug 1712969 is about an existing cross-platform partial present bug and has multiple testcases. Could it be related?

Thanks, that's very interesting. According to the video from comment 2, it looks like WR updates the buffer correctly but fails to report the correct damage rect in SwapBuffersWithDamage(). (Note: we currently always report only one combined rect, see bug 1640712, so it's unlikely that the driver handles this wrongly.) Comment 4 in turn points to us not properly combining the damage for multiple tiles.

To me it looks like the culprit must be somewhere between the point where we calculate the damage region from the tile rects in calculate_dirty_rects(), which is called from draw_frame(), and the point later in draw_frame() where we pass it down via composite_frame() -> composite_simple() -> set_buffer_damage_region(). Actually, I think it has to be in the linked part of calculate_dirty_rects(). I'll try to come up with a build with some debug logging.
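To illustrate the suspected failure mode (a minimal sketch, not the actual calculate_dirty_rects() code; `Rect` and `combined_damage` are hypothetical names): if the per-tile dirty rects are merged into a single damage rect and one tile is left out of that merge, the tile is still drawn into the buffer but lies outside the reported damage, so the compositor never shows it.

```rust
/// Hypothetical axis-aligned rect in device pixels (not WebRender's type).
#[derive(Clone, Copy, Debug, PartialEq)]
struct Rect {
    x0: i32,
    y0: i32,
    x1: i32,
    y1: i32,
}

impl Rect {
    /// Smallest rect that covers both `self` and `other`.
    fn union(self, other: Rect) -> Rect {
        Rect {
            x0: self.x0.min(other.x0),
            y0: self.y0.min(other.y0),
            x1: self.x1.max(other.x1),
            y1: self.y1.max(other.y1),
        }
    }

    /// True if `other` lies completely inside `self`.
    fn contains(self, other: Rect) -> bool {
        self.x0 <= other.x0
            && self.y0 <= other.y0
            && self.x1 >= other.x1
            && self.y1 >= other.y1
    }
}

/// Merge all dirty tile rects into one bounding damage rect.
/// Returns `None` when no tile is dirty.
fn combined_damage(dirty_tiles: &[Rect]) -> Option<Rect> {
    dirty_tiles.iter().copied().reduce(Rect::union)
}
```

For three vertically stacked dirty tiles, `combined_damage` over all of them covers every tile. Computing it over only the first two gives a rect for which `contains` returns false on the third tile, which would look like the hovered rows outside the red box in the debug view.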

XRender suffered from a bug where main menu changes were delayed by one frame.
Nvidia + GPU process + i3 seemed to have the same problem 2 months ago: bug 1733094 comment 14.

my naive/incompetent thought:

this bug: Main window might fall back from XShmPutImage to buggy (?) XPutImage on NV suspend&resume?

Hm, we only use that if we fall back to SW-WR, it shouldn't matter for HW-WR at all AFAIK.

When you have time, could you try https://treeherder.mozilla.org/jobs?repo=try&revision=ab4d61c02f60fc953b387cb58843c386a707aa8c and check if the debug output changes? Sorry for still not being able to do it myself :(

(In reply to Darkspirit from comment #6)
This bug was filed with X11-only driver 470.86. I will test 495 with MOZ_ENABLE_WAYLAND and report back.

Jan, just to make sure I'm on the right track, can you confirm that the following build is not affected? https://treeherder.mozilla.org/jobs?repo=try&revision=b9607af2acbc9b5959fddab42b0af29ac642f8c1

(In reply to Robert Mader [:rmader] from comment #7)

When you have time, could you try https://treeherder.mozilla.org/jobs?repo=try&revision=ab4d61c02f60fc953b387cb58843c386a707aa8c and check if the debug output changes? Sorry for still not being able to do it myself :(

Count looks the same before and after.

(In reply to Darkspirit from comment #8)

This bug was filed with X11-only driver 470.86. I will test 495 with MOZ_ENABLE_WAYLAND and report back.

Gnome Wayland with the 495 stable driver is somehow broken for me: logging in to a Wayland session fails with "Failed to grab modeset ownership": https://www.google.com/search?client=firefox-b-d&q=Failed+to+grab+modeset+ownership
Updating to Ubuntu 22.04dev did not help.

(In reply to Robert Mader [:rmader] from comment #9)

Jan, just to make sure I'm on the right track, can you confirm that the following build is not affected? https://treeherder.mozilla.org/jobs?repo=try&revision=b9607af2acbc9b5959fddab42b0af29ac642f8c1

Gnome X11/driver 495: Yes, that build is fine.
Regular Nightly:
(In reply to Darkspirit from comment #0)

gfx.webrender.allow-partial-present-buffer-age=false doesn't help, only gfx.webrender.max-partial-present-rects=0 helps.

Can you also test https://treeherder.mozilla.org/jobs?repo=try&revision=31a13646f62781744aa8c9c564bdeeebd7dbddc3 (and maybe do a quick video with the Gnome partial present debug view)?

Attached video try_comment11.mp4 (deleted) —
Severity: -- → S3

Still reproducible with Gnome X11, Nvidia GTX 1060, driver 495.46, Ubuntu 22.04 jammy.

The update from 495.44 to 495.46 seems to have fixed Gnome Wayland for me. Previously it was somehow not working anymore.
Xwayland window content is displayed, but still uses llvmpipe.
Gnome Wayland suffers from https://gitlab.gnome.org/GNOME/mutter/-/issues/1942 (=texture corruption all over the place), the partial present bug does not seem to occur there (yet?).

Has anyone had the chance to check if the 510 driver series is also affected by this?

Just got a report on Matrix that this still happens on the 510 driver series.

Depends on: 1751252
No longer blocks: 1737428