Partial present is glitchy
Categories
(Core :: Graphics: WebRender, defect, P3)
Tracking
()
Tracking | Status | |
---|---|---|
firefox-esr78 | --- | unaffected |
firefox-esr91 | --- | wontfix |
firefox88 | --- | unaffected |
firefox89 | --- | unaffected |
firefox90 | --- | wontfix |
firefox91 | --- | wontfix |
firefox92 | --- | wontfix |
firefox93 | --- | wontfix |
firefox94 | --- | wontfix |
firefox95 | --- | wontfix |
firefox98 | --- | wontfix |
firefox99 | --- | wontfix |
firefox100 | --- | fixed |
People
(Reporter: heftig, Assigned: jnicol)
References
(Blocks 3 open bugs, Regression, )
Details
(Keywords: correctness, regression)
Attachments
(2 files)
I'm seeing glitches accountable to poor partial present.
Firefox Nightly, GNOME 40, Arch Linux
Dell XPS 13 7390 2-in-1 (3840x2400, scale 2)
MOZ_ENABLE_WAYLAND=1
To reproduce:
- Open https://c.eev.ee/doom-text-generator/
- If the "Scale" slider is not visible with the page scrolled to the top,
make the window larger. - Drag the "Scale" slider right and left repeatedly.
- As the text changes in size, observe glitches showing text where the
empty background should be.
mozregression results:
2021-04-19: meh: frequently presents entire window, but shows no glitches
Last meh revision: 30af8f80e2754783f1981485dd911a2e341d9afd
First good revision: f8eb1926bfdbb630b4aa17e064c3472c4a32709e
Pushlog: https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=30af8f80e2754783f1981485dd911a2e341d9afd&tochange=f8eb1926bfdbb630b4aa17e064c3472c4a32709e
Fixed by bug 1377445
2021-04-20: good
2021-04-21: good
Last good revision: 6c2c039872b3c49c22323e13473c84bab33ebb7c
First bad revision: 1859720a3225cf4d9560ccc240555f1e05dc3a76
Pushlog: https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=6c2c039872b3c49c22323e13473c84bab33ebb7c&tochange=1859720a3225cf4d9560ccc240555f1e05dc3a76
Regressed by bug 1706452
2021-04-22: bad: glitches as described above
Reporter | ||
Comment 1•3 years ago
|
||
The two "good" nightlies were only good because basic layers were being used.
I've rerun the regression using LD_PRELOAD=/usr/lib/libgtk-3.so mozregression --good 2021-04-19 --bad 2021-04-22 -a https://c.eev.ee/doom-text-generator/
to work around avoid bug 1706452.
Last good revision: 42c3b61f3db2df3cc5941a9e71875f06e35042f0
First bad revision: 2a384027ee9bcee63121a68d42c8db6644c8d8aa
Pushlog: https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=42c3b61f3db2df3cc5941a9e71875f06e35042f0&tochange=2a384027ee9bcee63121a68d42c8db6644c8d8aa
Regressed by bug 1706488
Comment 2•3 years ago
|
||
Glenn, any idea why this could be? We share our partial present support with Android, so it might reproduce there as well.
Reporter | ||
Comment 3•3 years ago
|
||
It does reproduce on Android. Here's a video of Nightly on my Pixel 3 XL.
Comment 4•3 years ago
|
||
No specific ideas, it will need to be debugged with some logging of the dirty rects being supplied by WR.
Assignee | ||
Comment 5•3 years ago
|
||
This also affects Windows if you disable gfx.webrender.compositor
. So I don't think it's to do with EGL partial present specifically.
On Mac the canvas gets drawn with a black background if you disable the OS compositor, so it doesn't reproduce there, but it's buggy in a different way.
Updated•3 years ago
|
Comment 6•3 years ago
|
||
Set release status flags based on info from the regressing bug 1706488
Updated•3 years ago
|
Updated•3 years ago
|
Updated•3 years ago
|
Updated•3 years ago
|
Updated•3 years ago
|
Updated•3 years ago
|
Updated•3 years ago
|
Updated•3 years ago
|
Updated•3 years ago
|
Updated•3 years ago
|
Updated•3 years ago
|
Reporter | ||
Comment 7•3 years ago
|
||
I've also seen rapid flashing with YouTube (when video playback stops and the throbber appears) and Google Meet (when the dimensions of the shared window change) that look like the same bug.
Assignee | ||
Comment 8•3 years ago
|
||
Does setting gfx.webrender.allow-partial-present-buffer-age
to false prevent that flashing?
Reporter | ||
Comment 9•3 years ago
|
||
That does seem to help. Although it also causes the Doom Text Generator (from comment #0) to have a black canvas background that's not intended. PS: Never mind, I can't reproduce the black background.
Comment 10•3 years ago
|
||
(In reply to Jamie Nicol [:jnicol] from comment #8)
Does setting
gfx.webrender.allow-partial-present-buffer-age
to false prevent that flashing?
I'm able to reproduce comment 0 on EGL/Xwayland.
No, disabling gfx.webrender.allow-partial-present-buffer-age and restarting Nightly doesn't help.
The glitch goes away when scrolling down.
The canvas also had a black background once, then I pressed Ctrl+F5 and it was white again.
Comment 11•3 years ago
|
||
It did not flicker for me. There were just artifacts.
Updated•3 years ago
|
Comment 12•3 years ago
|
||
Default configuration: So far this bug has been shipped on Android and Wayland (Ubuntu, Fedora).
Now this bug is also default on X11 and Xwayland (bug 1695933).
Updated•3 years ago
|
Comment 14•3 years ago
|
||
Please have a look at duplicate bug 1731485 which has another small testcase.
- GLX can be fixed with gfx.webrender.allow-partial-present-buffer-age:false or with gfx.webrender.max-partial-present-rects:0.
- EGL can only be fixed with gfx.webrender.max-partial-present-rects:0.
Updated•3 years ago
|
Updated•3 years ago
|
Reporter | ||
Comment 16•3 years ago
|
||
(In reply to Jan Alexander Steffens [:heftig] from comment #9)
That does seem to help.
Although it also causes the Doom Text Generator (from comment #0) to have a black canvas background that's not intended.PS: Never mind, I can't reproduce the black background.
I can reproduce the black background when Fission is enabled by Ctrl+Clicking the Reload button to open a copy of the tab.
mozregression --launch 2021-11-30 -a https://c.eev.ee/doom-text-generator/ --pref fission.autostart:true
This is independent of gfx.webrender.allow-partial-present-buffer-age
, so it also reproduces with the following:
mozregression --launch 2021-11-30 -a https://c.eev.ee/doom-text-generator/ --pref fission.autostart:true gfx.webrender.allow-partial-present-buffer-age:false
PS: This is now bug 1743656.
Updated•3 years ago
|
Comment 17•3 years ago
|
||
Fedora 35 - Gnome 41.2 wayland - AMDGPU (mesa) - Webrender enabled (by distro package) - else default settings
HTML5 players (Youtube and other streaming sites) show a heavy flicker on every ~10th seek. The issue looks like this: https://streamable.com/item3i
Setting gfx.webrender.max-partial-present-rects:0
fixes the issue completly.
Comment 18•3 years ago
|
||
Not fixed by bug 1747857.
Assignee | ||
Comment 19•3 years ago
|
||
(In reply to Darkspirit from comment #14)
Please have a look at duplicate bug 1731485 which has another small testcase.
- GLX can be fixed with gfx.webrender.allow-partial-present-buffer-age:false or with gfx.webrender.max-partial-present-rects:0.
- EGL can only be fixed with gfx.webrender.max-partial-present-rects:0.
Thanks for that link. that test case is indeed a bit simpler.
What's happening here in either test case is that there is a canvas, which since bug 1706488 gets promoted to a compositor surface. For partial present (both buffer age and max-partial-present-rects/eglSwapBuffersWithDamage), we calculate the current frame's dirty rect by unioning the dirty rect of each tile, for all tiles (including promoted compositor surfaces as well as picture cache tiles). For compositor surfaces, the dirty rect is simply set to the full area of the tile, meaning a) we always render it in composite_simple()
, and b) the OS compositor will composite it because we told it the area is dirty in eglSwapBuffersWithDamage.
However, in these test cases the problem is that the size of the canvas changes, meaning the size and therefore dirty rect of the compositor surface changes. When we calculate_dirty_rects()
for the new frame, the rect only includes the area of the new canvas size, and not the area which the canvas used to occupy. This means we do indeed render the new area of the canvas correctly, but (when using buffer age) we do not render the area the canvas used to occupy, leaving stale content on the framebuffer. And (when using max-partial-present-rects) nor do we tell the OS compositor that the area has been updated, so it may skip compositing that area too. Hence why the bug reproduces with either buffer age, max-partial-present-rects, or both
It's quite trivial to fix this simply for compositor surfaces. We just need to keep track of the union of compositor surface rects for each frame. And then union the combined tiles dirty rect with the previous frame's compositor surface rects to calculate the dirty rect for the new frame. (Which will be a nop if the compositor surfaces have not moved.) This is easy because we always treat the entire area of the compositor surface as dirty.
However, this has me worried that it could go beyond just compositor surfaces.. Is it possible that picture cache slices/tiles can move, and therefore won't be included in the dirty rect calculation? I think this would possibly be more difficult to account for, as we track dirty subregions of picture cache tiles. Do you know of any defence we have against that situation, Glenn?
In any case, I guess we should land the fix for compositor surfaces first.
Comment 20•3 years ago
|
||
I'm not sure I completely follow the question, but I don't know the buffer-age code or requirements very well.
I know that we do skip partial present if any tiles have been added or removed, and this is a hack / workaround to deal with not knowing the previous state of that tile so we just skip partial in those cases (this is relatively rare compared to number of frames we draw where the tile config hasn't changed). There may be other cases where we set dirty_rects_are_valid
to be false
that are relevant here.
In any case, I would not be at all surprised if there are some similar issues with tiles as compositor surfaces. I guess, like you said, maybe we fix the compositor surface case first and then consider the content tile case and see if we can reproduce it.
Would it be helpful (and maybe get rid of the hack I mentioned above) if we had the complete state of tiles (position / size / dirty region) from last frame stored / made available? And then we could perhaps do better diffs to find the exact dirty rect? Or is that orthogonal to the problem here?
Assignee | ||
Comment 21•3 years ago
|
||
(In reply to Glenn Watson [:gw] from comment #20)
I know that we do skip partial present if any tiles have been added or removed, and this is a hack / workaround to deal with not knowing the previous state of that tile so we just skip partial in those cases
Interesting. hopefully this prevents the thing I was worried about happening from happening. Perhaps this hack can be extended to deal with the compositor surface case as well, or else I can just land the fix I already have. Will look in to it a bit.
In any case, I would not be at all surprised if there are some similar issues with tiles as compositor surfaces. I guess, like you said, maybe we fix the compositor surface case first and then consider the content tile case and see if we can reproduce it.
Sounds good.
Would it be helpful (and maybe get rid of the hack I mentioned above) if we had the complete state of tiles (position / size / dirty region) from last frame stored / made available? And then we could perhaps do better diffs to find the exact dirty rect? Or is that orthogonal to the problem here?
Yeah, sounds like that could potentially be used to fix this bug and remove the hack above. But it wouldn't be a prerequisite, we can fix this in the meantime without. Thanks Glenn!
Assignee | ||
Comment 22•3 years ago
|
||
When using webrender's draw compositor with partial present enabled,
we treat the entire area of all promoted compositor surfaces as
dirty. However, if a compositor surface is reduced in size this is
buggy - the new area of the surface will be treated as dirty, but
the additional area it used to occupy will not be. This means that
area may not be updated properly when compositing, leaving stale
content behind.
To fix this, keep track of the ItemUids of primitives that were
promoted to compositor surfaces for each sub slice. If the Uid
changes then set dirty_rects_are_valid to false, ensuring we do a
full-composite on the next frame.
This is perhaps overly conservative in some respects, but it solves
this correctness bug whilst at the same time still allowing partial
present for the common case where there are no changes to the number
or position of compositor surface and just their contents needs to be
updated.
Updated•3 years ago
|
Updated•3 years ago
|
Comment 23•3 years ago
|
||
Comment 24•3 years ago
|
||
bugherder |
Updated•3 years ago
|
Updated•3 years ago
|
Updated•3 years ago
|
Description
•