[wayland] Crash wl_surface@76: error 2: Buffer size (170x113) is not divisible by scale (2)
Categories
(Core :: Widget: Gtk, defect, P3)
Tracking
()
Release | Tracking | Status |
---|---|---|
firefox109 | --- | fixed |
People
(Reporter: luis.pabon, Assigned: stransky)
References
(Blocks 1 open bug)
Details
Attachments
(4 files)
User Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:107.0) Gecko/20100101 Firefox/107.0
Steps to reproduce:
- Firefox nightly snap 107.0a1 (2022-10-13)
- Ubuntu 22.10
- Sway (master)
I have 3 outputs: laptop's built in 4k panel (scale 2), external 1080p display (scale 1) and another external 900p display (scale 1)
I've been having a lot of instability while moving tabs on Firefox nightly lately.
If I place the firefox window on the scale 2 output, I can move tabs around without issue. If I move firefox to a scale 1 output and then I try to do the same, firefox crashes.
There doesn't seem to be any correlation to which output firefox initially booted into and the crashes. Just which output I'm using.
Actual results:
https://crash-stats.mozilla.org/report/index/bp-f5058e04-5d12-4dbe-bf47-9f5630221018
Expected results:
Should've worked
Reporter
Comment 1•2 years ago
On firefox 106 beta I also get crashes, but when moving the screen from output to output or moving a PiP window from output to output:
https://crash-stats.mozilla.org/report/index/ae60fb60-72bd-41f5-b069-1d2b20221018
The MOZ_CRASH reason seems to be the same in both cases though, something like:
wl_surface@47: error 2: Buffer size (1920x1049) is not divisible by scale (2)
Comment 2•2 years ago
The Bugbug bot thinks this bug should belong to the 'Core::Widget: Gtk' component, and is moving the bug to that component. Please correct in case you think the bot is wrong.
Reporter
Comment 3•2 years ago
Reporter
Comment 4•2 years ago
Crash report for that debug file I just uploaded: https://crash-stats.mozilla.org/report/index/e3250020-c4dc-4c46-b913-5ac590221018
Just to point out that the firefox beta installation I have is from deb, not snap. Doesn't seem to be related to the installation medium.
Comment 6•2 years ago
From my duplicate https://bugzilla.mozilla.org/show_bug.cgi?id=1800214:
It's also reproducible on a clean profile.
I see the crash reason is wl_surface@49: error 2: Buffer size (6450x2529) is not divisible by scale (2)
This means that Firefox is setting the scale for a surface to 2 (i.e. it says this is scaled up 2x), but the buffer size is not divisible by 2. This pretty much sounds like a bug in Firefox; the data it is sending to the compositor is inconsistent.
I've no idea why it only triggers with two outputs. It also doesn't matter on which output I run Firefox.
It should be noted that my outputs have different scales; one is 2x and the other is 3x. Note that the values above are divisible by 3. Maybe Firefox is only considering the scale of the larger monitor when rendering, but then marking surfaces with the scale of the other? I admit that I'm guessing at this point.
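The rule behind the crash reason can be sketched in a few lines. This is only an illustration of the divisibility check, not the actual wlroots or Sway code; the function name `check_buffer_scale` is hypothetical, and the message simply mirrors the crash reasons quoted in this bug.

```python
from typing import Optional


def check_buffer_scale(width: int, height: int, scale: int) -> Optional[str]:
    """Return an error message if the buffer size is not a multiple of scale."""
    if width % scale != 0 or height % scale != 0:
        return f"Buffer size ({width}x{height}) is not divisible by scale ({scale})"
    return None


# Buffer sizes taken from the crash reasons reported in this bug, all sent with scale 2.
for size in [(170, 113), (1920, 1049), (6450, 2529)]:
    print(check_buffer_scale(size[0], size[1], 2))

# The 6450x2529 buffer passes the check when it is declared with scale 3.
print(check_buffer_scale(6450, 2529, 3))
```

All three reported sizes have an odd height, which is why each of them fails with scale 2.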
Comment 7•2 years ago
This USED to work on some compositors, but strictly speaking, this was a bug in the compositor; passing a surface with a size that is not divisible by the scale is a protocol violation.
This is rather intuitive when you think about it: you can't have a surface that's 200x199 and say "this has been scaled to 2x", because that means that the unscaled size should be 100x99.5px, where pixels must be integers.
It seems Firefox currently uses the largest scale of all outputs to render, but it should probably use the scale of the output that the window has entered anyway. Rendering a 3x window on a 2x display is pointless (and the compositor has to do the downscaling).
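To make the two strategies concrete, here is a small sketch. It assumes a simplified model in which each output has an integer scale and the client knows which outputs a window has entered; the `Output` class and both function names are hypothetical and do not come from Firefox. Taking the maximum over the entered outputs (for a window that straddles two of them) is also an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Output:
    name: str   # hypothetical label, for illustration only
    scale: int  # integer output scale


def scale_from_all_outputs(outputs: Iterable[Output]) -> int:
    """Strategy suspected above: use the largest scale of every connected output."""
    return max(o.scale for o in outputs)


def scale_from_entered_outputs(entered: Iterable[Output], fallback: int = 1) -> int:
    """Suggested strategy: only consider outputs the window has entered."""
    return max((o.scale for o in entered), default=fallback)


first = Output("scale-2 display", 2)
second = Output("scale-3 display", 3)

print(scale_from_all_outputs([first, second]))  # scale used for every window
print(scale_from_entered_outputs([first]))      # scale for a window shown only on the 2x display
```

With the second strategy, a window shown only on the 2x display is rendered at 2x, so the buffer and the declared scale agree.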
Assignee
Comment 8•2 years ago
We know the scale for every active monitor/window, so we can align the surface size to the scale. We don't do that now because Mutter and KWin (the major Wayland compositors) don't enforce it, so it's a TODO item here.
We may add an assert to our code to make sure we follow this rule.
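As a rough illustration of what aligning a size to the scale could look like, the sketch below rounds each dimension down to the nearest multiple of the scale and then asserts the rule. The name `align_to_scale` is hypothetical, and rounding down (rather than up) is an arbitrary choice made for this example; it is not taken from the Firefox source.

```python
from typing import Tuple


def align_to_scale(width: int, height: int, scale: int) -> Tuple[int, int]:
    """Round each dimension down to the nearest multiple of scale."""
    aligned = (width - width % scale, height - height % scale)
    # The kind of assertion mentioned above: the result must satisfy the protocol rule.
    assert aligned[0] % scale == 0 and aligned[1] % scale == 0
    return aligned


print(align_to_scale(170, 113, 2))
print(align_to_scale(1920, 1049, 2))
```

Rounding down drops at most `scale - 1` pixels per dimension; rounding up would instead require padding the buffer.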
Comment 9•2 years ago
I believe yet-to-be-released Mutter versions do enforce this, so it might just be a matter of time. This is enforced by wlroots (sway's underlying library), so KWinFT will have the same issue as dependencies update.
Aside from this issue, keep in mind that rendering at the largest possible scale is a waste of resources. Note that the window in the above example is rendered at 6450x2529, for a window on a 3440x1400 screen. The CPU and memory consumption at that scale are not minor.
Assignee
Comment 10•2 years ago
(In reply to Hugo Osvaldo Barrera from comment #9)
Aside from this issue, keep in mind that rendering at the largest possible scale is a waste of resources. Note that the window in the above example is rendered at 6450x2529, for a window on a 3440x1400 screen. The CPU and memory consumption at that scale are not minor.
Window size is not set by widget code, widget renders what Firefox/layout sends. Not sure if that's some monitor size reporting bug or so?
Anyway, looking forward to seeing an updated Mutter so we can fix that.
Reporter
Comment 11•2 years ago
The main problem is that Firefox (and other apps with the same issue like mpv) becomes unusable on Sway if you have a mixed dpi display setup, as it's constantly crashing when moving things around.
Comment 12•2 years ago
Not sure if that's some monitor size reporting bug or so?
It's not, I'll try rephrasing the issue here: the window size in this case is 2150x843. This display has a scale of 2. So the surface area should have been either 2150x843 (with scale=1) or 4300x1686 (with scale=2).
It could also be larger (e.g. 6450x2529 with scale=3) and the compositor would downscale it. That's kinda wasteful, but valid.
For reference, from wl_surface::set_buffer_scale:
Note that if the scale is larger than 1, then you have to attach a buffer that is larger (by a factor of scale in each dimension) than the desired surface size.
So, given that Firefox is calling set_buffer_scale with value 2, the buffer should be 4300x1686, and not 6450x2529.
My other monitor (which was not the output on which Firefox was rendering) has a scale of 3, and it seems that Firefox is using that value.
If you multiply 2150x843 by 3, you get 6450x2529. Firefox is scaling surfaces up by X, where X=3 is the largest scale of all available displays, but it is reporting that it has scaled up the surfaces by a scale equal to the output's scale (in this case, 2). These values are inconsistent. Firefox is sending a buffer of 6450x2529 with scale=2, so the surface size works out to 3225x1264.5px, a non-integer size. This is a protocol violation for a reason, and swaywm disconnects the client when it receives this invalid input. Mutter seems to be more lenient and works around it (there's likely some visual artefact though, since the numbers simply don't add up).
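The arithmetic above can be checked with exact fractions. The helper names `buffer_size` and `surface_size` are hypothetical and exist only for this worked example.

```python
from fractions import Fraction
from typing import Tuple


def buffer_size(logical: Tuple[int, int], render_scale: int) -> Tuple[int, int]:
    """Buffer dimensions when a logical size is rendered at render_scale."""
    return (logical[0] * render_scale, logical[1] * render_scale)


def surface_size(buffer: Tuple[int, int], declared_scale: int) -> Tuple[Fraction, Fraction]:
    """Surface dimensions implied by a buffer and the scale declared for it."""
    return (Fraction(buffer[0], declared_scale), Fraction(buffer[1], declared_scale))


logical = (2150, 843)
rendered = buffer_size(logical, 3)  # 6450x2529, rendered with the 3x scale

for declared in (3, 2):
    width, height = surface_size(rendered, declared)
    print(declared, float(width), float(height), height.denominator == 1)
```

Declared with scale 3, the buffer maps back to 2150x843. Declared with scale 2, the height becomes 2529/2 = 1264.5, which is not an integer.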
This results in two big problems:
- As mentioned above, it renders something gigantic which the compositor then has to downscale. This is problematic but non-fatal.
- The main topic at hand in this issue: Firefox says "here's a 6450x2529 surface which is scale 2x", but 2529 is CLEARLY not divisible by 2. The input is invalid, and swaywm rejects this invalid input.
This bug is fatal. If the window has a size Y such that Y/3 is an integer but Y/2 is not, Firefox crashes immediately (in this case, 3 and 2 are the scales of each of my displays). It is an EXTREMELY annoying bug, because it means that I basically have to choose between running Firefox or using two displays; I cannot do both.
To summarise this in shorter terms:
- Firefox scales the buffer by 3x.
- Firefox tells the compositor that the buffer is scaled up by 2x.
- Firefox sends the buffer data.
- The numbers don't make sense, and the compositor kills the client because the client is sending bogus data.
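The steps above, and the sizes for which they fail, can be sketched as follows. The function name `is_rejected` is hypothetical, and the default scales (3 for rendering, 2 for the declared value) are taken from this particular setup. Since the buffer dimension is 3 times the logical dimension, it is odd exactly when the logical dimension is odd, so under these assumptions any window with an odd width or height triggers the error.

```python
def is_rejected(logical_w: int, logical_h: int,
                render_scale: int = 3, declared_scale: int = 2) -> bool:
    """True if the buffer would violate the divisibility rule for the declared scale."""
    # Step 1: the buffer is rendered at render_scale.
    buffer_w = logical_w * render_scale
    buffer_h = logical_h * render_scale
    # Steps 2-4: the buffer is declared with declared_scale and the compositor checks it.
    return buffer_w % declared_scale != 0 or buffer_h % declared_scale != 0


print(is_rejected(2150, 843))  # True: 2529 is odd
print(is_rejected(2150, 844))  # False: 6450x2532 is divisible by 2
```

This would explain why the crash appears to depend on the exact window size: resizing by a single pixel flips the result.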
Anyway, looking forward to seeing an updated Mutter so we can fix that.
The fact that Mutter accepts this input is really non-standard behaviour. It's being lenient, ignoring the error, and doing something other than what the spec specifies (maybe it's just discarding a column of pixels? maybe it's overflowing somewhere?). The result should have visual artefacts, because the math just doesn't add up.
Why is Mutter's behaviour here a blocker for this issue?
Window size is not set by widget code, widget renders what Firefox/layout sends.
I don't quite understand this statement.
Comment 14•2 years ago
I thought I'd heard of this same issue on Mutter. It turns out the non-standard behaviour is merely a bug, not by design.
The bug has been fixed in https://gitlab.gnome.org/GNOME/mutter/-/merge_requests/2188. With this change, Firefox also crashes on Mutter.
Comment 15•2 years ago
Any update on this, Martin? We'll release Sway 1.8 soon, which crashes Firefox because of this bug.
Assignee
Comment 16•2 years ago
Will test the Mutter patch. It's difficult for me to use Sway, or even to configure two monitors with different scales.
Comment 17•2 years ago
I'm not sure how you're testing sway, but keep in mind that you can run it nested inside another compositor. It'll run with a virtual output, and render as a window on the parent compositor.
You can use WLR_WL_OUTPUTS=2 to have it run with multiple displays. Each "display" will render as a separate window on the parent compositor. For this particular case, you can configure each output to have a different scale via the configuration file, e.g.:
$ cat sway.conf
exec foot # run a terminal
output WL-1 scale 2
output WL-2 scale 3
$ WLR_WL_OUTPUTS=2 sway --config sway.conf
See also: https://gitlab.freedesktop.org/wlroots/wlroots/-/blob/master/docs/env_vars.md
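For repeated testing with different scale combinations, the configuration above could be generated by a small script. This is a sketch only: the helper names are hypothetical, and it assumes that the nested outputs are named WL-1, WL-2, and so on in order, as in the example above. It builds the configuration text and the command line but does not launch Sway itself.

```python
from typing import Dict, List, Sequence, Tuple


def sway_config(scales: Sequence[int]) -> str:
    """Build a minimal Sway config with one nested output per scale."""
    lines = ["exec foot # run a terminal"]
    for index, scale in enumerate(scales, start=1):
        lines.append(f"output WL-{index} scale {scale}")
    return "\n".join(lines) + "\n"


def sway_command(config_path: str, output_count: int) -> Tuple[Dict[str, str], List[str]]:
    """Return the extra environment variables and argv for a nested Sway session."""
    return ({"WLR_WL_OUTPUTS": str(output_count)}, ["sway", "--config", config_path])


print(sway_config([2, 3]), end="")
print(sway_command("sway.conf", 2))
```

The returned environment and argument list can be passed to `subprocess.run` (merged with `os.environ`) once the configuration text has been written to the given path.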
Comment 18•2 years ago
Yeah, the above should be enough to reproduce the bug. Additionally, here are instructions to build Sway: https://github.com/swaywm/sway/wiki/Development-Setup#compiling-as-a-subproject
Assignee
Comment 19•2 years ago
I used mutter with https://gitlab.gnome.org/GNOME/mutter/-/merge_requests/2188 applied and that reliably crashes Firefox.
Assignee
Comment 20•2 years ago
Assignee
Comment 21•2 years ago
Depends on D163696
Assignee
Comment 22•2 years ago
We need to return the correct EGLWindow from moz_container_wayland_get_egl_window(), with the correct scale/size, and also keep the EGLWindow up to date. This patch:
- Implements moz_container_wayland_egl_window_needs_size_update() and uses it in nsWindow::SetEGLNativeWindowSize(). This avoids redundant moz_container_wayland_egl_window_set_size()/moz_container_wayland_set_scale_factor() calls, which may lead to resize callback calls to Mesa.
- If wl_container::eglwindow is present, checks its size/scale in moz_container_wayland_get_egl_window() and updates the size/scale if needed.
- Uses a single nsIntSize parameter instead of width/height pairs in some moz_container_* functions.
- Asserts when gtk_widget_get_window(container) returns null.
Depends on D163697
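The idea of checking whether an update is needed before resizing can be modelled as below. This is a simplified Python model, not the C++ patch itself: the `EglWindowState` class, its fields, and the `resize_calls` counter are all hypothetical and only illustrate how redundant size/scale updates can be skipped.

```python
from dataclasses import dataclass


@dataclass
class EglWindowState:
    width: int
    height: int
    scale: int
    resize_calls: int = 0  # counts how often the (expensive) resize path runs

    def needs_size_update(self, width: int, height: int, scale: int) -> bool:
        """True only if the requested size or scale differs from the current one."""
        return (self.width, self.height, self.scale) != (width, height, scale)

    def set_size(self, width: int, height: int, scale: int) -> bool:
        """Apply the new size/scale, skipping the call when nothing changed."""
        if not self.needs_size_update(width, height, scale):
            return False
        self.width, self.height, self.scale = width, height, scale
        self.resize_calls += 1
        return True


state = EglWindowState(4300, 1686, 2)
print(state.set_size(4300, 1686, 2))  # unchanged request, no resize
print(state.set_size(6450, 2529, 3))  # size and scale change together
print(state.resize_calls)
```

Updating the size and the scale in one step keeps the two values consistent with each other, which is the property the compositor checks.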
Comment 23•2 years ago
Comment 24•2 years ago
Thank you!
Comment 25•2 years ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/75c7055abbc7
https://hg.mozilla.org/mozilla-central/rev/105058a4927e
https://hg.mozilla.org/mozilla-central/rev/a8770c8bc4f0