[Intel Arc A770] rendertexturehost tab crash without crash report (Nightly: WR/EGL/fluxbox X11/Intel, VAAPI force-enabled)
Categories
(Core :: Graphics, defect, P2)
Tracking
()
Tracking | Status | |
---|---|---|
firefox-esr102 | --- | unaffected |
firefox-esr115 | --- | affected |
firefox114 | --- | disabled |
firefox115 | --- | disabled |
firefox116 | --- | wontfix |
firefox117 | --- | wontfix |
firefox118 | --- | fix-optional |
People
(Reporter: zlice555, Assigned: sotaro)
References
(Regression)
Details
(Keywords: crash, regression)
Attachments
(6 files)
User Agent: Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/115.0
Steps to reproduce:
Browse misc sites like Twitch.tv or redfin.com
Actual results:
Random tab crashes. Got this by running from terminal. Didn't get a trace
[GFX1-]: unexpected remote texture size: Size(0,0) expected: Size(256,256)
[GFX1-]: Failed to get RenderTextureHost for extId:32844
[Parent 20328, IPC I/O Parent] WARNING: Message needs unreceived descriptors channel:7feb16998ee0 message-type:11665413 header()->num_handles:1 num_fds:0 fds_i:0: file /builds/worker/checkouts/gecko/ipc/chromium/src/chrome/common/ipc_channel_posix.cc:489
Exiting due to channel error.
Expected results:
No crash
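Because the tab dies without a crash report, the terminal output is the only evidence available. The following is a minimal sketch for filtering saved terminal output for the two graphics messages quoted above. The function name `find_gfx_markers` and the choice of markers are illustrative assumptions, not part of Firefox.

```python
import re

# Patterns copied from the terminal output in this report. The selection is an
# illustrative assumption, not an exhaustive list of Firefox log messages.
SIZE_MISMATCH = re.compile(
    r"unexpected remote texture size: Size\((\d+),(\d+)\) expected: Size\((\d+),(\d+)\)"
)
MISSING_HOST = re.compile(r"Failed to get RenderTextureHost for extId:(\d+)")


def find_gfx_markers(lines):
    """Return (kind, details) tuples for every line that matches a marker."""
    hits = []
    for line in lines:
        m = SIZE_MISMATCH.search(line)
        if m:
            got = (int(m.group(1)), int(m.group(2)))
            expected = (int(m.group(3)), int(m.group(4)))
            hits.append(("size_mismatch", {"got": got, "expected": expected}))
            continue
        m = MISSING_HOST.search(line)
        if m:
            hits.append(("missing_render_texture_host", {"ext_id": int(m.group(1))}))
    return hits
```

Feeding it the lines of a saved log (for example `open("firefox.log")`, a hypothetical file name) gives a quick list of the texture failures that preceded a tab crash.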
Comment 1•1 year ago
The Bugbug bot thinks this bug should belong to the 'Core::Widget: Gtk' component, and is moving the bug to that component. Please correct in case you think the bot is wrong.
Comment 2•1 year ago
Thanks for the report!
Can this crash be prevented by setting webgl.out-of-process.async-present.force-sync to true on about:config and restarting Firefox? (bug 1831548)
lol ya pretty sure that's it. force-sync true and i can look around for several minutes, turn it off and the tab crashes looking at houses on redfin almost instantly.
should be good to mark as dupe. thanks!
I guess I spoke too soon? Still getting tab crashes, just not as often. Will try to capture terminal output =/
Comment 5•1 year ago
(In reply to zlice from comment #4)
I guess I spoke too soon? Still getting tab crashes, just not as often. Will try to capture terminal output =/
To be absolutely sure: Please reboot after setting webgl.out-of-process.async-present.force-sync to true.
I have when I switch between true/false. (Obviously had to restart FF to run from terminal too)
Comment 7•1 year ago
Try rebooting Linux after the pref change in case the problem is somehow partly outside of Firefox.
Ahhh. Would a drop_caches 3 work?
echo 3 > /proc/sys/vm/drop_caches && sync
While testing other things (restarting X several times) I have noticed FF thinks it is still running, and opening a URI gives the "another process is running" message, even though I run FF once, close it and give it time before messing with things. That does require a full reboot iirc
Went a day without a crash so force-sync looks like it's fixing it for now. Thanks!
Reporter | ||
Comment 10•1 year ago
I saw that bug was resolved and disabled force-sync and seem to be getting tab crashes again.
116.0a1 (2023-06-16)
Reporter | ||
Comment 11•1 year ago
Still seems to be crashing on the same sites without the force-sync item set to true.
Not sure how dupes and bugs work, should I open another bug?
Comment 12•1 year ago
Set release status flags based on info from the regressing bug 1829052
:sotaro, since you are the author of the regressor, bug 1829052, could you take a look?
For more information, please visit BugBot documentation.
Assignee | ||
Comment 13•1 year ago
(In reply to zlice from comment #0)
Steps to reproduce:
Browse misc sites like Twitch.tv or redfin.com
The steps were not clear to me. I browsed Twitch.tv and redfin.com at random, but the crash did not happen. However, I could reproduce Bug 1839314, so I am going to look into Bug 1839314 for now.
Reporter | ||
Comment 14•1 year ago
It can take some time. Redfin is easier to reproduce for me, just zooming in/out or moving around with filters on houses seems to crash faster than twitch. It looks like the new bug has a similar reproduce step, just have to try several times. Thanks, I'll follow that!
Assignee | ||
Comment 15•1 year ago
Hi zlice, can you re-check if the problem is addressed with latest nightly? Thank you.
Reporter | ||
Comment 16•1 year ago
Sweet, that didn't take long lol. Ya still crashing, just applied updates and restarted nightly.
Comment 17•1 year ago
Set release status flags based on info from the regressing bug 1829052
Assignee | ||
Comment 18•1 year ago
Similar to Bug 1841380 Comment 11, while visiting redfin.com I saw cases where more than 2000 DrawTargetWebgls and Framebuffers were created. It did not cause an out-of-fd problem for me, but I wonder if it might cause the problem with some drivers.
Assignee | ||
Comment 19•1 year ago
The fd count easily became very large for me when pref widget.dmabuf.force-enabled was true.
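One way to watch the fd count described here is to read `/proc/<pid>/fd` on Linux. This is a minimal sketch, not what was used in this bug. The helper names are hypothetical, and matching `dmabuf` in the link target is an assumption about how the kernel labels dma-buf descriptors.

```python
import os


def count_open_fds(pid):
    """Count the entries in /proc/<pid>/fd (Linux only)."""
    return len(os.listdir(f"/proc/{pid}/fd"))


def count_dmabuf_fds(pid):
    """Count fds whose symlink target mentions 'dmabuf' (assumed label)."""
    fd_dir = f"/proc/{pid}/fd"
    total = 0
    for name in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            # The fd was closed between listdir() and readlink(); skip it.
            continue
        if "dmabuf" in target:
            total += 1
    return total
```

Calling both helpers periodically with the pid of a Firefox process while moving around redfin.com would show whether the count keeps growing.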
Assignee | ||
Comment 20•1 year ago
Hi zlice, can you check if the problem is addressed with latest nightly?
Reporter | ||
Comment 21•1 year ago
Ya =/ didn't take long for a crash. Saw big black/gray checkered squares for a split second this time before the tab crash page.
Built from https://hg.mozilla.org/mozilla-central/rev/196cda3a105202c8969a926a0637db0e0014c07d
Assignee | ||
Comment 22•1 year ago
Hi zlice, can you check if the following addresses the problem for you?
- Set pref gfx.canvas.accelerated.max-draw-target-count = 20 from about:config
- Restart Firefox.
Can you also check if the following addresses the problem for you?
- Set pref widget.dmabuf-webgl.enabled = false from about:config
- Restart Firefox.
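The two experiments above can also be applied through a `user.js` file in the profile directory instead of about:config. This is a minimal sketch using the pref names quoted in this comment. `format_user_pref` is a hypothetical helper, not a Firefox API.

```python
import json


def format_user_pref(name, value):
    """Format one user.js line for a boolean, integer or string pref."""
    if isinstance(value, bool):
        # bool must be checked before int because bool is a subclass of int.
        js_value = "true" if value else "false"
    elif isinstance(value, int):
        js_value = str(value)
    else:
        # json.dumps yields a double-quoted, escaped string literal.
        js_value = json.dumps(str(value))
    return f"user_pref({json.dumps(name)}, {js_value});"


# Each experiment is meant to be tested on its own, so print them separately.
print(format_user_pref("gfx.canvas.accelerated.max-draw-target-count", 20))
print(format_user_pref("widget.dmabuf-webgl.enabled", False))
```

Firefox reads `user.js` at startup, so a restart is still needed after changing the file.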
Reporter | ||
Comment 23•1 year ago
I turned off the previous force-sync fix and am having trouble getting it at all with a quick 2min test, let alone with either of those 2 options. I'll leave it off and see how it goes this week. If it pops up I'll set 1 of those.
Reporter | ||
Comment 24•1 year ago
I left these custom options off yesterday and had only 1 tab crash and it took a while to happen. Previously it was very easy to poke around redfin to a crash. After that crash I set dmabuf false and didn't see any more crashes throughout the day. I can't really say if dmabuf or draw-target-count have an effect since it seems like it was harder to trigger to begin with. I have draw-target-count at 20 right now, if I see any crashes I'll let you know.
Assignee | ||
Comment 25•1 year ago
Great! Thank you.
Reporter | ||
Comment 26•1 year ago
Did have a crash with draw-target-count at 20
Assignee | ||
Comment 27•1 year ago
Thank you for checking! Hmm, async RemoteTexture seems to hit an "out of file descriptors" problem on Intel Arc A770 Graphics with dmabuf enabled.
dmabuf can also be used for hardware video decoding, which might also be related to the problem.
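An "out of file descriptors" condition means a process has reached its RLIMIT_NOFILE soft limit. This is a minimal sketch of checking the headroom with Python's standard `resource` module. `fd_headroom` is a hypothetical helper, and by default it reads the limit of the calling process, not of Firefox.

```python
import resource


def fd_headroom(open_fds, soft_limit=None):
    """Return how many more fds can be opened, or None if unlimited.

    open_fds is a count obtained elsewhere (for example from /proc/<pid>/fd).
    When soft_limit is omitted, the calling process's own limit is used.
    """
    if soft_limit is None:
        soft_limit, _hard_limit = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft_limit == resource.RLIM_INFINITY:
        return None
    return soft_limit - open_fds
```

A result at or below zero would be consistent with the "unreceived descriptors" warning in the original report, although that link is only a hypothesis in this thread.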
Comment 28•1 year ago
The severity field is not set for this bug.
:bhood, could you have a look please?
For more information, please visit BugBot documentation.
Reporter | ||
Comment 29•1 year ago
I saw a comment on https://bugzilla.mozilla.org/show_bug.cgi?id=1839314 about nightly being fixed. But without any of the about:config variables set 117.0a1 (2023-07-29) is still crashing on redfin.
Assignee | ||
Comment 30•1 year ago
Hi zlice, can you check again if the problem is addressed with latest nightly? I wonder if Bug 1845697 addressed the problem. Thank you.
Assignee | ||
Comment 32•1 year ago
Thank you for checking. hmm.
Comment 33•1 year ago
No longer disabled since bug 1832480 (116). Bug 1777430 (115) also shipped hardware decoding, which might or might not be related.
(In reply to zlice from comment #31)
Still crashing =/
Can the crash be prevented by disabling VAAPI and restarting Firefox?
media.hardware-video-decoding.enabled=false
media.hardware-video-decoding.force-enabled=false
media.ffmpeg.vaapi.enabled=false
Reporter | ||
Comment 34•1 year ago
Still happens without vaapi
Reporter | ||
Comment 35•1 year ago
Actually had my auto-cfg set to vaapi true, need to re-test.
Reporter | ||
Comment 36•1 year ago
That was quick. Without vaapi there isn't a 'tab crash' but a clear freeze on redfin
(see attached)
the left is not loading or moving and the right scrolls instead of the left zooming. No pictures or loading taking place
Reporter | ||
Comment 37•1 year ago
Comment 38•1 year ago
Would setting webgl.threadsafe-gl.force-disabled to true and restarting Firefox help?
Reporter | ||
Comment 39•1 year ago
disable? not enable? is that similar to force-sync?
I can try later
Comment 40•1 year ago
(That feature/assumption is the opposite of what it says. If THREADSAFE_GL is enabled (default), different threads are used for WebRender and WebGL, which is problematic (bug 1845765, bug 1847822, GLX: bug 1777849 comment 21). If THREADSAFE_GL is disabled (default for Nouveau + proprietary Nvidia), then GL is used threadsafe by using the same thread for WebRender and WebGL.)
Reporter | ||
Comment 41•1 year ago
Got it. Did some short testing with
webgl.threadsafe-gl.force-disabled=true
media.hardware-video-decoding.enabled=true
media.hardware-video-decoding.force-enabled=false
media.ffmpeg.vaapi.enabled=true
didn't see anything weird happen. Leaving threadsafe disabled=true and I'll see if anything 'crashes' or I hit a page freeze
Reporter | ||
Comment 42•1 year ago
fwiw with those settings over the past few days i have noticed actual tab crashes that want to send a report, no freezes or non-report crashes.
Assignee | ||
Comment 43•1 year ago
:stransky, is there a case where hardware decoding with dmabuf consumes a lot of file descriptors?
Comment 44•1 year ago
(In reply to Sotaro Ikeda [:sotaro] from comment #43)
:stransky, is there a case where hardware decoding with dmabuf consumes a lot of file descriptors?
The code is optimized to close unused file descriptors so under normal conditions it should work fine (unless there's a bug somewhere).
As this is Intel(R) Arc(tm) hardware which is new and a bit rare I'd expect we may see a driver bug or so. The code is the same for all Intel devices but we don't see such reports for regular Intel ones.
zlice, can you test a different VA-API client such as mpv? Run
mpv --hwdec=vaapi test_clip
You can use a direct YT URL here.
You can also flip the media.ffmpeg.vaapi.force-surface-zero-copy pref to '1' so that Firefox and mpv use the same VA-API playback mode, and then test Firefox.
Thanks.
Comment 45•1 year ago
If you can reproduce it with mpv please file a bug at Mesa (https://gitlab.freedesktop.org/mesa/mesa/-/issues).
Reporter | ||
Comment 46•1 year ago
intel-gpu-top shows hwdec working with mpv and firefox.
i have filed bugs for intel's 'media-driver' for other issues before, but those were reproducible errors. and this started happening in firefox a bit before i posted this bug. before that everything was working fine and if there were crashes they were able to be sent to firefox with a bug-report-box popup. there haven't been any changes in ffmpeg or intel's 'media-driver' recently.
Reporter | ||
Comment 47•1 year ago
Reporter | ||
Comment 48•1 year ago
ofc the screenshot i took froze at the wrong spot, but the 'video' part has usage while playing back video w/ hwaccel
Assignee | ||
Comment 49•1 year ago
Hi zlice, can you check again if the problem still happens with latest nightly? I wonder if bug 1848171 might affect the problem.
Assignee | ||
Comment 51•1 year ago
Hmm, thank you for checking.
Assignee | ||
Comment 52•1 year ago
(In reply to Martin Stránský [:stransky] (ni? me) from comment #44)
As this is Intel(R) Arc(tm) hardware which is new and a bit rare I'd expect we may see a driver bug or so. The code is the same for all Intel devices but we don't see such reports for regular Intel ones.
It might be better to block DMABUF when Mesa is used with Intel Arc A-series GPUs.
Assignee | ||
Comment 53•1 year ago
Assignee | ||
Comment 54•1 year ago
Hi zlice, can you check again if the following build addresses the problem for you? And can you also check DMABUF is blocked in about:support?
https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/Wf17NQYrSNiuNjsVilKjIA/runs/0/artifacts/public%2Fbuild%2Ftarget.tar.bz2
https://treeherder.mozilla.org/jobs?repo=try&revision=22b9f9ba5219b5d88b8f42bdd0f82200343b70e0
Reporter | ||
Comment 55•1 year ago
Ended up doing the freezing thing on redfin like disabling vaapi did
attaching screenshots, looks like dma disabled in about:support and only webgl dma enabled in about:config
Reporter | ||
Comment 56•1 year ago
Reporter | ||
Comment 57•1 year ago
Assignee | ||
Comment 58•1 year ago
(In reply to zlice from comment #55)
Ended up doing the freezing thing on redfin like disabling vaapi did
attaching screenshots, looks like dma disabled in about:support and only webgl dma enabled in about:config
Thank you for checking! Can you explain more about the "like disabling vaapi did" part?
Reporter | ||
Comment 59•1 year ago
https://bugzilla.mozilla.org/show_bug.cgi?id=1835275#c36 the page freezes but there's no indication of a crashed tab, just blank squares where things should be and no loading or clicking actions
Comment 60•1 year ago
Is the tab crash reproducible with mozregression?
$ pip3 install mozregression
$ ~/.local/bin/mozregression --good 2023-04-01 --bad 2023-05-26 -P stdout -a https://twitch.tv -a https://redfin.com
Reporter | ||
Comment 61•1 year ago
i have swapped back to my amd gpu for the time being. not sure when/if i'll go back to the intel card as it seems they have very little linux priority (xe driver which may or may not have vaapi support apparently, vulkan sparse residency memory, av1 decode doesnt work still afaik)