[NVIDIA] Crash in [@ XDisplayString] from VAAPIIsSupported: Consider blocking vdpau_drv_video.so from being loaded
Categories
(Core :: Widget: Gtk, defect)
Tracking
()
Tracking | Status | |
---|---|---|
firefox-esr91 | --- | unaffected |
firefox-esr102 | --- | disabled |
firefox98 | --- | unaffected |
firefox99 | --- | disabled |
firefox100 | --- | disabled |
firefox101 | --- | disabled |
firefox102 | --- | disabled |
firefox103 | --- | fixed |
People
(Reporter: gcp, Assigned: rmader)
References
(Blocks 1 open bug, Regression)
Details
(Keywords: crash, regression)
Crash Data
Attachments
(1 file)
(deleted),
text/x-phabricator-request
|
Details |
Crash report: https://crash-stats.mozilla.org/report/index/8078454a-bf79-45bb-be8c-0624e0220307
Reason: SIGSEGV / SEGV_MAPERR
Top 10 frames of crashing thread:
0 libX11.so.6 XDisplayString
1 vdpau_drv_video.so __vaDriverInit_1_13
2 libva.so.2 va_infoMessage
3 libva.so.2 vaInitialize
4 libxul.so mozilla::widget::VAAPIIsSupported widget/gtk/VAAPIUtils.cpp:151
5 libxul.so gfxPlatformGtk::InitVAAPIConfig gfx/thebes/gfxPlatformGtk.cpp:260
6 libxul.so gfxPlatformGtk::gfxPlatformGtk gfx/thebes/gfxPlatformGtk.cpp:115
7 libxul.so gfxPlatform::Init gfx/thebes/gfxPlatform.cpp:903
8 libxul.so gfxPlatform::GetPlatform gfx/thebes/gfxPlatform.cpp:465
9 libxul.so gfxPlatform::InitializeCMS gfx/thebes/gfxPlatform.cpp:2083
Reporter | ||
Comment 1•3 years ago
|
||
Bit of a guess as to the regressor. Was looking for fallout from bug 1129492.
Reporter | ||
Updated•3 years ago
|
Reporter | ||
Updated•3 years ago
|
Updated•3 years ago
|
Updated•3 years ago
|
Comment 3•3 years ago
|
||
:stransky, since you are the author of the regressor, bug 1751987, could you take a look?
For more information, please visit auto_nag documentation.
Comment 4•3 years ago
|
||
Set release status flags based on info from the regressing bug 1751987
Comment 5•3 years ago
|
||
This is a problem with NVIDIA proprietary drivers.
I see the same crashes from RDD process:
https://crash-stats.mozilla.org/report/index/8b6a51fe-5d6c-4e24-9b69-f8a360220313
Comment 6•3 years ago
|
||
Yes, we check the VA-API status only when it's enabled by StaticPrefs::media_ffmpeg_vaapi_enabled() here so it's disabled by default.
Comment 7•3 years ago
|
||
btw. This may be reported to NVIDIA proprietary drivers bugtracker - no idea why it open X display connection from vaDriverInit, we use DRM VA-API device for decode, not X11 one. Maybe some bug in VA-API -> vdpau bridge.
Reporter | ||
Comment 8•3 years ago
|
||
I don't think a disabled feature should be an S2 bug. Reducing until we ship.
Comment 9•3 years ago
|
||
It might be this one: https://salsa.debian.org/multimedia-team/attic/vdpau-video/-/blob/63450ffea86143d418c6e83cb8d2828d3a7beb25/src/vdpau_driver.c#L188
const char * const x11_dpy_name = XDisplayString(driver_data->x11_dpy);
https://bugs.archlinux.org/task/72241#comments
vaGetDisplayDRM() doesn't fill ->x11_dpy
VAAPI should be blocked for vdpau_drv_video.so.
vdpau_drv_video.so is deprecated and has been removed from Debian.
Debian Buster (oldstable) was the last release that had a package for it: https://packages.debian.org/oldstable/vdpau-va-driver
https://tracker.debian.org/pkg/vdpau-video
https://salsa.debian.org/multimedia-team/attic/vdpau-video
Updated•2 years ago
|
Updated•2 years ago
|
Assignee | ||
Comment 10•2 years ago
|
||
This is a classic "driver so broken that touching it risks crashing everything". That historically the role for glxtest
, so I think what we should do here is move VAAPIIsSupported()
and its dependencies (VAAPIGetAcceleratedProfiles()
and VAAPIGetAcceleratedProfilesInternal()
) there. I only see them used for the initial test, so we don't seem to need them anywhere else.
That will still make FF fall back to software rendering, but at least the browser starts.
Assignee | ||
Comment 11•2 years ago
|
||
Some VAAPI drivers are so broken that trying to use them
crashes Firefox. This is nothing entirely new for GPU drivers
and which is why we have glxtest
.
Thus move the test from VAAPIUtils
there. This has the
additional benefit of only doing the test once.
Updated•2 years ago
|
Comment 12•2 years ago
|
||
IIUC, your patch would cause the crash (at a different place) even for Nvidia users who haven't force-enabled VAAPI. (?)
At the moment, VAAPIIsSupported()
is only called if EGL is used and if the VAAPI feature is enabled by whitelist rule (Intel/AMD).
It's not enabled on devices that are recognized as Nvidia.
This crash can only occur
- if an Nvidia user force-enables VAAPI while having deprecated vdpau_drv_video.so installed.
- in case there is a so far unknown bug that Intel or AMD are detected as GPU but the DRM device is Nvidia.
Assignee | ||
Comment 13•2 years ago
|
||
Hm, that's indeed a valid point. however we'd at least be consistent: if a system has EGL drivers installed that crash, we'll also not wait for the GfxInfo blocklist. IMHO systems with crashing drivers should always be seen as broken and get software rendering, and we shouldn't try everything we can to make things work. Hard call though.
Comment 14•2 years ago
|
||
I don't think we should do this.
The crash here comes from RDD process which is designed for it - when it crashes we just restart it. It enables only approved and potentially working drivers which reduces crash probability. It also has strong sadbox which is different than glxtest environment so it's not guaranteed that glxtest pass will mean RDD pass (Bug 1771382).
I think the better solution is to track RDD crashes and disable VA-API entirely after some initial crashes (2-3?) - Bug 1762544.
Comment 15•2 years ago
|
||
(In reply to Martin StrΓ‘nskΓ½ [:stransky] (ni? me) from comment #14)
I don't think we should do this.
The crash here comes from RDD process which is designed for it - when it crashes we just restart it. It enables only approved and potentially working drivers which reduces crash probability. It also has strong sadbox which is different than glxtest environment so it's not guaranteed that glxtest pass will mean RDD pass (Bug 1771382).
I think the better solution is to track RDD crashes and disable VA-API entirely after some initial crashes (2-3?) - Bug 1762544.
Hm, the sandbox argument is invalid as we run VAAPIIsSupported() from parent process which is not sandboxed too.
Comment 16•2 years ago
|
||
Robert, is is possible to check VA-API in glxtest according to nsIGfxInfo::FEATURE_VAAPI status?
Updated•2 years ago
|
Assignee | ||
Comment 17•2 years ago
|
||
(In reply to Martin StrΓ‘nskΓ½ [:stransky] (ni? me) from comment #16)
Robert, is is possible to check VA-API in glxtest according to nsIGfxInfo::FEATURE_VAAPI status?
Tough, given that glxtest is needed to determine on which hardware we're running. However, I'm looking into letting the child process fork again before doing the VA-API check so a crash there doesn't impact the rest.
Comment 18•2 years ago
|
||
(In reply to Robert Mader [:rmader] from comment #17)
(In reply to Martin StrΓ‘nskΓ½ [:stransky] (ni? me) from comment #16)
Robert, is is possible to check VA-API in glxtest according to nsIGfxInfo::FEATURE_VAAPI status?
Tough, given that glxtest is needed to determine on which hardware we're running. However, I'm looking into letting the child process fork again before doing the VA-API check so a crash there doesn't impact the rest.
Yeah, AFAIK NVIDIA is supported now so we can't throw away WebRender on NVIDIA just because it crashes on VA-API init which is disabled by default anyway.
Updated•2 years ago
|
Assignee | ||
Comment 19•2 years ago
|
||
Updated the patch to do the VA-API test in an own process, logging appropriate errors in case of crashes. Try: https://treeherder.mozilla.org/jobs?repo=try&revision=dc471f57e006cb55c1604a4b27b499db75292cfe
Comment 20•2 years ago
|
||
Pushed by robert.mader@posteo.de: https://hg.mozilla.org/integration/autoland/rev/6d7485c81e90 Move VA-API test into glxtest, r=stransky
Comment 21•2 years ago
|
||
bugherder |
Updated•2 years ago
|
Updated•2 years ago
|
Updated•2 years ago
|
Updated•2 years ago
|
Description
•