Closed Bug 1815528 Opened 2 years ago Closed 2 years ago

force-enabled VAAPI with deprecated libva-vdpau-driver: RDD Crash in [@ XDisplayString]. Consider blocking vdpau_drv_video.so #4

Categories

(Core :: Widget: Gtk, defect, P3)

Firefox 111
Unspecified
Linux
defect

Tracking


RESOLVED WONTFIX
Tracking Status
firefox-esr102 --- unaffected
firefox109 --- unaffected
firefox110 --- unaffected
firefox111 --- disabled
firefox112 --- disabled

People

(Reporter: dmeehan, Assigned: stransky)

References

(Blocks 2 open bugs, Regression)

Details

(Keywords: crash, regression)

Crash Data

Crash report: https://crash-stats.mozilla.org/report/index/daab842d-17df-4baf-98c3-c922b0230207

Reason: SIGSEGV / SEGV_MAPERR

Top 10 frames of crashing thread:

0  libX11.so.6  XDisplayString  /usr/src/debug/libx11/libX11-1.8.3/src/Macros.c:119
1  vdpau_drv_video.so  __vaDriverInit_1_13  
2  libva.so.2  <.text ELF section in libva.so.2.1700.0>  
3  libva.so.2  vaInitialize  
4  libxul.so  mozilla::FFmpegVideoDecoder<46465650>::CreateVAAPIDeviceContext  dom/media/platforms/ffmpeg/FFmpegVideoDecoder.cpp:231
5  libxul.so  mozilla::FFmpegVideoDecoder<46465650>::InitVAAPIDecoder  dom/media/platforms/ffmpeg/FFmpegVideoDecoder.cpp:295
6  libxul.so  mozilla::FFmpegVideoDecoder<46465650>::Init  dom/media/platforms/ffmpeg/FFmpegVideoDecoder.cpp:431
7  libxul.so  mozilla::MediaDataDecoderProxy::Init const  dom/media/platforms/wrappers/MediaDataDecoderProxy.cpp:18
7  libxul.so  mozilla::detail::ProxyFunctionRunnable<mozilla::MediaDataDecoderProxy::Init  xpcom/threads/MozPromise.h:1674
8  libxul.so  mozilla::TaskQueue::Runner::Run  xpcom/threads/TaskQueue.cpp:259

Timing matches with the landing of bug 1799747.

This is the 4th report for the same bug (bug 1787182, bug 1758473, bug 1777927).

(Martin Stránský [:stransky] from bug 1799747 comment #14)

(In reply to Darkspirit from comment #13)

Wouldn't comment 11 mean that vdpau_drv_video.so with force-enabled VAAPI wouldn't crash in glxtest anymore and instead in the RDD process?

Yes, exactly.

My understanding:

  • nvidia-vaapi-driver does not crash and codec detection works because glxtest does not run in any sandbox. (Or is it broken for you?)
  • Only deprecated libva-vdpau-driver crashes in XDisplayString.

Yes, that's correct.

If the user has an Nvidia GPU and the presence of vdpau_drv_video.so is detected with dl_iterate_phdr (bug 1804974 comment 2),
vaapitest() would return early,
childvaapitest() could not crash,
glxtest would not print VAAPI_SUPPORTED true,
and mIsVAAPISupported would be false in Gfxinfo.cpp:
My question was whether this code would then reliably force-disable VAAPI and ignore media.hardware-video-decoding.force-enabled=true, so that it's impossible to force-enable VAAPI as long as vdpau_drv_video.so is present on the Nvidia system.

That's perhaps possible. But I don't see the value in doing such extensive diagnostics for NVIDIA now. It's still an experimental and rather unstable solution which may crash or be broken for various reasons. Even a working vaapitest() doesn't mean that decoding works, so vaapitest() doesn't give much value here (the only advantage is correct state on the about:support page).

Regressed by: 1799747
Summary: Crash in [@ XDisplayString] → force-enabled VAAPI with deprecated libva-vdpau-driver: RDD Crash in [@ XDisplayString]. Consider blocking vdpau_drv_video.so #4

:stransky, since you are the author of the regressor, bug 1799747, could you take a look?

For more information, please visit auto_nag documentation.

Flags: needinfo?(stransky)
Keywords: regression

This is expected and it's not a regression. We just moved the crash from every browser start to the place where VAAPI is actually used. But it still crashes, of course.

Flags: needinfo?(stransky)

The bug is linked to a topcrash signature, which matches the following criterion:

  • Top 10 desktop browser crashes on nightly

:stransky, could you consider increasing the severity of this top-crash bug?


Flags: needinfo?(stransky)
Keywords: topcrash

It's hardly a topcrash. According to https://crash-stats.mozilla.org/signature/?product=Firefox&signature=XDisplayString it comes from 13 installations. We didn't catch the crashes before because we don't get crash reports for the glxtest process, which is where it crashed before.
Note that it crashes every time a video clip is played.

Flags: needinfo?(stransky)
Priority: -- → P3

We encountered this RDD topcrash (few users, massive reports) before and that's why a VAAPI check has been put into glxtest.
Instead of making the check reliable, bug 1799747 removed it, and Fedora has even removed the crash reporter.

(In reply to Darkspirit from comment #7)

We encountered this RDD topcrash (few users, massive reports) before and that's why a VAAPI check has been put into glxtest.

Crashes in glxtest are hidden from Firefox's perspective, but they flood the system with coredumps and we don't have any info about them.

Instead of making the check reliable bug 1799747 removed it

Unfortunately I don't know how to make it reliable yet.

and Fedora has even removed the crash reporter.

Fedora has disabled the crash reporter due to Python build issues. We'll enable it again when that's fixed. Right now we use the Fedora crash tool (https://retrace.fedoraproject.org/faf/reports/).

Flags: needinfo?(stransky)

I'll try to use the example from https://gist.github.com/kinichiro/66191f27963c9efe25d0 to implement the check.

Assignee: nobody → stransky
Flags: needinfo?(stransky)
Flags: needinfo?(stransky)

We can't use dl_iterate_phdr() here. The driver library is opened during the vaInitialize() call, and the crash happens inside that same call when the driver's init function runs. dl_iterate_phdr() can't catch it because libva-vdpau-driver.so is not loaded before vaInitialize(), which is the call that crashes.
So there's no way to block libva-vdpau-driver.so from loading.
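
To illustrate the limitation, here is a minimal C++ sketch of how dl_iterate_phdr() scans the modules mapped into the current process. It is not taken from the Firefox tree, and ModuleSearch, CountMatch and CountLoadedModules are hypothetical names. The call only reports objects that are already loaded, so a scan made before vaInitialize() would presumably not see the driver library at all.

```cpp
#ifndef _GNU_SOURCE
#define _GNU_SOURCE  // dl_iterate_phdr() is a GNU extension declared in <link.h>
#endif
#include <link.h>

#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper state (not Firefox code), shared with the callback.
struct ModuleSearch {
  const char* needle;  // substring to look for in each module path
  int matches;         // number of loaded modules whose path contains it
};

// Invoked by dl_iterate_phdr() once per object currently mapped into the process.
static int CountMatch(struct dl_phdr_info* info, size_t /*size*/, void* data) {
  auto* search = static_cast<ModuleSearch*>(data);
  if (info->dlpi_name && std::strstr(info->dlpi_name, search->needle)) {
    ++search->matches;
  }
  return 0;  // returning zero tells dl_iterate_phdr() to keep iterating
}

// Returns how many already-loaded modules have `needle` in their path.
int CountLoadedModules(const char* needle) {
  assert(needle != nullptr);
  ModuleSearch search{needle, 0};
  dl_iterate_phdr(CountMatch, &search);
  return search.matches;
}
```

Under the behaviour described above, CountLoadedModules("vdpau_drv_video.so") called before vaInitialize() would return 0 even on an affected system, because libva opens the driver only inside that call.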

We could set LIBVA_DRIVER_NAME directly when NVIDIA/Wayland is present, but that's also an ugly hack.
As this crash happens on Wayland/NVIDIA, and you need to explicitly force-enable VAAPI and use a broken driver, I'm not going to investigate it further.
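
For reference, the LIBVA_DRIVER_NAME workaround mentioned above could look roughly like the following C++ sketch. This is an illustration only, not a proposed patch: PinVaDriver is a hypothetical helper, and the driver name a caller would pass (for example "nvidia" for nvidia-vaapi-driver) is an assumption. The variable would have to be set before vaInitialize() is called.

```cpp
#include <stdlib.h>  // setenv(), getenv() (POSIX)
#include <string.h>

#include <cassert>

// Hypothetical helper: pin libva's driver choice through the environment.
// Returns true if LIBVA_DRIVER_NAME ends up equal to driverName.
bool PinVaDriver(const char* driverName, bool overrideUser) {
  assert(driverName != nullptr);
  // With overwrite == 0, setenv() leaves an existing user setting untouched.
  if (setenv("LIBVA_DRIVER_NAME", driverName, overrideUser ? 1 : 0) != 0) {
    return false;
  }
  const char* effective = getenv("LIBVA_DRIVER_NAME");
  return effective != nullptr && strcmp(effective, driverName) == 0;
}
```

Passing overrideUser = false respects a value the user has already exported, which is one reason such an override is a hack: it either ignores the user's explicit choice or fails to protect against it.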

Flags: needinfo?(stransky)

Set release status flags based on info from the regressing bug 1799747

Sorry for removing the keyword earlier but there is a recent change in the ranking, so the bug is again linked to a topcrash signature, which matches the following criterion:

  • Top 10 desktop browser crashes on nightly


Keywords: topcrash

Based on the topcrash criteria, the crash signature linked to this bug is not a topcrash signature anymore.


Keywords: topcrash

Broken drivers, we can't do anything about it.

Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → WONTFIX