Closed Bug 1733680 Opened 3 years ago Closed 3 years ago

[Linux/GPU] Crash at GPUProcessManager::SimulateDeviceReset()

Categories

(Core :: Graphics, defect)

Tracking

RESOLVED FIXED
95 Branch
Tracking Status
firefox95 --- fixed

People

(Reporter: stransky, Assigned: stransky)

References

(Blocks 1 open bug)

Attachments

(1 file)

Sometimes we crash in GPUProcessManager::SimulateDeviceReset() when the GPU process is enabled:

#8 0x00007ffff7f95a20 in <signal handler called> () at /lib64/libpthread.so.0
#9 mozilla::ipc::IProtocol::CanSend() const (this=0x0) at /raid/src/objdir/dist/include/mozilla/ipc/ProtocolUtils.h:251
#10 0x00007fffe6812f7a in mozilla::ipc::IProtocol::ChannelSend(IPC::Message*, IPC::Message*) (this=0x0, aMsg=0x7fffce88f430, aReply=0x7fffffff39c8)
at /raid/src/ipc/glue/ProtocolUtils.cpp:533
#11 0x00007fffe6b57b37 in mozilla::gfx::PGPUChild::SendSimulateDeviceReset(mozilla::gfx::GPUDeviceData*) (this=0x0, status=0x7fffffff3a80) at PGPUChild.cpp:723
#12 0x00007fffe7a976cf in mozilla::gfx::GPUProcessManager::SimulateDeviceReset() (this=0x7fffcde0ba80) at /raid/src/gfx/ipc/GPUProcessManager.cpp:476
#13 0x00007fffe7a97635 in mozilla::gfx::GPUProcessManager::ResetCompositors() (this=0x7fffcde0ba80) at /raid/src/gfx/ipc/GPUProcessManager.cpp:466
#14 0x00007fffeb8f144c in screen_composited_changed_cb(_GdkScreen*, void*) (screen=0x7ffff7828000, user_data=0x0) at /raid/src/widget/gtk/nsWindow.cpp:8061

It's because mGPUChild is null (this=0x0 in frame #11) while only mProcess is checked:

469	void GPUProcessManager::SimulateDeviceReset() {
470	  // Make sure we rebuild environment and configuration for accelerated
471	  // features.
472	  gfxPlatform::GetPlatform()->CompositorUpdated();
473	
474	  if (mProcess) {
475	    GPUDeviceData data;
476	    if (mGPUChild->SendSimulateDeviceReset(&data)) {  // <-- crashes here
477	      gfxPlatform::GetPlatform()->ImportGPUDeviceData(data);
478	    }
479	    OnRemoteProcessDeviceReset(mProcess);
480	  } else {
481	    OnInProcessDeviceReset(/* aTrackThreshold */ false);
482	  }
483	}

Two years ago there were bug 1314711, bug 1405812 and bug 1568291, which could kill the desktop session or log the user out.

This happens right after startup, when we get a screen_composited_changed_cb() event before the GPU process is initialized.
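
A minimal sketch of the kind of guard the bug title suggests, assuming the fix is simply to test mGPUChild before using it. This is not the landed patch: every type below (GPUChildMock, GPUDeviceDataMock, GPUProcessManagerMock, ResetPath) is a hypothetical stand-in so the example compiles without Gecko, and the real method returns void and also notifies gfxPlatform.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-ins for the Gecko types; none of these names exist in
// Firefox. They only model "process launched" vs. "IPC actor connected".
struct GPUDeviceDataMock {
  int resetCount = 0;
};

struct GPUChildMock {
  // Models the sync IPC call; always succeeds in this sketch.
  bool SendSimulateDeviceReset(GPUDeviceDataMock* aOut) {
    assert(aOut);
    aOut->resetCount = 1;
    return true;
  }
};

// Which branch was taken, returned only so the sketch can be checked.
enum class ResetPath { Remote, RemoteNotReady, InProcess };

class GPUProcessManagerMock {
 public:
  bool mProcess = false;                    // GPU process has been launched
  std::unique_ptr<GPUChildMock> mGPUChild;  // actor, may be set later
  int mImported = 0;                        // counts imported device data

  ResetPath SimulateDeviceReset() {
    if (mProcess) {
      // Assumed guard: the process can exist before its actor does, e.g.
      // when a composited-changed event arrives right after startup.
      if (!mGPUChild) {
        return ResetPath::RemoteNotReady;
      }
      GPUDeviceDataMock data;
      if (mGPUChild->SendSimulateDeviceReset(&data)) {
        mImported += data.resetCount;
      }
      return ResetPath::Remote;
    }
    return ResetPath::InProcess;
  }
};
```

With mProcess set and mGPUChild still null, the sketch returns RemoteNotReady instead of dereferencing a null pointer. Whether the remote reset should be skipped or deferred in that window is a design choice this sketch does not settle.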

Assignee: nobody → stransky
Status: NEW → ASSIGNED

Out of curiosity why are you using the GPU process on Linux?

Flags: needinfo?(stransky)

(In reply to Jeff Muizelaar [:jrmuizel] from comment #4)

Out of curiosity why are you using the GPU process on Linux?

Background:

  • VAAPI hardware video decoding needs GPU access and a more permissive sandbox. It can be used on X11 and Wayland.
  • bug 1698778 comment 10: The RDD sandbox blocks VAAPI.
  • Users then disable media.rdd-ffvpx.enabled and media.rdd-vpx.enabled to use VAAPI in the content process, or they disable the RDD sandbox.
  • bug 1619585, https://github.com/intel/media-driver/issues/854: In the content process it works with the old Intel VAAPI driver, but the newer driver is blocked by the sandbox because it wants SysV IPC. Intel Xe users can only use the newer driver, so some can only choose between disabling the sandbox (which they shouldn't do) and waiting.
  • bug 1610199 comment 75, bug 1732951: stransky then suggested moving VAAPI into the GPU process, which requires implementing one for Wayland.
  • Robert mentioned in #gfx-firefox on chat.mozilla.org that a Wayland GPU process could make gfx.webrender.compositor more complicated, because the parent process would need to act as some kind of Wayland proxy server.
  • Off-topic: bug 1713276 will remove another copy for software decoding.

What do you prefer?

(In reply to Jeff Muizelaar [:jrmuizel] from comment #4)

Out of curiosity why are you using the GPU process on Linux?

Yes, HW video decoding is the main reason here.

Flags: needinfo?(stransky)
Flags: needinfo?(jmuizelaar)

Wouldn't it make sense to keep using the RDD process (with a slightly changed sandbox: bug 1698778 comment 10) until the apparently upcoming GPUFallback utility process can be used, to keep things consistent across platforms?
https://firefox-source-docs.mozilla.org/dom/ipc/process_model.html#data-decoder-rdd-process

Data Decoder (RDD) Process
This process is in the process of being restructured into a generic “utility” process type for running untrusted code in a maximally secure sandbox. After these changes, the following new process types will exist, replacing the RDD process:

  • Utility: A maximally sandboxed process used to host untrusted code which does not require access to OS resources. This process will be even more sandboxed than RDD today on Windows, where the RDD process has access to Win32k.
  • UtilityWithWin32k: A Windows-only process with the same sandboxing as the RDD process today. This will be used to host untrusted sandboxed code which requires access to Win32k to allow decoding directly into GPU surfaces.
  • GPUFallback: A Windows-only process using the GPU process’ sandboxing policy which will be used to run Windows Media Foundation (WMF) when the GPU process itself is unavailable, allowing UtilityWithWin32k to re-enable Arbitrary Code Guard (ACG) on Windows.

Yes, I agree that the best short-term solution is to allow VAAPI from the RDD process. RDD has GPU access on macOS, so giving it GPU access on Linux has some precedent.

Flags: needinfo?(jmuizelaar)

Jeff, is there any reason not to implement a GPU process on Linux/Wayland? I think it was added to improve overall stability with GPU drivers and it's used on Windows, or am I wrong?

Flags: needinfo?(jmuizelaar)

No good reason. All things being equal, we'd definitely prefer to have a GPU process on Linux/Wayland. There was just some worry about how easy it would be to make it work and about introducing bugs.

Flags: needinfo?(jmuizelaar)
Pushed by stransky@redhat.com:
https://hg.mozilla.org/integration/autoland/rev/9c05a84a1059
[Linux] Check mGPUChild at GPUProcessManager::SimulateDeviceReset(), r=jrmuizel
Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → 95 Branch