Closed Bug 1588904 Opened 5 years ago Closed 4 years ago

[Linux/EGL] Use correct rendering device in multi-GPU setup

Categories

(Core :: Widget: Gtk, enhancement)

Type: enhancement
Priority: Not set
Severity: normal

Tracking

RESOLVED FIXED
86 Branch
Tracking Status
firefox86 --- fixed

People

(Reporter: stransky, Assigned: rmader)

References

(Blocks 2 open bugs)

Details

Attachments

(1 file)

We need to select correct GPU device in multi-GPU setup, see:
https://lists.freedesktop.org/archives/wayland-devel/2018-November/039660.html

We don't need that.

Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → WONTFIX

We don't need that.

If possible, could you please clarify: on a laptop with an iGPU and a dGPU and/or an eGPU, Firefox will always run on the iGPU, right?

I'm asking because sometimes the HDMI/DP output is wired only to the dGPU (some Optimus laptops), and more frequently people attach several displays to an eGPU. So if Firefox always runs on the iGPU, I'm afraid it could hit the same bug as GNOME Shell: https://gitlab.gnome.org/GNOME/mutter/issues/348

Okay, we can investigate it then.

Status: RESOLVED → REOPENED
Resolution: WONTFIX → ---
Blocks: 1622132

https://searchfox.org/mozilla-central/rev/baf1cd492406a9ac31d9ccb7a51c924c7fbb151f/widget/gtk/nsWaylandDisplay.cpp#368-380

  // TODO - Better DRM device detection/configuration.
  const char* drm_render_node = getenv("MOZ_WAYLAND_DRM_DEVICE");
  if (!drm_render_node) {
    drm_render_node = "/dev/dri/renderD128";
  }

This is how glmark2 determines its drm device:
https://github.com/glmark2/glmark2/blob/b1cf0649a18fb4278d9d29a947a19afdffa0126b/src/native-state-drm.cpp#L472-L475
https://github.com/glmark2/glmark2/blob/b1cf0649a18fb4278d9d29a947a19afdffa0126b/src/native-state-drm.cpp#L275

This sounds better (sorry, I'm not a developer):

https://gitlab.freedesktop.org/wayland/wayland/-/issues/59

However, Mesa also depends on wl_drm being exposed even if zwp_linux_dmabuf_v1 is being used. wl_drm also provides a device event, passing a DRM FD from compositor to client, so the client can infer additional placement parameters. For instance, if Mesa is rendering on a different GPU than the compositor is using for its rendering, then it must make sure the memory is accessible between devices, e.g. being placed in GTT space rather than hidden dedicated VRAM.

https://wayland.freedesktop.org/architecture.html

The open source stack uses the drm Wayland extension, which lets the client discover the drm device to use and authenticate and then share drm (GEM) buffers with the compositor.

This gets the device fd via wl_drm listener (wl_drm_add_listener) and then uses it with gbm_create_device(simple_gbm->fd):
https://github.com/anderco/simple-gbm/blob/4b663ecc4caabd6679a1dcea02ef86eae8e2e041/simple-gbm.c#L162
https://github.com/freedesktop/xorg-xserver/blob/5dc16f6fec672c10fc26dcaaf00df0f6bed0ef9a/hw/xwayland/xwayland-glamor-gbm.c#L755

On X11 (bug 1580166 comment 5) it works similarly: get drm_fd with dri3_open() and use it with gbm_create_device().

vaGetDisplayWl internally uses nearly¹ the same DRM device detection mechanism as the simple-gbm² example above, which is what WebGL and vaGetDisplayDRM would need.

¹) wl_display_roundtrip_queue instead of wl_display_roundtrip
²) wl_drm, wl_drm_add_listener, wl_drm_authenticate

(In reply to Jan Andre Ikenmeyer [:darkspirit] from comment #5)

On X11 (bug 1580166 comment 5) it works similarly: get drm_fd with dri3_open() and use it with gbm_create_device().

Chromium/X11 does exactly the same: https://source.chromium.org/chromium/chromium/src/+/master:ui/gfx/linux/gpu_memory_buffer_support_x11.cc;l=55;drc=bdc187230ad81f6abc0a5ba3d3b59c6b02604cbc

Related: bug 1652249

Blocks: 1652783

I'm seeing problems due to this. I have an Intel integrated card and an NVIDIA card. However, Firefox only sees the Intel card.

% tree /dev/dri 
/dev/dri
├── by-path
│   ├── pci-0000:00:02.0-card -> ../card0
│   ├── pci-0000:00:02.0-render -> ../renderD128
│   ├── pci-0000:01:00.0-card -> ../card1
│   └── pci-0000:01:00.0-render -> ../renderD129
├── card0
├── card1
├── renderD128
└── renderD129

Firefox only sees:

GPU #1
Active: Yes
Description: Mesa DRI Intel(R) HD Graphics 4600 (HSW GT2)
Vendor ID: 0x8086
Device ID: 0x0416
Driver Vendor: mesa/i965
Driver Version: 20.1.3.0
RAM: 1536

I tried running MOZ_WAYLAND_DRM_DEVICE=/dev/dri/renderD128 firefox and MOZ_WAYLAND_DRM_DEVICE=/dev/dri/renderD129 firefox with no observable change.

It is worth noting that Chromium detects both cards but uses the Intel one anyway.

GPU0: VENDOR= 0x10de, DEVICE=0x139b
GPU1: VENDOR= 0x8086, DEVICE=0x0416 ACTIVE

I don't know if that helps, but another way to do this would be to use the EGL_EXT_device_query plus EGL_EXT_device_drm EGL extensions to grab the DRM device currently used by EGL.

Example: https://github.com/swaywm/wlroots/pull/2374/commits/4f28e57920f40f6ac99255568bb3dae373c54297
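For illustration, the query described above could be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the code from the linked wlroots commit: the `sketch` namespace, `EglDeviceApi` and `QueryDrmDeviceFile` are hypothetical names, and the local typedefs stand in for the ones from the EGL headers so the block builds without them. Real code would include the EGL headers, check that both extensions are advertised, and obtain eglQueryDisplayAttribEXT and eglQueryDeviceStringEXT via eglGetProcAddress.

```cpp
#include <cstdint>
#include <functional>
#include <optional>
#include <string>

namespace sketch {

// Local stand-ins for the EGL typedefs (assumption: used only so this
// sketch compiles without <EGL/egl.h> and <EGL/eglext.h>).
using EGLDisplay = void*;
using EGLDeviceEXT = void*;
using EGLAttrib = std::intptr_t;
using EGLint = std::int32_t;
using EGLBoolean = unsigned int;

// Enum values as defined in eglext.h.
constexpr EGLint kEglDeviceExt = 0x322C;         // EGL_DEVICE_EXT
constexpr EGLint kEglDrmDeviceFileExt = 0x3233;  // EGL_DRM_DEVICE_FILE_EXT

// Hypothetical holder for the two extension entry points.
struct EglDeviceApi {
  // Mirrors eglQueryDisplayAttribEXT(dpy, attribute, value).
  std::function<EGLBoolean(EGLDisplay, EGLint, EGLAttrib*)> queryDisplayAttrib;
  // Mirrors eglQueryDeviceStringEXT(device, name).
  std::function<const char*(EGLDeviceEXT, EGLint)> queryDeviceString;
};

// Returns the DRM device file backing `display`, or nullopt if the entry
// points are missing or the driver reports nothing.
std::optional<std::string> QueryDrmDeviceFile(const EglDeviceApi& api,
                                              EGLDisplay display) {
  if (!api.queryDisplayAttrib || !api.queryDeviceString) {
    return std::nullopt;
  }
  // Step 1: ask the display which EGLDeviceEXT it is running on.
  EGLAttrib deviceAttrib = 0;
  if (!api.queryDisplayAttrib(display, kEglDeviceExt, &deviceAttrib) ||
      deviceAttrib == 0) {
    return std::nullopt;
  }
  auto device = reinterpret_cast<EGLDeviceEXT>(deviceAttrib);
  // Step 2: ask that device for its DRM device file path.
  const char* path = api.queryDeviceString(device, kEglDrmDeviceFileExt);
  if (!path || !*path) {
    return std::nullopt;
  }
  return std::string(path);
}

}  // namespace sketch
```

Passing the entry points in as a struct keeps the sketch independent of how the EGL library is loaded; a caller that links EGL directly could fill the two members with the real function pointers.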

Blocks: 1661572
Status: REOPENED → NEW
Summary: [Wayland] Use correct rendering device in multi-GPU setup → [Linux/EGL] Use correct rendering device in multi-GPU setup

(In reply to emersion from comment #11)

I don't know if that helps, but another way to do this would be to use the EGL_EXT_device_query plus EGL_EXT_device_drm EGL extensions to grab the DRM device currently used by EGL.

Example: https://github.com/swaywm/wlroots/pull/2374/commits/4f28e57920f40f6ac99255568bb3dae373c54297

This should indeed work AFAICS. We likely need that anyway for bug 1640053 (it was suggested to me as a substitute for MESA_ACCELERATED and is already in the WIP patch). So we could set the device in GfxInfo.

Fetch the DRM device in the EGL version of glxtest, set it in gfxInfo and pass
it to gfxVars. Finally, use it in nsDMABufDevice::Configure().

Depends on D97861
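To make the last step of that description concrete, here is a rough sketch of how the final device choice might look once a detected device is available. It is an illustration only, not the code in the patch: `ChooseDrmRenderNode` and `kFallbackRenderNode` are hypothetical names, and the precedence shown (explicit MOZ_WAYLAND_DRM_DEVICE override first, then the device reported by the EGL test, then the old hard-coded default from the snippet quoted earlier in this bug) is an assumption.

```cpp
#include <string>

// Old hard-coded default from the nsWaylandDisplay.cpp snippet quoted
// earlier in this bug. Keeping it as a last resort is an assumption.
constexpr const char* kFallbackRenderNode = "/dev/dri/renderD128";

// Hypothetical helper: picks the DRM device path to open.
//   envOverride    - value of MOZ_WAYLAND_DRM_DEVICE, or nullptr if unset
//   detectedDevice - path reported by the EGL test process, empty if unknown
std::string ChooseDrmRenderNode(const char* envOverride,
                                const std::string& detectedDevice) {
  // An explicit user override wins (assumed precedence).
  if (envOverride && *envOverride) {
    return envOverride;
  }
  // Otherwise use whatever device EGL is actually rendering on.
  if (!detectedDevice.empty()) {
    return detectedDevice;
  }
  // Nothing known: fall back to the first render node.
  return kFallbackRenderNode;
}
```

A caller would pass `std::getenv("MOZ_WAYLAND_DRM_DEVICE")` as the first argument; taking the value as a parameter just keeps the helper easy to exercise in isolation.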

Here is a somewhat quick and dirty patch for that - it properly detects the right device on my single-gpu system, but unfortunately I don't have the hardware to test multi-gpu.

Could someone with the appropriate hardware try the following try push and confirm it works? https://treeherder.mozilla.org/jobs?repo=try&revision=d49eb828b8d2681ff262e987d9e586f45911062b

MOZ_ENABLE_WAYLAND=1 mozregression --repo try --launch d49eb828b8d2681ff262e987d9e586f45911062b -a https://webglsamples.org/aquarium/aquarium.html
or
MOZ_X11_EGL=1 mozregression --repo try --launch d49eb828b8d2681ff262e987d9e586f45911062b -a https://webglsamples.org/aquarium/aquarium.html

Instead of opening the primary FD, I'd recommend searching for the device name in drmGetDevices2: https://github.com/swaywm/wlroots/blob/44cea53e7285fe25a81b3ce2dfb470daec27e6e4/render/egl.c#L869

Indeed, in some cases EGL clients don't have the permission to open /dev/dri/cardN files (see https://github.com/swaywm/wlroots/pull/2497).

If you want more reference code, Xwayland is doing something similar: https://gitlab.freedesktop.org/xorg/xserver/-/blob/master/hw/xwayland/xwayland-glamor-gbm.c#L708
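As a rough illustration of that lookup: libdrm's drmGetDevices2() reports, per device, the paths of its primary and render nodes, so a device name obtained elsewhere can be matched against that list and swapped for the render node, which unprivileged clients are normally allowed to open. The sketch below shows only the matching step on plain data. `DrmDeviceNodes` and `RenderNodeForDevice` are hypothetical names, and the struct is a stand-in for the node paths libdrm reports, so no libdrm call is made here.

```cpp
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for the node paths libdrm reports per device.
struct DrmDeviceNodes {
  std::string primary;  // e.g. "/dev/dri/card1", empty if absent
  std::string render;   // e.g. "/dev/dri/renderD129", empty if absent
};

// Finds the device whose primary or render node matches `name` and
// returns its render node, or nullopt if there is no match or the
// matching device has no render node.
std::optional<std::string> RenderNodeForDevice(
    const std::vector<DrmDeviceNodes>& devices, const std::string& name) {
  for (const auto& dev : devices) {
    if (name == dev.primary || name == dev.render) {
      if (dev.render.empty()) {
        return std::nullopt;
      }
      return dev.render;
    }
  }
  return std::nullopt;
}
```

With the /dev/dri layout posted in comment 10 (card0 with renderD128, card1 with renderD129), looking up "/dev/dri/card1" would yield "/dev/dri/renderD129".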

Thanks emersion, that's good to know! Not sure what the current state for the headless mode is and if we would even make use of DMAbuf - AFAIK we don't. Also for FF it's not a hard-fail either way.

The issue may not only happen when running headless. Could also happen when running in an unprivileged container, for instance.

Martin, can I ask you to:

  • comment on whether you think this is a good approach (using glxtest)
  • quickly check if this works for you (comment 15)? IIRC you have some multi-gpu device to test?

If both apply, I'd head over to bug 1640053 and finish the egl-only tests and then polish up this one :)

Flags: needinfo?(stransky)

Will look at it. My test box died a week ago and I'm expecting a replacement tomorrow.

(In reply to Kevin Cox [:kevincox] from comment #10)

I'm seeing problems due to this. I have an Intel integrated card and an NVIDIA card. However, Firefox only sees the Intel card.

% tree /dev/dri 
/dev/dri
├── by-path
│   ├── pci-0000:00:02.0-card -> ../card0
│   ├── pci-0000:00:02.0-render -> ../renderD128
│   ├── pci-0000:01:00.0-card -> ../card1
│   └── pci-0000:01:00.0-render -> ../renderD129
├── card0
├── card1
├── renderD128
└── renderD129

Firefox only sees:

GPU #1
Active: Yes
Description: Mesa DRI Intel(R) HD Graphics 4600 (HSW GT2)
Vendor ID: 0x8086
Device ID: 0x0416
Driver Vendor: mesa/i965
Driver Version: 20.1.3.0
RAM: 1536

I tried running MOZ_WAYLAND_DRM_DEVICE=/dev/dri/renderD128 firefox and MOZ_WAYLAND_DRM_DEVICE=/dev/dri/renderD129 firefox with no observable change.

It is worth noting that Chromium detects both cards but uses the Intel one anyway.

GPU0: VENDOR= 0x10de, DEVICE=0x139b
GPU1: VENDOR= 0x8086, DEVICE=0x0416 ACTIVE

It looks almost the same on my setup (Dell XPS 9500), except that Firefox shows this for the second GPU:

GPU #2
Active: No
Vendor ID: 0x10de
Device ID: 0x1f95

Depends on: 1640053
Assignee: nobody → robert.mader
Attachment #9190110 - Attachment description: Bug 1588904 - [Linux/EGL] Use correct rendering device in multi-GPU setup - WIP → Bug 1588904 - [Linux/EGL] Use correct rendering device in multi-GPU setup, r?stransky
Status: NEW → ASSIGNED

(In reply to Kai Mast from comment #21)

(In reply to Kevin Cox [:kevincox] from comment #10)

I'm seeing problems due to this. I have an Intel integrated card and an NVIDIA card. However, Firefox only sees the Intel card.
...
...
It looks almost the same on my setup (Dell XPS 9500), except that Firefox shows this for the second GPU:

Kevin, Kai: I think if you want your NVIDIA card to be used, you'd need to do that via DRI_PRIME or similar (switcheroo?). That configures which GPU runs Firefox in general. This bug, however, is about making sure the dmabuf stuff works correctly for the given GPU. Assuming you're using the proprietary drivers, that's not yet of interest for NVIDIA, as they don't support dmabuf yet; it's in the works, however.

If you're using nouveau, then this bug is about making sure that we don't try to use the wrong device for buffer sharing, which would simply break webgl and hardware video decoding.

I've changed my setup since then but I was using PRIME with nvidia and Intel. I'm 99% sure I tried using DRI_PRIME as well with no success.

Pushed by malexandru@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/dde159ea0c4c [Linux/EGL] Use correct rendering device in multi-GPU setup, r=stransky,emilio,jgilbert

Hm, something was wrong with my hg checkout. Updated, thanks.

https://treeherder.mozilla.org/jobs?repo=try&revision=3601b3d257ead43c0d4091315405d070e1d262d9

Pushed by cbrindusan@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/03c509adfe55 [Linux/EGL] Use correct rendering device in multi-GPU setup, r=stransky,emilio,jgilbert

Duh, other platform. Updated, thanks. Apparently needs more review now :(

https://treeherder.mozilla.org/#/jobs?repo=try&revision=5690aaeb37a974f6ec54a2c78e12b65ec9ae405c

Pushed by smolnar@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/0dfb27633c1d [Linux/EGL] Use correct rendering device in multi-GPU setup, r=stransky,emilio,jgilbert,geckoview-reviewers,m_kato

Martin, can I ping you about giving this a quick check once it lands? You said you might have appropriate hardware to test this by now? Thanks!

Yes, I'll test that. Thanks.

Flags: needinfo?(stransky)
Flags: needinfo?(stransky)

Thank you so much for working on this!

I can also test, if additional feedback is needed.

(In reply to Kai Mast from comment #35)

Thank you so much for working on this!

I can also test, if additional feedback is needed.

Thanks! And yes, testing would be very welcome as I can't do it myself, unfortunately.

Just as a reminder and to be clear: this is not about how Firefox chooses the GPU to drive it in general, but only to not mess up when using DMABUF. IIUC, nowadays it should be easy to choose the right GPU from the DE, at least on Gnome and KDE - see PrefersNonDefaultGPU in the spec [1][2].

1: https://specifications.freedesktop.org/desktop-entry-spec/desktop-entry-spec-latest.html
2: https://www.gamingonlinux.com/articles/the-linux-desktop-entry-specification-gets-a-way-to-automatically-use-a-discrete-gpu-merged-into-gnome.16598?module=articles_full

Status: ASSIGNED → RESOLVED
Closed: 5 years ago → 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → 86 Branch

Works for me, correct DRM device is used (renderD129).
Thanks a lot!

Flags: needinfo?(stransky)