EGL gets blocklisted on nVidia in multi-gpu setups (symbol eglGetDisplayDriverName not defined)
Categories
(Core :: Graphics: WebRender, defect)
People
(Reporter: arnolds, Assigned: rmader)
References
(Blocks 2 open bugs)
Details
Attachments
(5 files, 1 obsolete file)
User Agent: Mozilla/5.0 (X11; Linux x86_64; rv:94.0) Gecko/20100101 Firefox/94.0
Steps to reproduce:
I’m using an nVidia GeForce GT 1030 with the 470.82.00 driver and get the message “[GFX1-]: glxtest: libEGL missing eglGetDisplayDriverName” when starting Firefox 94. I suspect this symbol is not declared in the libEGL shipped with nVidia 470.82 and 495.44.
Comment 1•3 years ago
The Bugbug bot thinks this bug should belong to the 'Core::Graphics: WebRender' component, and is moving the bug to that component. Please revert this change in case you think the bot is wrong.
Comment 2•3 years ago
Thanks for the report!
eglGetDisplayDriverName (EGL_MESA_query_driver) does not seem to be supported by the proprietary Nvidia driver.
As long as there are no negative consequences, current behavior seems to be expected.
Please open about:support, click on "Copy text to clipboard" and paste it here.
x11_egltest is tried first:
If pci_count determined by get_pci_status is not exactly 1,
- then require_driver parameter of get_egl_status is true,
- eglGetDisplayDriverName does not exist in your case because it is a Mesa-only function provided by EGL_MESA_query_driver,
- get_egl_status returns false,
- x11_egltest returns false,
- glxtest is then tried.
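The steps listed above can be sketched roughly as follows. This is a simplified illustration of the described flow, not the actual code in glxtest.cpp: the struct `ProbeInput` and the functions `egl_test_passes` and `choose_backend` are hypothetical names that only mirror the bullet points.

```cpp
#include <string>

// Hypothetical inputs to the probe; the names are illustrative only.
struct ProbeInput {
  int pci_count;                 // number of GPUs found by the PCI scan
  bool has_query_driver_symbol;  // is eglGetDisplayDriverName available?
};

// Mirrors the described behavior: the driver name is only required when the
// PCI scan did not find exactly one GPU.
inline bool egl_test_passes(const ProbeInput& in) {
  const bool require_driver = (in.pci_count != 1);
  if (!in.has_query_driver_symbol && require_driver) {
    return false;  // corresponds to "libEGL missing eglGetDisplayDriverName"
  }
  return true;
}

// If the EGL test fails, the GLX test is tried next.
inline std::string choose_backend(const ProbeInput& in) {
  return egl_test_passes(in) ? "EGL" : "GLX";
}
```

With two GPUs and no eglGetDisplayDriverName (the reporter's case), this sketch falls through to GLX; with a single GPU the missing symbol would not matter.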
Reporter
Comment 3•3 years ago
Hi Darkspirit,
thanks for your explanation of x11_egltest. Knowing this, it's clear: I have two graphics adapters. The other one (Intel) is onboard, so I can't remove it. What about also checking the vendor (/sys/bus/pci/devices/0000:01:00.0/vendor) and not using eglGetDisplayDriverName if the vendor ID is nVidia (0x10de)? Or is there some other unique attribute of the nVidia libEGL?
thanks again and have a nice weekend,
Ado
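A minimal sketch of the vendor check suggested in this comment, assuming the sysfs `vendor` file contains a hexadecimal ID such as `0x10de` followed by a newline. The helper names `parse_pci_id` and `read_pci_vendor` are hypothetical and are not taken from the Firefox source.

```cpp
#include <exception>
#include <fstream>
#include <string>

constexpr unsigned kVendorNvidia = 0x10de;  // vendor ID mentioned in the comment

// Parses a sysfs-style hex ID such as "0x10de\n". Returns false on failure.
inline bool parse_pci_id(const std::string& text, unsigned& out) {
  try {
    unsigned long value = std::stoul(text, nullptr, 16);  // accepts "0x" prefix
    if (value > 0xffff) return false;  // PCI vendor IDs are 16 bits wide
    out = static_cast<unsigned>(value);
    return true;
  } catch (const std::exception&) {
    return false;  // no digits, or value out of range
  }
}

// Reads <device_dir>/vendor, e.g. /sys/bus/pci/devices/0000:01:00.0/vendor.
inline bool read_pci_vendor(const std::string& device_dir, unsigned& out) {
  std::ifstream file(device_dir + "/vendor");
  std::string line;
  if (!file || !std::getline(file, line)) return false;
  return parse_pci_id(line, out);
}
```

A caller could compare the result against `kVendorNvidia` before deciding whether the missing symbol should count as a failure; whether that is robust enough is exactly the open question in this thread.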
Reporter
Comment 4•3 years ago
Reporter
Comment 5•3 years ago
What about '/sys/bus/pci/devices/0000:01:00.0/driver/module/version' -> 470.82.00? Let me know if I can be of any help.
Comment 6•3 years ago
What does about:support look like in Nightly (https://nightly.mozilla.org)?
Reporter
Comment 7•3 years ago
Thanks for your quick reply (fix?)! Unfortunately I no longer have access to the affected machine today. I will test your fix as soon as possible on Monday morning. Thanks and have a nice weekend, Ado
Reporter
Comment 8•3 years ago
Reporter
Comment 9•3 years ago
Good morning Darkspirit, just tested Nightly/96.0a1. It behaves like 94.0 with my problem: libEGL + nVidia (active) + Intel (inactive, onboard).
[GFX1-]: glxtest: libEGL missing eglGetDisplayDriverName
[GFX1-]: glxtest: libEGL missing eglGetDisplayDriverName
I'm afraid another heuristic for identifying the proprietary nVidia driver is needed.
Kind regards, Ado
Comment 11•3 years ago
The severity field is not set for this bug.
:jimm, could you have a look please?
For more information, please visit auto_nag documentation.
Reporter
Comment 12•3 years ago
Dear Developers,
after the first very fast response by Darkspirit with an explanation of why the problem exists, it's a bit disappointing to see the low severity "Small/Trivial" now. Without a fix it's not possible to use Firefox with an nVidia graphics device, nVidia's drivers and libEGL, just because nVidia's libEGL doesn't provide the symbol eglGetDisplayDriverName. Full performance settings are not enabled although libEGL is available, so Firefox uses only a fraction of the computer's graphics capabilities. E.g. Webex reports "Video is not currently available due to low bandwidth". In the same context, video is shown without any problem with the onboard Intel graphics interface or with Google Chrome.
It would be great to see a fix for this problem in the not too distant future. If I can be of any help, just let me know.
Cheers, Ado
Comment 13•3 years ago
Still a Nightly-only bug: You have Nightly, EGL should be enabled there, but the egl test does not succeed.
Default config works: GLX works according to your about:support and EGL isn't shipped yet to X11 on Nvidia.
This bug blocks shipping. It will be looked at.
Assignee
Comment 14•3 years ago
There's already a TODO for this case in the code: https://searchfox.org/mozilla-central/source/toolkit/xre/glxtest.cpp#613-616
Note that this only affects multi-gpu setups.
Reporter
Comment 15•3 years ago
Thanks for letting me know, Robert!
Just to clarify: "multi-gpu setup" in my case means that the onboard Intel device is not used (but naturally can't be removed), because I needed an additional nVidia card to drive a screen with a higher resolution.
Have a nice day, Ado
Assignee
Comment 16•3 years ago
While this disables EGL on some devices, this doesn't block bug 1737428.
Comment 17•3 years ago
Is there at least a temporary workaround for this bug, so we can force EGL when using nvidia? Even an ugly hack (or a patch) would be sufficient, so we can use EGL while we wait for the fix. Thanks!
Comment 18•3 years ago
Does it work if you start Firefox with MOZ_ENABLE_WAYLAND=1 environment variable when using Wayland
or if you set gfx.x11-egl.force-enabled=true on about:config when using X11?
Comment 19•3 years ago
(In reply to Darkspirit from comment #18)
> Does it work if you start Firefox with MOZ_ENABLE_WAYLAND=1 environment variable when using Wayland
> or if you set gfx.x11-egl.force-enabled=true on about:config when using X11?
No, it doesn't work with gfx.x11-egl.force-enabled=true (same error "glxtest: libEGL missing eglGetDisplayDriverName").
I'm unable to test with Wayland, since the Wayland session is broken and I get only a black screen, so I'm stuck with X11.
Comment 20•3 years ago
IIUC, glxtest should only be relevant for decision making. Do you see "EGL_VENDOR" in WebGL info on about:support? Then you are using EGL.
Comment 21•3 years ago
(In reply to Darkspirit from comment #20)
> IIUC, glxtest should only be relevant for decision making. Do you see "EGL_VENDOR" in WebGL info on about:support? Then you are using EGL.
No, I don't see "EGL_VENDOR" in WebGL. I see only "GLX_VENDOR":
GLX_VENDOR(client): NVIDIA Corporation
GLX_VENDOR(server): NVIDIA Corporation
Assignee
Comment 22•3 years ago
(In reply to Darkspirit from comment #20)
> IIUC, glxtest should only be relevant for decision making. Do you see "EGL_VENDOR" in WebGL info on about:support? Then you are using EGL.
IIRC we currently hard-block EGL if egltest in glxtest fails. We really should fix this one, will try to have a look soon.
Comment 23•3 years ago
Can you update to Nvidia driver 495 and use Ubuntu 22.04 with Wayland?
(In reply to Dan from comment #17)
> Even an ugly hack (or a patch)
$ sudo apt purge *nvidia*
Comment 24•3 years ago
(In reply to Darkspirit from comment #20)
> IIRC we currently hard-block EGL if egltest in glxtest fails. We really should fix this one, will try to have a look soon.
Ok, if you need testing for some patch or whatever, just ask. Thank you!
Comment 25•3 years ago
(In reply to Darkspirit from comment #23)
> Can you update to Nvidia driver 495 and use Ubuntu 22.04 with Wayland?
Yes, I'm already using the latest NVIDIA 495.46 driver.
Regarding Ubuntu, I use my own installation, so it wouldn't help.
I was looking at the relevant code below:
if (eglGetDisplayDriverName) {
  // TODO(aosmond): If the driver name is empty, we probably aren't using Mesa
  // and instead a proprietary GL, most likely NVIDIA's. The PCI device list
  // in combination with the vendor name is very likely sufficient to identify
  // the device.
  const char* driDriver = eglGetDisplayDriverName(dpy);
  if (driDriver) {
    record_value("DRI_DRIVER\n%s\n", driDriver);
  }
} else if (require_driver) {
  record_warning("libEGL missing eglGetDisplayDriverName");
What exactly should eglGetDisplayDriverName(dpy) return in the Nvidia case? Just so I know how bad the situation is.
Or I could hard-code the expected value just as a quick workaround...
Reporter
Comment 26•3 years ago
(In reply to Dan from comment #25)
> (In reply to Darkspirit from comment #23)
> ...
> What exactly should eglGetDisplayDriverName(dpy) return in the Nvidia case? Just so I know how bad the situation is. Or I could hard-code the expected value just as a quick workaround...
Hi Dan and all, good to see that there is new life in this topic. If you look at my first report, you'll see that this if condition is never fulfilled with NVIDIA, since the symbol "eglGetDisplayDriverName" is not declared in the NVIDIA driver. I'm afraid that, if eglGetDisplayDriverName is not defined, the PCI device list has to be scanned to verify that we are dealing with an NVIDIA device [e.g. for my desktop: 01:00.0 VGA compatible controller: NVIDIA Corporation GP108 [GeForce GT 1030] (rev a1)].
Cheers and all the best for 2022, Ado
Assignee
Comment 27•3 years ago
Just for the record: the main issue here is that we can't rely on users using recent drivers. If everyone was on recent Mesa or Nvidia drivers it would be easy to solve. But we have to take e.g. Mesa versions without eglGetDisplayDriverName into account as well. What we could easily do is make gfx.x11-egl.force-enabled or at least MOZ_X11_EGL=1 force EGL even if the EGL test failed.
Reporter
Comment 28•3 years ago
(In reply to Robert Mader [:rmader] from comment #27)
> gfx.x11-egl.force-enabled or at least MOZ_X11_EGL=1 force EGL even if the EGL test failed.
Would be great to have this!
Reporter
Comment 29•3 years ago
If this helps in any way (from about:support):
Failure Log
(#0) Error: glxtest: libEGL no display
(#1) Error: glxtest: No visuals found
(#2) Error: glxtest: libEGL no display
(#3) Error: More than 1 GPU vendor detected via PCI, cannot deduce vendor
(#4) Error: PCI candidate 0x8086/0x5912 --> Intel onboard device
(#5) Error: PCI candidate 0x10de/0x1d01 --> NVIDIA
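The failure log shows two PCI candidates from different vendors, which is why the vendor cannot be deduced from PCI data alone. The TODO quoted in comment 25 suggests that the PCI list combined with the vendor name may be enough. Below is a hedged sketch of that idea, assuming the GL vendor string contains "NVIDIA" (as the GLX_VENDOR values in comment 21 do). `PciCandidate` and `pick_nvidia_candidate` are hypothetical names, not Firefox code.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// One "vendor/device" pair as printed in the failure log.
struct PciCandidate {
  unsigned vendor;
  unsigned device;
};

constexpr unsigned kPciVendorNvidia = 0x10de;

// Returns the index of the single NVIDIA candidate if the GL vendor string
// names NVIDIA, or -1 if there is no match or the match is ambiguous.
inline int pick_nvidia_candidate(const std::vector<PciCandidate>& candidates,
                                 const std::string& gl_vendor) {
  if (gl_vendor.find("NVIDIA") == std::string::npos) return -1;
  int found = -1;
  for (std::size_t i = 0; i < candidates.size(); ++i) {
    if (candidates[i].vendor != kPciVendorNvidia) continue;
    if (found != -1) return -1;  // more than one NVIDIA device: ambiguous
    found = static_cast<int>(i);
  }
  return found;
}
```

For the two candidates in the log above and a vendor string of "NVIDIA Corporation", this would select the 0x10de/0x1d01 entry. Systems with several NVIDIA cards would still be ambiguous under this approach.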
Assignee
Comment 30•3 years ago
After bug 1751252 and bug 1742994 this should be the only case where Nvidia users get HW-WR with GLX. Otherwise it should be all EGL or SW-WR, thus this is the last blocker to close all Nvidia+GLX-only bugs.
Comment 31•2 years ago
(In reply to Robert Mader [:rmader] from comment #30)
> After bug 1751252 and bug 1742994 this should be the only case where Nvidia users get HW-WR with GLX. Otherwise it should be all EGL or SW-WR, thus this is the last blocker to close all Nvidia+GLX-only bugs.
Hello! Any idea when this bug will be fixed? It's been 5 months since it was opened... Thank you!
Assignee
Comment 32•2 years ago
We have been running the EGL test before the GLX one for a while now. Some reordering and ignoring the case of multi-GPU systems with outdated Mesa, combined with the fact that the only non-Mesa driver where we enable HW-WR is the Nvidia one, which again we only support on driver versions with EGL support, allows us to do a bunch of cleanups.
- Stop requiring EGL_MESA_query_driver support for EGL on multi-GPU systems.
- Make use of the fact that we always run the EGL test first, stop doing it after the GLX one.
- Lots of cleanups that become possible as the result.
Potential issues to keep an eye on:
- EGL on Nvidia-Prime should now get HW-WR on EGL (including dmabuf etc.). This was previously blocked and thus needs testing.
- Multi-GPU systems with an old Mesa version between 17.0 and 19.0 may lose HW-WR.
- Mesa users on Xorg using 30bit color depth now run the EGL GL test fully (no issues expected here).
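The first bullet of the change can be illustrated with a small sketch in which the driver name becomes optional information instead of a hard requirement. This is not the actual patch; `EglProbe`, `EglResult` and `run_egl_test` are hypothetical names, and real EGL initialization is replaced by plain fields so the logic stands on its own.

```cpp
#include <string>

// Hypothetical summary of what the EGL probe observed.
struct EglProbe {
  bool display_ok;          // an EGL display could be initialized
  const char* driver_name;  // from EGL_MESA_query_driver, nullptr if unavailable
};

struct EglResult {
  bool passed;
  std::string dri_driver;  // empty when the driver name is unknown
};

// The test no longer fails only because the driver name cannot be queried.
inline EglResult run_egl_test(const EglProbe& probe) {
  EglResult result{false, ""};
  if (!probe.display_ok) return result;  // genuine EGL failure
  if (probe.driver_name) result.dri_driver = probe.driver_name;
  result.passed = true;
  return result;
}
```

Under this sketch, a proprietary driver without EGL_MESA_query_driver passes with an empty driver name, while a Mesa driver still reports its name when available.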
Assignee
Comment 33•2 years ago
Here is a try-build with the patch from above. Testing on an affected system would be highly appreciated, given that it was not possible to force-enable and test EGL on Prime on Nvidia so far.
https://treeherder.mozilla.org/jobs?repo=try&revision=5441652461a16ddc00921f5c52eb13ed86228e61
Edit: direct download link https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/OSvlUuJ0RsaR29SkJLHJpw/runs/0/artifacts/public/build/target.tar.bz2
Comment 34•2 years ago
(In reply to Robert Mader [:rmader] from comment #33)
> https://treeherder.mozilla.org/jobs?repo=try&revision=5441652461a16ddc00921f5c52eb13ed86228e61
I tested your patch here and it works perfectly!
Thank you very much!
Assignee
Comment 35•2 years ago
(In reply to Dan from comment #34)
> I tested your patch here and it works perfectly!
> Thank you very much!
Thanks! Mind attaching your about:support ("copy text to clipboard" -> paste in a comment here -> bz will ask to make it an attachment -> yes) here so I can have a quick check? :)
Comment 36•2 years ago
(In reply to Robert Mader [:rmader] from comment #35)
> Thanks! Mind attaching your `about:support` ("copy text to clipboard" -> paste in a comment here -> bz will ask to make it an attachment -> yes) here so I can have a quick check? :)
No problem ;-)
Assignee
Comment 37•2 years ago
Version: 105.0
Hm, are you sure that's the build from above?
Edit: ah, is that your own build?
Comment 38•2 years ago
(In reply to Robert Mader [:rmader] from comment #37)
> Version: 105.0
> Hm, are you sure that's the build from above?
I applied your patch directly to my repository (since I always compile Firefox from scratch).
So it's your patch against the latest 105 release.
Assignee
Comment 39•2 years ago
Right, makes sense :)
Great, looks good!
Comment 40•2 years ago
Hello, as requested by :rmader attached is a copy of the about:support page. Webrender fails to enable. The console output:
$ __NV_PRIME_RENDER_OFFLOAD=1 __GLX_VENDOR_LIBRARY_NAME=nvidia ./firefox
[GFX1-]: No GPUs detected via PCI
[GFX1-]: glxtest: process failed (received signal 11)
Assignee
Comment 41•2 years ago
(In reply to killercontact1.7.4.0 from comment #40)
> Created attachment 9296084 [details]
> about:support output with :rmader's patch enabled running driver 515.65.01
> Hello, as requested by :rmader attached is a copy of the about:support page. Webrender fails to enable. The console output:
> $ __NV_PRIME_RENDER_OFFLOAD=1 __GLX_VENDOR_LIBRARY_NAME=nvidia ./firefox
> [GFX1-]: No GPUs detected via PCI
> [GFX1-]: glxtest: process failed (received signal 11)
Thanks! This looks like bug 1759315. Can you test again with MOZ_ENABLE_WAYLAND=0 or in an X11 session? Wayland is already enabled by default on nightly but not on release/beta.
Comment 42•2 years ago
(In reply to Robert Mader [:rmader] from comment #41)
> (In reply to killercontact1.7.4.0 from comment #40)
> > Created attachment 9296084 [details]
> > about:support output with :rmader's patch enabled running driver 515.65.01
> > Hello, as requested by :rmader attached is a copy of the about:support page. Webrender fails to enable. The console output:
> > $ __NV_PRIME_RENDER_OFFLOAD=1 __GLX_VENDOR_LIBRARY_NAME=nvidia ./firefox
> > [GFX1-]: No GPUs detected via PCI
> > [GFX1-]: glxtest: process failed (received signal 11)
> Thanks! This looks like bug 1759315. Can you test again with MOZ_ENABLE_WAYLAND=0 or in an X11 session? Wayland is already enabled by default on nightly but not on release/beta.
I did as requested. Running on Fedora 36, Gnome Xorg. Now:
__NV_PRIME_RENDER_OFFLOAD=1 __GLX_VENDOR_LIBRARY_NAME=nvidia MOZ_ENABLE_WAYLAND=0 ./firefox
"[GFX1-]: glxtest: VA-API test failed: failed to initialise VAAPI connection.",
"[GFX1-]: Failed to create EGLSurface!: 0x3009",
"[GFX1-]: Failed to create EGLSurface. 1 renderers, 0 active.",
"[GFX1-]: Handling webrender error 3",
"[GFX1-]: Fallback WR to SW-WR"
Comment 43•2 years ago
Assignee
Comment 44•2 years ago
(In reply to killercontact1.7.4.0 from comment #42)
> ...
> I did as requested. Running on Fedora 36, Gnome Xorg. Now:
> ...
Thanks! Can you briefly confirm that other EGL apps, such as glmark2-es2, do work with the same environment variables?
Comment 45•2 years ago
Pushed by robert.mader@posteo.de:
https://hg.mozilla.org/integration/autoland/rev/d254e2dd277b
Cleanups for glxtest, r=lsalzman
Assignee
Comment 46•2 years ago
Going forward with the patch despite the issue above - proprietary Nvidia in a multi-gpu setup on Wayland is just too niche to be a blocker here (and might be caused by driver issues). Let's continue with that in a follow-up.
Comment 47•2 years ago
bugherder
Comment 48•2 years ago
(In reply to Robert Mader [:rmader] from comment #46)
> Going forward with the patch despite the issue above - proprietary Nvidia in a multi-gpu setup on Wayland is just too niche to be a blocker here (and might be caused by driver issues). Let's continue with that in a follow-up.
Apologies for the late reply. Running the benchmarks results in the same error; unfortunately this means it is either a misconfiguration or a driver bug. This is Fedora 36 with up-to-date packages, and even reinstalling the driver changes nothing.
Even with the new patch the problem remains, but I will try again as updates keep coming.