startup crash without crash report (works in safe mode) caused by mesa_glthread=true in Mesa config file
Categories
(Core :: Graphics, defect)
Tracking
()
Release | Tracking | Status |
---|---|---|
firefox-esr91 | --- | unaffected |
firefox95 | --- | unaffected |
firefox96 | + | fixed |
firefox97 | + | fixed |
firefox98 | --- | fixed |
People
(Reporter: norbert.pfeiler, Unassigned)
References
(Blocks 1 open bug, Regression)
Details
(Keywords: crash, nightly-community, regression)
Attachments
(1 file: text/plain, deleted)
since upgrade to nightly 97 i can only get firefox to start in safe mode
every second start crashes, and every other start it offers to start in safe mode which i gladly accept
disabling addons from safe mode doesn’t help
i find 2 suspicious threads, that are not poll/syscall/recvmsg or __futex_abstimed_wait_common64
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x00007f506d8687c0 in ?? () from /usr/lib/libEGL_mesa.so.0
[Current thread is 1 (Thread 0x7f50592fe640 (LWP 2251478))]
(gdb) bt
#0 0x00007f506d8687c0 in ?? () from /usr/lib/libEGL_mesa.so.0
#1 0x00007f508778e13c in ?? () from /usr/lib/dri/radeonsi_dri.so
#2 0x0000000000000000 in ?? ()
[Switching to thread 39 (Thread 0x7f50a1883780 (LWP 2251274))]
#0 0x00007f50991b0902 in ?? () from /opt/firefox-nightly/libxul.so
(gdb) bt
#0 0x00007f50991b0902 in ?? () from /opt/firefox-nightly/libxul.so
#1 0x00007fff86562fd0 in ?? ()
#2 0x00007f5099145f08 in ?? () from /opt/firefox-nightly/libxul.so
#3 0x00007fff86565620 in ?? ()
#4 0x00007f503990f8d8 in ?? ()
#5 0x00007fff86563010 in ?? ()
#6 0x00007f509913cd55 in ?? () from /opt/firefox-nightly/libxul.so
#7 0x00007f500000000b in ?? ()
#8 0x00007fff86563038 in ?? ()
#9 0x00007fff86565620 in ?? ()
#10 0x00007f503990f8d8 in ?? ()
#11 0x00007fff865630e0 in ?? ()
#12 0x00007f5091b1b900 in ?? ()
#13 0x7b0c0d6934588600 in ?? ()
#14 0x00007fff865642d0 in ?? ()
#15 0x00007fff865642d0 in ?? ()
#16 0x0000000000000002 in ?? ()
#17 0xc01a56afea8dc885 in ?? ()
#18 0x0000000000000015 in ?? ()
#19 0x00007fff86563080 in ?? ()
#20 0x00007f5098041797 in ?? () from /opt/firefox-nightly/libxul.so
#21 0x0000000100000001 in ?? ()
#22 0x00007fff865642d0 in ?? ()
#23 0x0000000000000002 in ?? ()
#24 0x0000000000000002 in ?? ()
#25 0x00000000ffffffff in ?? ()
#26 0x0000000000000015 in ?? ()
#27 0x00007fff865630b0 in ?? ()
#28 0x00007f50991b5e59 in ?? () from /opt/firefox-nightly/libxul.so
#29 0x0000000060000008 in ?? ()
#30 0x0000000000000002 in ?? ()
#31 0x00007fff865631a8 in ?? ()
#32 0x00007fff865641c0 in ?? ()
#33 0x00007fff86563160 in ?? ()
#34 0x00007f50980432e7 in ?? () from /opt/firefox-nightly/libxul.so
#35 0x00007fff865630e8 in ?? ()
#36 0x0000000286566200 in ?? ()
#37 0x00007f503990f8d8 in ?? ()
#38 0x00007fff865631a8 in ?? ()
#39 0x0000000100000007 in ?? ()
#40 0x00007f5000000008 in ?? ()
#41 0x0000000000000000 in ?? ()
Comment 1•3 years ago
Reporter | ||
Comment 2•3 years ago
(In reply to Andre Klapper from comment #1)
Please see https://support.mozilla.org/en-US/kb/troubleshoot-firefox-crashes-closing-or-quitting
It works with hwacc disabled.
During testing just now, although i had crashes, about:crashes doesn’t list anything dated today.
this is with arch linux+gnome+x11+amdgpu
Comment 3•3 years ago
Thanks for the report! Please open about:support in your address bar, click on "Copy text to clipboard" and paste it here.
Are you able to find a regression range? You should get a pushlog URL at the end:
$ pip3 install --user mozregression
$ ~/.local/bin/mozregression --good 95 --bad 2021-12-12
Reporter | ||
Comment 4•3 years ago
Reporter | ||
Comment 5•3 years ago
hm, so with a new (linux) user nightly starts fine using either Wayland or X11
the 2021-12-12 mozregression test also doesn’t crash
but even
firefox-nightly -P
crashes in my session
when hardware acceleration is disabled i still get a segfault in the debugger:
gdb --args firefox-nightly -P
even though
firefox-nightly -P
works
Comment 6•3 years ago
Does a crash reporter open? Do you have any recent unsent crash reports on about:crashes? Please submit and post some IDs (bp-XXXXX).
Is the crash reproducible if you run mozregression with your settings? For example:
$ ~/.local/bin/mozregression --launch 2021-12-12 --pref gfx.webrender.all:true fission.autostart:true general.autoScroll:true image.jxl.enabled:true layout.frame_rate:120 media.hardware-video-decoding.force-enabled:true mousewheel.default.delta_multiplier_y:200
Reporter | ||
Comment 7•3 years ago
no crash reporter and my last about:crashes is from the 9th, which was still nightly v96
mozregression still works with those args
Comment 8•3 years ago
Does it reproduce if you open 3 windows and then restart Firefox via about:restartrequired (browser.startup.page:3)?
$ ~/.local/bin/mozregression --launch 2021-12-12 --pref gfx.webrender.all:true fission.autostart:true general.autoScroll:true image.jxl.enabled:true layout.frame_rate:120 media.hardware-video-decoding.force-enabled:true mousewheel.default.delta_multiplier_y:200 browser.startup.page:3
Reporter | ||
Comment 9•3 years ago
now we’re getting somewhere
INFO: Last good revision: ecc17529126cd64d51c6f0c4842395a9ee93ec22
INFO: First bad revision: 799a0280b2137880417e093103f136643506ac60
799a0280b2137880417e093103f136643506ac60 crashes on startup
9eb74149f75b2444be4d13b049ce4d8dd4d894a5 crashes when opening a 2nd window (ctrl+n)
to reproduce i paste about:restartrequired upon startup, press ctrl+n twice and click the restart button with the mouse
the 3rd window overlaps the 1st
when i tried to figure out if it has to do with window overlap, where the cursor is or if it also happens with 2 windows the results were too inconsistent to make something out
i.e. it also managed to restart with 3 windows sometimes, but when i did it like described above immediately, it was deterministic
Comment 10•3 years ago
Thanks for this information. Unfortunately, nothing in the pushlog for that range sticks out or even seems remotely related :/
Just to confirm: This does not reproduce with a fresh profile, just your existing one?
Reporter | ||
Comment 12•3 years ago
mozregression doesn’t use anything from my user, does it?
Comment 13•3 years ago
The severity field is not set for this bug.
:jimm, could you have a look please?
For more information, please visit auto_nag documentation.
Comment 14•3 years ago
(In reply to norbert.pfeiler from comment #12)
mozregression doesn’t use anything from my user, does it?
Right, mozregression should use its own profile.
Also, am I understanding this correctly that you are using our official version of Firefox with Crash Reporting enabled, but this crash is not creating a crash report that shows up in about:crashes? If that is the case, that's yet another thing we should investigate, and we might be able to progress with a different build. Thanks in advance for checking!
Reporter | ||
Comment 15•3 years ago
I’m on Arch Linux and use https://aur.archlinux.org/cgit/aur.git/tree/PKGBUILD?h=firefox-nightly
which downloads https://download-installer.cdn.mozilla.net/pub/firefox/nightly/latest-mozilla-central
I think the only »modified« thing is that it has automatic updates disabled.
Since i have crash reports up until 2021-12-09 (which was still nightly 96) i’m guessing there is nothing explicitly preventing it.
Does mozregression have a crash reporter running (or can it provide some more crash info)?
If only for this issue, i could upload a coredump collected by systemd.
Comment 16•3 years ago
A core dump would be very useful, but don't upload it on the bug as it might contain sensitive information. Send it to me via e-mail and I'll analyze the crash.
Comment 17•3 years ago
when hardware acceleration is disabled i still get a segfault in the debugger:
Are you actually crashing or does GDB stop with SIGSYS?
Reporter | ||
Comment 18•3 years ago
it’s always SIGSEGV
e.g.
~> gdb --args firefox-nightly -P
[…]
Thread 175 "firefox-ni:gl0" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fff8cdfe640 (LWP 983222)]
0x00007fffa33d3fd4 in ?? () from /usr/lib/libEGL_mesa.so.0
Comment 19•3 years ago
I've tried extracting a stack trace from the core file that the reporter sent me, but sadly it went nowhere because Arch doesn't provide separate debug information for its packages (I should have remembered that, I already stumbled upon this issue in the past). I find it odd that we're tripping over a Mesa failure before we have a chance to set the exception handler and catch it.
That being said I found similar crashes also happening on Arch, see this query. They seem to be happening on Mesa 21.2.5.0, could you try a different version and see if the problem persists?
Reporter | ||
Comment 20•3 years ago
Since official binaries are used, aren’t there symbols available for it somewhere?
To try a different version of Mesa, you mean?
Comment 21•3 years ago
(In reply to norbert.pfeiler from comment #20)
Since official binaries are used, aren’t there symbols available for it somewhere?
No, I think that it's possible to build them if you build packages locally (see this bug) but I don't think they're distributed for pre-built packages. Or at least I couldn't find them.
To try a different version of Mesa, you mean?
Yes, to figure out if it's an issue on their side rather than on ours.
Comment 22•3 years ago
Confirming the bug in the meantime.
Reporter | ||
Comment 23•3 years ago
firefox-nightly is not built (compiled), it’s just downloaded and extracted
i was thinking the symbols in the stripped binary can be matched to those in the unstripped version
but anyway, i can also disable stripping
and i can confirm Mesa versions change things:
20.3.4-3 works
21.0.0-1 doesn’t
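The two data points above put the change in behaviour somewhere between Mesa 20.3.4 and 21.0.0. A minimal sketch for checking which side of that boundary a package version falls on; the helper names are hypothetical, the 21.0.0 threshold is an assumption drawn only from these two versions, and `sort -V` assumes GNU coreutils:

```shell
# Hypothetical helper: succeed if version $1 is greater than or equal to version $2.
# Uses `sort -V` (version sort); the smaller version ends up on the first line.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Hypothetical helper: report which side of the assumed 21.0.0 boundary
# a Mesa package version such as "21.0.0-1" falls on.
mesa_boundary_side() {
    ver=${1%%-*}   # drop the package release suffix, e.g. "-1"
    if version_ge "$ver" "21.0.0"; then
        echo ">=21.0.0"
    else
        echo "<21.0.0"
    fi
}
```

For example, `mesa_boundary_side 20.3.4-3` lands on the working side and `mesa_boundary_side 21.0.0-1` on the crashing side; versions between the two were not tested in this report.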
Comment 24•3 years ago
EGL is enabled on X11 for Mesa >= 21.
Ideas: Could this be one of the following?
- bug 1740260/bug 1735905 (https://bugs.debian.org/998108, https://bugzilla.opensuse.org/show_bug.cgi?id=1192067)
- bug 1741997
- bug 1745805
- bug 1743551, bug 1741956
- Can the crash be fixed by starting Firefox with the mesa_glthread=false environment variable?
$ mesa_glthread=false path/to/firefox
If so, it's bug 1670545. (bug 1744389 likely made force-disabling mesa_glthread less effective; it might no longer work when glthread is set in a Mesa config file, which would cause a startup crash.)
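The environment-variable test in the last idea can be wrapped for repeated use. A minimal sketch, assuming a POSIX shell; the function name `run_without_glthread` is hypothetical and not part of Firefox or Mesa:

```shell
# Hypothetical helper: run any command with Mesa's glthread option forced off.
# The variable is set only for the child process, not for the calling shell.
run_without_glthread() {
    env mesa_glthread=false "$@"
}

# Example: run_without_glthread path/to/firefox
```

Because the override lives only in the child's environment, it is easy to compare a run with and without it from the same terminal.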
Comment 25•3 years ago
(In reply to norbert.pfeiler from comment #23)
firefox-nightly is not built (compiled), it’s just downloaded and extracted
i was thinking the symbols in the stripped binary can be matched to those in the unstripped version
but anyway, i can also disable stripping
Yes, I can retrieve the debug information for that Firefox binary, but the crash is happening in libEGL_mesa.so.0 so without the debug information for that library I can't get a proper stack trace.
Reporter | ||
Comment 26•3 years ago
using mesa_glthread=false works
but fyi it behaves the same regardless of X11 or Wayland session
idk if that’s only because it’s running in xwayland at the end
Comment 27•3 years ago
(In reply to norbert.pfeiler from comment #26)
using mesa_glthread=false works
Just to be sure: This crash can be fixed by starting Nightly with mesa_glthread=false env var, right?
Please try to find out why it is even enabled on your system. It should be false by default.
Reporter | ||
Comment 28•3 years ago
Just to be sure: This crash can be fixed by starting Nightly with mesa_glthread=false env var, right?
yes
Please try to find out why it is even enabled on your system. It should be false by default.
any pointers?
~> sudo grep -lr mesa_glthread /etc/* /usr/share/*
only gives /usr/share/drirc.d/00-mesa-defaults.conf
and i don’t find anything suspicious in there
~> env mesa_glthread=false glinfo
also gives the ATTENTION output, so it seems to be set there as well
Reporter | ||
Comment 29•3 years ago
ha!
alright, i have found a ~/.drirc that sets it
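The earlier grep covered only /etc and /usr/share, which is why the per-user file was missed. A minimal sketch that checks an explicit list of candidate files; the helper name is hypothetical, and the candidate list in the example is an assumption (in particular /etc/drirc is not mentioned in this bug, and other Mesa versions may read further locations):

```shell
# Hypothetical helper: print every given file that mentions mesa_glthread.
# Files that do not exist are skipped silently.
find_glthread_configs() {
    for f in "$@"; do
        if [ -f "$f" ]; then
            grep -l 'mesa_glthread' "$f" || true
        fi
    done
}

# Example (system defaults plus the per-user file):
# find_glthread_configs /usr/share/drirc.d/*.conf /etc/drirc "$HOME/.drirc"
```

A hit in the defaults file alone is expected, since that file merely declares the option; a hit in a per-user file is the more interesting result.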
Comment 30•3 years ago
What did this line look like? Was it upper- and lowercase mixed?
Comment 31•3 years ago
bug 1744389 should be backed out of Beta and Nightly and possibly not attempted again this way. Better be absolutely sure.
Maybe MESA_DEBUG=silent could be set if MESA_DEBUG env var is not set?
https://gitlab.freedesktop.org/mesa/mesa/-/blob/946bd90a097e8bf4f060f7a18d04f1df1c23275f/src/util/xmlconfig.c#L391
https://gitlab.freedesktop.org/mesa/mesa/-/blob/946bd90a097e8bf4f060f7a18d04f1df1c23275f/src/util/xmlconfig.c#L68-76
https://gitlab.freedesktop.org/search?search=%22MESA_DEBUG%22&group_id=1155&project_id=176&scope=&search_code=true&snippets=false&repository_ref=main&nav_source=navbar
https://docs.mesa3d.org/envvars.html
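As an illustration of the proposal only, the "set it unless the user already did" logic might look like the following in a launcher script. This is a sketch under assumptions: the function name is hypothetical, and whether MESA_DEBUG=silent actually suppresses the warning in question is the commenter's suggestion, not something verified here.

```shell
# Hypothetical helper: default MESA_DEBUG to "silent" only when the user
# has not set MESA_DEBUG at all; an existing value (even empty) is kept.
default_mesa_debug_silent() {
    if [ -z "${MESA_DEBUG+x}" ]; then
        MESA_DEBUG=silent
    fi
    export MESA_DEBUG
}
```

Checking for "unset" rather than "empty" keeps an explicit user choice intact, which matters for anyone who sets MESA_DEBUG deliberately while debugging.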
Comment 32•3 years ago
[Tracking Requested - why for this release]:
Please consider backing out bug 1744389 from Beta 96 and Nightly 97 because it reintroduced a startup crash without crash report. It was previously fixed by bug 1670545.
Reporter | ||
Comment 33•3 years ago
What did this line look like? Was it upper- and lowercase mixed?
do you mean the .drirc?
it contained
<driconf>
<device screen="0" driver="radeonsi">
<application name="Default">
<option name="mesa_glthread" value="true" />
</application>
</device>
</driconf>
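For anyone who wants to keep such a file but turn the option off, a minimal sketch of flipping the value with sed follows. The helper name is hypothetical, and the pattern assumes the attribute order and quoting shown above; deleting the option line entirely is the simpler alternative.

```shell
# Hypothetical helper: read a drirc on stdin and write it to stdout with
# mesa_glthread changed from "true" to "false". Other options are untouched.
disable_glthread() {
    sed 's/\(name="mesa_glthread"[[:space:]]*value="\)true"/\1false"/'
}

# Example: disable_glthread < "$HOME/.drirc" > "$HOME/.drirc.new"
```

Writing to a separate file first leaves the original in place for comparison before it is replaced.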
Comment 34•3 years ago
(In reply to Darkspirit from comment #32)
[Tracking Requested - why for this release]:
Please consider backing out bug 1744389 from Beta 96 and Nightly 97 because it reintroduced a startup crash without crash report. It was previously fixed by bug 1670545.
Robert, what are your thoughts on this?
Comment 35•3 years ago
(In reply to Ryan VanderMeulen [:RyanVM] from comment #34)
(In reply to Darkspirit from comment #32)
[Tracking Requested - why for this release]:
Please consider backing out bug 1744389 from Beta 96 and Nightly 97 because it reintroduced a startup crash without crash report. It was previously fixed by bug 1670545.
Robert, what are your thoughts on this?
We can back out on 96, but I'd like to keep it in nightly.
I was already fearing something like this would happen. The option mesa_glthread is not enabled for a reason. Users who unconditionally enable it are asking for trouble. The fix from bug 1670545 adds a noisy warning for almost all our users and fills up our bug reports, so I'm against keeping it forever. But we may find some better way to handle this.
Comment 36•3 years ago
(In reply to Robert Mader [:rmader] from comment #35)
But we may find some better way to handle this.
How about comment 31?
Reporter | ||
Comment 37•3 years ago
How about asserting (with a nice message) that mesa_glthread is disabled rather than crashing?
Now that i know what the issue is, I’m fine with removing the setting for my user.
I think this was created by some configuration utility wrt mangohud (gl/vulkan stats overlay).
Comment 38•3 years ago
(In reply to norbert.pfeiler from comment #37)
How about asserting (with a nice message) that mesa_glthread is disabled rather than crashing?
The issue is that there are multiple ways to enable it - we didn't catch this one so the driver went crashing :/
Comment 39•3 years ago
(In reply to Darkspirit from comment #36)
(In reply to Robert Mader [:rmader] from comment #35)
But we may find some better way to handle this.
How about comment 31?
Hm, not a fan of MESA_DEBUG=silent - the point is that mesa usually only prints stuff if it wants users to be aware that there's something odd with their setup. That's the reason why they print a loud warning when setting mesa_glthread via env var - sadly it doesn't when the option is set via driconf. Setting MESA_DEBUG=silent would mean more hard-to-debug reports because people wouldn't be able to see such warnings.
Maybe there's some simple way to detect the setting; overall, however, I'm somewhat against us trying to work around all kinds of buggy setups. IIUC enabling the gpu process should prevent crashes in such a scenario and allow a fallback to software rendering. Given that mesa_glthread seems to work fine on Wayland (where enabling the gpu process is way harder), I hope we can do that soon on X11.
Comment 40•3 years ago
The regressing bug 1744389 was backed out for 96.0rc2; see below:
Backout link
Comment 41•3 years ago
I believe I'm now seeing something related to this with Ubuntu 20.04.3. It doesn't crash, so much as the profile manager just kind of hangs and renders a transparent window.
Comment 42•3 years ago
(In reply to Christopher Smith from comment #41)
If you have the Nvidia driver installed, you are seeing bug 1745172.
Comment 43•3 years ago
(In reply to Darkspirit from comment #42)
(In reply to Christopher Smith from comment #41)
If you have the Nvidia driver installed, you are seeing bug 1745172.
I do, and it was. Thanks.
Comment 44•3 years ago
The log entry "ATTENTION: default value of option mesa_glthread overridden by environment." was EGL/X11-only.
- As no crash report is generated, we don't know how many X11 users ran into bug 1670545.
- Could X11 users who are annoyed by the log entry just set MESA_DEBUG=silent themselves (at least for the moment) or switch to Wayland?
- Nightly+Beta Xwayland users have been switched to Wayland now (bug 1749174).
- bug 1653444: Shouldn't X11 and Wayland be as close as possible? IIUC, a corresponding Wayland GPU process (bug 1732951) would break gfx.webrender.compositor.force-enabled (bug 1617498) as it is right now because the parent process would need to act as Wayland proxy server?
Comment 45•3 years ago
This bug can be closed as bug 1744389 was backed out from trunk. In case we don't find a nice solution it'll be another case of "fixed by Wayland" ¯_(ツ)_/¯
Comment 46•3 years ago
Backed out for 97.0b7 also.
https://hg.mozilla.org/releases/mozilla-beta/rev/b7ee40d2052c