Crash in [@ g_log_writer_default]
Categories
(Core :: Widget: Gtk, defect)
Tracking
()
Tracking | Status | |
---|---|---|
firefox-esr68 | --- | unaffected |
firefox-esr78 | --- | unaffected |
firefox77 | --- | unaffected |
firefox78 | --- | unaffected |
firefox79 | --- | wontfix |
firefox80 | --- | fixed |
People
(Reporter: calixte, Assigned: wekerbugs)
References
(Blocks 1 open bug, Regression)
Details
(Keywords: crash, regression)
Crash Data
Attachments
(3 files)
This bug is for crash report bp-d609cc2b-da42-4591-a556-c371b0200615.
Top 10 frames of crashing thread:
0 libglib-2.0.so.0 g_log_writer_default
1 libglib-2.0.so.0 g_log_structured_array
2 libglib-2.0.so.0 g_log_structured_standard
3 libgdk-3.so.0 gdk_x11_grab_server
4 libX11.so.6 _XError
5 libX11.so.6 _XFreeX11XCBStructure
6 libX11.so.6 _XReply
7 libX11.so.6 XGetWindowProperty
8 libxul.so nsWindow::GetWorkspaceID widget/gtk/nsWindow.cpp:1908
9 libxul.so nsGlobalWindowInner::GetWorkspaceID dom/base/nsGlobalWindowInner.cpp:6615
There are 2 crashes (from 1 installation) in nightly 79 with buildid 20200614092422. In analyzing the backtrace, the regression may have been introduced by patch [1] to fix bug 1628749.
[1] https://hg.mozilla.org/mozilla-central/rev?node=70c8e33b5710
Reporter | ||
Updated•4 years ago
|
Reporter | ||
Updated•4 years ago
|
While I cannot reproduce the error exactly, I can induce this failstate by calling XGetWindowProperty with an intentional wrong atom or window.
I strongly suspect, that this fails on non EWHM compliant window managers, as there are no checks if the atom returned by XInternAtom is valid. This would explain, why this crash is not reproduceable and relativly rare as one needs an EWHM non compliant WM. After a bit of googling I did not find any such WM.
To be safe there should be error handling added for invalid atoms returned by XInternAtom.
Martin Stránský can you kindly review my patch again?
This could also be unrelated as X erors are submitted asynchronously.
I see some crashreports [1] that are eerily similar but in a version before the patch [2].
[1] https://hg.mozilla.org/mozilla-central/rev?node=70c8e33b5710
[2] https://crash-stats.mozilla.org/report/index/10224663-3992-49cc-9c3c-001de0200608
Updated•4 years ago
|
Updated•4 years ago
|
Updated•4 years ago
|
Comment 5•4 years ago
|
||
bugherder |
Comment 6•4 years ago
|
||
The issue can still be observed with recent builds, see e.g. this report
wekerbugs, can you take another look, please?
Comment 7•4 years ago
|
||
I think we also need check wmWindow != nullptr
Comment 8•4 years ago
|
||
Updated•4 years ago
|
Assignee | ||
Comment 10•4 years ago
|
||
(In reply to Martin Stránský [:stransky] from comment #7)
I think we also need check wmWindow != nullptr
That should only happen, when the wm is not ewmh compliant. Hmmn, I think something else is going on. Normally there is a log produced in the console when firefox crashes in this way but I cannot reproduce the crash.
Hopefully the wmWindow check fixes everything but I would not be suprised if it did not...
Assignee | ||
Comment 11•4 years ago
|
||
(In reply to wekerbugs from comment #10)
(In reply to Martin Stránský [:stransky] from comment #7)
I think we also need check wmWindow != nullptr
That should only happen, when the wm is not ewmh compliant. Hmmn, I think something else is going on. Normally there is a log produced in the console when firefox crashes in this way but I cannot reproduce the crash.
Hopefully the wmWindow check fixes everything but I would not be suprised if it did not...
Forgot to say: That should only happen, when the wm is not ewmh compliant but pretends to be.
Comment 12•4 years ago
|
||
bugherder |
Comment 13•4 years ago
|
||
We're still seeing this in 79 beta and 80 nightly, e.g. bp-d816d43c-ad1f-4d9e-8d95-141b50200702.
Comment 14•4 years ago
|
||
looking at _g_log_abort | g_log_writer_default crash stat signatures there's a some general X error which leads to _gdk_x11_display_error_event() and it's triggered by various X routines. Given the asynchronous nature of X11 protocol we can't say XGetWindowProperty() really causes that.
Comment 15•4 years ago
|
||
Assignee | ||
Comment 16•4 years ago
|
||
This crash signature also appeared before but less often.
Now with every GetWorkspaceID() we invoke XGetWindowProperty() and can catch the error.
I really don't know how this can still crash otherwise.
Maybe the when the WM exits/crashes before Firefox?
One optimization that could reduce the frequency of crashes with XGetWindowProperty() is to only check DesktopUsesDynamicWorkspaces() once and save the result. That would fix it if it is something like the WM exiting before Firefox but idk if it is worth it.
Comment 17•4 years ago
|
||
(In reply to Razvan Maries from comment #12)
Looks like this regressed the number of crashes. Should we back this out and look for a different fix?
Comment 18•4 years ago
|
||
Should we backout the patches that landed? It seems the crash rate has gotten worse since they landed.
Assignee | ||
Comment 19•4 years ago
|
||
(In reply to Jim Mathies [:jimm] from comment #17)
(In reply to Razvan Maries from comment #12)
Looks like this regressed the number of crashes. Should we back this out and look for a different fix?
The linked changeset cannot regress the number of crashes, as it is only a nullpointer check. It backs out before the function is called that could lead to g_log_writer_default.
Equally https://phabricator.services.mozilla.com/D79728 cannot regress the number of crashes, as it just checks if integers are zero and also backs out of the function.
So if this should be backed out then all of them must go:
https://hg.mozilla.org/mozilla-central/rev?node=70c8e33b5710
https://hg.mozilla.org/mozilla-central/rev/00b1e2a78573e809ef86a70d72b9cb7253f0caea
https://hg.mozilla.org/mozilla-central/rev/b3d9f2fe4edd
I suspect that the patches are not the root cause but I cannot be sure as I cannot reproduce this and nobody seemed to report a bug for this.
Comment 20•4 years ago
|
||
So if this should be backed out then all of them must go:
https://hg.mozilla.org/mozilla-central/rev?node=70c8e33b5710
The other patches make sense as they're null checks, but this seems unrelated, or am I missing something?
Updated•4 years ago
|
Assignee | ||
Comment 21•4 years ago
|
||
(In reply to Gian-Carlo Pascutto [:gcp] from comment #20)
So if this should be backed out then all of them must go:
https://hg.mozilla.org/mozilla-central/rev?node=70c8e33b5710The other patches make sense as they're null checks, but this seems unrelated, or am I missing something?
No this introduced the function, that invokes XGetWindowProperty which sometimes somehow leads to g_log_writer_default.
The other two patches introduce null checks and cannot regress the number of patches, therefore if something should be backed out, then all 3. That would of course regress Bug 1628749.
Comment 22•4 years ago
|
||
I think we should call GetWindowManagerName() only when XDG_CURRENT_DESKTOP is empty, which performs the check only for "suspected" desktops. We may also provide a preference to force disable the workspace restoration. I'll look at it this week.
Assignee | ||
Comment 23•4 years ago
|
||
(In reply to Martin Stránský [:stransky] from comment #22)
I think we should call GetWindowManagerName() only when XDG_CURRENT_DESKTOP is empty, which performs the check only for "suspected" desktops. We may also provide a preference to force disable the workspace restoration. I'll look at it this week.
As I've noted in other places XDG_CURRENT_DESKTOP is not for the WM but for the whole DE.
I think a preference to force disable the workspace restoration would then be a must.
Maybe detecting the WM wouldn't even be needed with this preference as it is rather ugly i.e. every "incompatible" WM must be added manually.
Comment 24•4 years ago
|
||
Comment 25•4 years ago
|
||
Comment 26•4 years ago
|
||
bugherder |
Comment 27•4 years ago
|
||
I haven't tested this, but doesn't the inner bool usesDynamicWorkspaces
shadow the outer one now and therefore the mutter setting isn't respected? (I'm a bit surprised there's no compiler warning though if that's true...)
Comment 28•4 years ago
|
||
(In reply to Johannes Pfrang [:johnp] from comment #27)
I haven't tested this, but doesn't the inner
bool usesDynamicWorkspaces
shadow the outer one now and therefore the mutter setting isn't respected? (I'm a bit surprised there's no compiler warning though if that's true...)
Ahh, you're right. I'll fix that, Thanks.
Updated•4 years ago
|
Comment 29•4 years ago
|
||
Reports with this signature seem to be increasing again since July, with reports from all channels. Should we reopen this or file a new bug?
Comment 30•4 years ago
|
||
Please file a new bug and attach the new signatures here.
Thanks.
Updated•4 years ago
|
Updated•4 years ago
|
Updated•4 years ago
|
Description
•