Closed
Bug 1401721
Opened 7 years ago
Closed 5 years ago
Crash in mozilla::dom::ContentChild::~ContentChild
Categories
(Core :: DOM: Content Processes, defect, P3)
Tracking
()
RESOLVED
WORKSFORME
People
(Reporter: marcia, Unassigned)
References
(Blocks 1 open bug)
Details
(Keywords: crash, regression, topcrash, Whiteboard: [AV:Webroot SecureAnywhere][AV:K7][inj+])
Crash Data
Attachments
(2 files)
This bug was filed from the Socorro interface and is
report bp-47d0651d-bb3d-4dc3-a66f-d77b20170907.
=============================================================
Seen while looking at crash stats: http://bit.ly/2wHyMCJ. Crashes started with build 20170905220108. 20 crashes/23 installs according to crash stats.
Possible regression range based on crash stats: https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=632e42dca494ec3d90b70325d9c359f80cb3f38a&tochange=f64e2b4dcf5eec0b4ad456c149680a67b7c26dc4
MOZ_CRASH(Content Child shouldn't be destroyed.)
Comment 1•7 years ago
Hi Bill,
Do you have any idea about how to deal with this bug?
Flags: needinfo?(wmccloskey)
I looked through the regression range. The only suspicious thing I see is bug 1229829. I'll needinfo Alex in case he has ideas.
We could try to add some additional assertions to figure out what's causing this. It seems like we're exiting our content process event loop in a way we don't expect. Normally we go through here:
http://searchfox.org/mozilla-central/rev/f6dc0e40b51a37c34e1683865395e72e7fca592c/dom/ipc/ContentChild.cpp#2333
We could assert that no one calls MessageLoop::Quit() on the main thread in a content process in opt builds. That might turn something up.
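A minimal sketch of the kind of check suggested above, written against hypothetical stand-in types rather than the real Gecko classes. `QuitGuard`, `ProcessType`, and `shutdownExpected` are names invented for illustration; the real check would presumably live inside `MessageLoop::Quit()` and use a release assertion, so that the crash report carries the stack of the unexpected caller.

```cpp
#include <cassert>

// Hypothetical stand-ins; none of these names are Gecko APIs.
enum class ProcessType { Parent, Content };

struct QuitGuard {
  ProcessType process;    // which kind of process we are running in
  bool onMainThread;      // whether Quit() was called on the main thread
  bool shutdownExpected;  // set only by the normal shutdown path

  // Returns true if a Quit() call is one we expect to see. A real check
  // would crash instead of returning false (assumption: via a release
  // assertion), which is what makes it useful in opt builds.
  bool QuitIsExpected() const {
    if (process != ProcessType::Content || !onMainThread) {
      return true;  // the sketch only polices content-process main threads
    }
    return shutdownExpected;
  }
};
```

The flag models the idea that the normal shutdown path marks itself before quitting, so any other caller stands out.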
Flags: needinfo?(wmccloskey) → needinfo?(agaynor)
Comment 3•7 years ago
I don't understand how alternate desktops would trigger this.
There's one other regression caused by them, bug 1400637. That's been determined to be a bad interaction with some anti-virus software. Sampling a handful of the crash reports (I don't know how to do it automatically), I see that they almost all have either k7pswsen.dll or WRusr.dll loaded, which are the two DLLs associated with the AV in the other crash, so it seems like a decent bet that this is another formulation of that problem.
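As a rough illustration of how the manual sampling could be automated, the sketch below counts reports that loaded at least one suspect module. It assumes the module lists have already been extracted from the crash reports into memory; how to fetch them from crash-stats is not covered, and `ModuleList` and `CountReportsWithModules` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Each report is modeled as the list of module file names it had loaded.
using ModuleList = std::vector<std::string>;

// Lowercases a file name so comparisons ignore case, as on Windows.
std::string ToLower(std::string s) {
  std::transform(s.begin(), s.end(), s.begin(), [](unsigned char c) {
    return static_cast<char>(std::tolower(c));
  });
  return s;
}

// Counts the reports that loaded at least one of the suspect modules.
std::size_t CountReportsWithModules(const std::vector<ModuleList>& reports,
                                    const std::vector<std::string>& suspects) {
  std::size_t hits = 0;
  for (const ModuleList& modules : reports) {
    bool found = false;
    for (const std::string& module : modules) {
      for (const std::string& suspect : suspects) {
        if (ToLower(module) == ToLower(suspect)) {
          found = true;
        }
      }
    }
    if (found) {
      ++hits;
    }
  }
  return hits;
}
```

Dividing the result by the number of sampled reports gives the share of crashes with one of the two DLLs present.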
Alternate desktops was disabled on beta already: https://bugzilla.mozilla.org/show_bug.cgi?id=1402340#c9, and in that bug we're exploring either blocking the injection of DLLs into our process, or getting the AV vendor to fix their software.
Flags: needinfo?(agaynor)
Comment 4•7 years ago
Caused by bad interaction between anti-virus and a new content process isolation feature. We're working with vendors in related bug 1400637.
Blocks: injecteject
Priority: P2 → P3
Updated•7 years ago
Whiteboard: [AV:Webroot SecureAnywhere]
Updated•7 years ago
Whiteboard: [AV:Webroot SecureAnywhere] → [AV:Webroot SecureAnywhere][AV:K7]
Comment 5•7 years ago
We've only had 3 reports in the past week, which is good. Let's keep an eye on this as 58 rolls into Beta.
Comment 6•7 years ago
status-firefox59:
--- → ?
Updated•7 years ago
Whiteboard: [AV:Webroot SecureAnywhere][AV:K7] → [AV:Webroot SecureAnywhere][AV:K7][inj+]
This is showing up at the top of early Nightly crashes for 62, but for only a few installs. Looks like Nightly 61 was affected too.
Comment 8•6 years ago
This crash signature recently spiked from 1 to 200 crashes per day (May 6 - 10) and has become the #1 Top Crasher on Nightly 62.0. It affects Windows, Mac and Linux.
Top Crashers for Firefox 62.0a1
Top 50 Crashing Signatures. 7 days ago
Rank 1, 6.42% (diff -2.63%), mozilla::dom::ContentChild::~ContentChild: 201 crashes (Win 39, Mac 20, Lin 142), 37 installs, 0 GC, first appearance 2012-11-08
Keywords: topcrash
Comment 9•6 years ago
I'm able to reproduce this consistently in 61/mac by loading this embed-twitch.html page via File > Open File... (no crash when it's served over the network).
Comment 10•6 years ago
I'm not sure I understand how this ever *doesn't* crash. When the nsAutoPtr at [1] goes out of scope it will destroy the ContentProcess, which destroys its ContentChild member. Do we normally just exit instead of returning from the run loop?
[1] https://searchfox.org/mozilla-central/rev/93d2b9860b3d341258c7c5dcd4e278dea544432b/toolkit/xre/nsEmbedFunctions.cpp#652
Comment 11•6 years ago
(In reply to Jed Davis [:jld] (⏰UTC-6) from comment #10)
> I'm not sure I understand how this ever *doesn't* crash. When the nsAutoPtr
> at [1] goes out of scope it will destroy the ContentProcess, which destroys
> its ContentChild member. Do we normally just exit instead of returning from
> the run loop?
ContentChild::ActorDestroy() calls QuickExit(), which does an exit in non-debug builds.
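The two paths discussed here can be sketched with stand-in types. `FakeContentProcess` and `FakeContentChild` are hypothetical and only model the ownership described in comment #10; a flag replaces the MOZ_CRASH in the destructor, `std::unique_ptr` stands in for nsAutoPtr, and a deliberate leak stands in for exiting the process before the stack unwinds.

```cpp
#include <cassert>
#include <memory>

// Set when the child's destructor runs; the real destructor would MOZ_CRASH.
static bool gChildDestructorRan = false;

struct FakeContentChild {
  ~FakeContentChild() { gChildDestructorRan = true; }
};

struct FakeContentProcess {
  FakeContentChild mContent;  // a member, so it is destroyed with its owner
};

// Models returning from the run loop: the owning pointer goes out of scope,
// which destroys the process object and then its child member.
void RunAndReturn() {
  std::unique_ptr<FakeContentProcess> process(new FakeContentProcess());
  // ... the message loop would run here ...
}

// Models the QuickExit() path: the process ends before destructors run.
// Releasing the pointer (and leaking the object on purpose) stands in for
// the exit, since a real exit would end the example program too.
void RunAndQuickExit() {
  std::unique_ptr<FakeContentProcess> process(new FakeContentProcess());
  (void)process.release();
}
```

Only `RunAndReturn()` sets the flag, which mirrors why the crash appears when the run loop returns instead of the process exiting.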
Comment 12•6 years ago
Comment #2 also answers the question asked in comment #10, now that I've read it more closely. The suggestion in the last sentence of comment #2 might be useful combined with the repro in comment #9.
(But I'm still wondering why the crash spikes line up with the past two release dates.)
Comment 13•6 years ago
Not sure if this is helpful.
Had crash 1e970bec-fe93-4040-8211-166fa0180706 and it linked to this bug.
I tend to have this crash, or a similar one, every time I update Firefox in the background. For example, I just upgraded from 60.0.2 to 61 using the built-in package manager in neon. It seems Firefox doesn't do graceful upgrades like Chrome.
In this instance it didn't completely crash the browser, but it would load new webpages (I'm guessing this has to do with new/existing child processes) and would show the crashed tab page. In the past it would send 1 crash report per upgrade. This time it was more like 6 before I realised I had updated Firefox and needed to restart it.
Comment 14•6 years ago
There is a known problem with updates (especially with Linux distribution packages, or with Firefox's own updater if multiple profiles are running) where the old browser tries to create a child process by launching the new executable. The spikes around release times suggested this had something to do with that somehow, but “somehow” is the key word there.
Bug 1366808 added code to handle that case more gracefully and is shipping in 62, which I think means we won't get decent UI for this until the 62->63 upgrade. The 61->62 upgrade will detect the lack of `-parentBuildID` and exit relatively early… but apparently after the message loop is created, so it might still run into this; I'd need to stare at the code some more. For 62->63 and later, the child should call QuickExit, which will hopefully avoid this bug: https://searchfox.org/mozilla-central/rev/c579ce13ca7864c5df9711eda730ceb00501aed3/dom/ipc/ContentChild.cpp#688
Before 62, we should be hitting the MOZ_RELEASE_ASSERT added in bug 1345978, which would crash the parent process and might explain content processes exiting in an unexpected way (maybe; I haven't tried tracing through the IPC code to see exactly what happens when the other end hangs up the connection). But that doesn't explain reports of tab crashes and no browser crash, like comment #13.
There's also bug 1463960, which was filed about the Linux distribution case but also about media plugin processes, which weren't covered by the change in bug 1366808.
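A hedged sketch of the build ID check described in this comment: a newly launched child compares the ID passed by its parent against its own and bails out early on a mismatch or when the flag is absent. Only the `-parentBuildID` flag name comes from the comment; the argument layout (flag followed by value), `CheckParentBuildID`, and `StartupDecision` are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstring>
#include <string>

enum class StartupDecision { Continue, ExitEarly };

// Assumption: the parent passes "-parentBuildID <id>" as two argv entries.
StartupDecision CheckParentBuildID(int argc, const char* const* argv,
                                   const std::string& ownBuildID) {
  for (int i = 1; i + 1 < argc; ++i) {
    if (std::strcmp(argv[i], "-parentBuildID") == 0) {
      // Same build on both sides: safe to continue starting up.
      return ownBuildID == argv[i + 1] ? StartupDecision::Continue
                                       : StartupDecision::ExitEarly;
    }
  }
  // Flag missing: the parent predates the check, so it is an older build.
  return StartupDecision::ExitEarly;
}
```

In this sketch the caller decides how to exit; per the comment above, the intent for 62->63 and later is that the child calls QuickExit rather than unwinding through the destructor that crashes.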
Reporter | ||
Comment 15•6 years ago
Adding 63 as affected, as this has risen to the top browser crash in nightly: https://crash-stats.mozilla.com/topcrashers/?product=Firefox&version=63.0a1
status-firefox63:
--- → affected
Updated•6 years ago
tracking-firefox63:
--- → +
Reporter | ||
Comment 16•6 years ago
So I am not sure we have to track this any longer, as the crashes stopped in 63 nightly in the 20180715014912 build. But it is interesting in the 20180714102053 build that we had 182 crashes/47 installs.
Comment 17•6 years ago
(In reply to Marcia Knous [:marcia - needinfo? me] from comment #16)
> So I am not sure we have to track this any longer, as the crashes stopped in
> 63 nightly in the 20180715014912 build. But it is interesting in the
> 20180714102053 build that we had 182 crashes/47 installs.
Good news then, untracking :)
tracking-firefox63:
+ → ---
Comment 18•6 years ago
I had the same crash too.
29a33eaf-418a-4ab3-96fa-b7e8a0180907
Comment 19•6 years ago
PoC for crashing!
Comment 20•6 years ago
Hello, I get this issue too.
Report ID is bp-4abf29af-b161-4d5a-8bc2-6c4a90180911
What I do:
I use Ubuntu 16.4 and try to open a local file stored at the path:
/home/developer/ConanCache/fep_sdk/2.0.1/aev25/testing/package/b5525a6e08c3997401fd88eb2c65832150657452/doc/fep-sdk.html
the content is really simple:
<html>
<head>
<meta http-equiv="refresh" content="0; URL=html/index.html"
</head>
</html>
It is a Doxygen documentation, in my case.
Hope it helps.
There are no crashes on 62 release or 63 beta. From comment 14 it sounds like we may still see this crash when people update, so I'll leave it marked affected for 63 for now. The later comments are all about 61 or older versions.
status-firefox64:
--- → ?
Comment 22•6 years ago
No crashes on 63; we have a few crashes on 62.0.3 only. If we don't have new crashes in a couple of weeks, we should probably close it as WFM.
Reporter | ||
Comment 23•5 years ago
No crashes in recent builds, closing out as WFM.
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → WORKSFORME