Closed Bug 1329966 Opened 8 years ago Closed 8 years ago

Crash in shutdownhang | mozilla::CondVar::Wait | nsEventQueue::GetEvent | nsThread::nsChainedEventQueue::GetEvent

Categories

(Core :: DOM: Workers, defect, P1)

x86
Windows 8
defect

Tracking

RESOLVED WORKSFORME
Tracking Status
firefox52 + affected
firefox53 --- affected
firefox54 --- affected

People

(Reporter: marcia, Unassigned)

References

Details

(Keywords: crash)

Crash Data

This bug was filed from the Socorro interface and is report bp-91f22727-5a65-4e4e-8b15-c97f12170110.
=============================================================

Seen while looking at Aurora crash stats. This is the #3 overall browser crash on Aurora in the last seven days: http://bit.ly/2jq2wgY. Although it affects 53, there are only 2 crashes there. Perhaps we should figure out why the rate is so much higher on Aurora.

Comments: "Crash two times when doing a video interview for a job opening... Thanks!"
I only looked through a couple of these crashes, but it looks like the common thread here is that we're waiting on some worker thread to finish running JS, usually in the form of timeouts, but sometimes handling DOM events. I don't know that there's much we can do about this; would it be feasible to simply forcibly terminate JS running on worker threads when we shut down, so we don't have to wait for it? ISTR that we do something like that for normal content...?
Component: XPCOM → DOM: Workers
Flags: needinfo?(bkelly)
Flags: needinfo?(amarchesini)
I don't think we can safely nuke the thread without orphaning a bunch of c++ objects and breaking their invariants. We need to figure out why the worker thread is not completing. Andrea, do you think it could be related to that infinite GC loop we saw in a previous bug?
Flags: needinfo?(bkelly)
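To illustrate the trade-off discussed in the two comments above, here is a minimal sketch in plain Python. It is not Gecko code; every name in it (cooperative_worker, stuck_worker, shut_down) is hypothetical. It assumes a worker that checks a stop flag between slices of work, so its cleanup still runs, and contrasts it with a worker that never returns, where the waiting thread can only time out.

```python
import threading


def cooperative_worker(stop_requested, log):
    """Runs short slices of work and checks the stop flag between them."""
    try:
        while not stop_requested.wait(timeout=0.01):
            pass  # stand-in for one slice of script or event handling
    finally:
        # Cleanup runs because the thread unwinds on its own.
        log.append("cleaned up")


def stuck_worker(release, log):
    """Never looks at a stop flag; stands in for work that does not return."""
    release.wait()  # blocks until the demo releases it
    log.append("released")


def shut_down(thread, stop_requested, timeout_s):
    """Request a stop, wait up to timeout_s, report whether the thread ended."""
    stop_requested.set()
    thread.join(timeout_s)
    return not thread.is_alive()


log = []

stop = threading.Event()
good = threading.Thread(target=cooperative_worker, args=(stop, log))
good.start()
good_finished = shut_down(good, stop, timeout_s=2.0)

release = threading.Event()
bad = threading.Thread(target=stuck_worker, args=(release, log), daemon=True)
bad.start()
bad_finished = shut_down(bad, threading.Event(), timeout_s=0.1)

release.set()  # let the demo thread exit so the script ends cleanly
bad.join()
print(good_finished, bad_finished)
```

The second case is the analogue of the hang here: the waiter cannot make the thread finish, it can only notice that it has not.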
Yes, it can be. I'll keep my NI and work on it next week.
Tracking this crash for 52; it shows up on the top browser crashers list (#2 for the last 7 days).
Keywords: regression
This crash was pretty high in B1: 2977 crashes. There are a few comments, but nothing particularly useful; some mention updating, crashes when opening certificates, and problem scripts. :baku - will you be able to look at this soon? Thanks.
Keywords: topcrash
Priority: -- → P1
Andrea, did you get a chance to look into this hang?
Andrew, can you please help move this investigation along? It's one of the top crashes for Fx52 at the moment.
Flags: needinfo?(overholt)
baku told me this is a difficult area to fix because it's super-complicated (and it's not even clear what caused the problem). It's also almost impossible to reproduce this crash locally, which of course doesn't make it any easier to fix. There were some major-ish related changes about 6 months ago, but those are unlikely to explain the recent spike. There's also been some work around timer events and bkelly's timer queue. Anyway, baku's going to keep looking at this. If anyone has ideas on how to reproduce, you will be our favourite person if you tell us :)
Flags: needinfo?(overholt)
I don't think these crashes are just workers. For example, this one in FF53.0a2 is waiting on the compositor thread to shut down: https://crash-stats.mozilla.com/report/index/04f7fc6d-43f1-4c9f-981c-9ec5a2170205#allthreads AFAICT, though, there is no compositor thread in the list. This one in FF52.0b4 is waiting for QuotaManager to shut down: https://crash-stats.mozilla.com/report/index/bd4a8396-3520-4c56-8fba-d949b2170208#allthreads It looks like it's delayed by the Cache API writing data out to disk. Slow disk or network drive, most likely.
This crash is a lot of different crashes and it would be good to start separating them; then it would be easier to say what caused the increase. We should start using better tools (that we already have) to separate them (I have not had time to learn these new tools yet :), though I was planning to). So this crash can be separated into:

worker RuntimeService 420 crashes:
https://crash-stats.mozilla.com/search/?signature=%3Dshutdownhang%20%7C%20mozilla%3A%3ACondVar%3A%3AWait%20%7C%20nsEventQueue%3A%3AGetEvent%20%7C%20nsThread%3A%3AnsChainedEventQueue%3A%3AGetEvent&proto_signature=RuntimeService&product=Firefox&version=53.0a2&version=53.0a1&version=52.0b&version=52.0b4&version=52.0b3&version=52.0b2&version=52.0b1&version=52.0a2&date=%3E%3D2017-02-02T10%3A53%3A00.000Z&date=%3C2017-02-09T10%3A53%3A00.000Z&_sort=-date&_facets=signature&_columns=date&_columns=signature&_columns=product&_columns=version&_columns=build_id&_columns=platform#facet-signature

ServiceWorkerRegistrar 198 crashes:
https://crash-stats.mozilla.com/search/?signature=%3Dshutdownhang%20%7C%20mozilla%3A%3ACondVar%3A%3AWait%20%7C%20nsEventQueue%3A%3AGetEvent%20%7C%20nsThread%3A%3AnsChainedEventQueue%3A%3AGetEvent&proto_signature=ServiceWorkerRegistrar&product=Firefox&version=53.0a2&version=53.0a1&version=52.0b&version=52.0b4&version=52.0b3&version=52.0b2&version=52.0b1&version=52.0a2&date=%3E%3D2017-02-02T10%3A52%3A00.000Z&date=%3C2017-02-09T10%3A52%3A00.000Z&_sort=-date&_facets=signature&_columns=date&_columns=signature&_columns=product&_columns=version&_columns=build_id&_columns=platform#facet-signature

SharedThreadPool 231 crashes:
https://crash-stats.mozilla.com/search/?signature=%3Dshutdownhang%20%7C%20mozilla%3A%3ACondVar%3A%3AWait%20%7C%20nsEventQueue%3A%3AGetEvent%20%7C%20nsThread%3A%3AnsChainedEventQueue%3A%3AGetEvent&proto_signature=SharedThreadPool&product=Firefox&version=53.0a2&version=53.0a1&version=52.0b&version=52.0b4&version=52.0b3&version=52.0b2&version=52.0b1&version=52.0a2&date=%3E%3D2017-02-02T10%3A38%3A00.000Z&date=%3C2017-02-09T10%3A38%3A00.000Z&_sort=-date&_facets=signature&_columns=date&_columns=signature&_columns=product&_columns=version&_columns=build_id&_columns=platform#facet-signature

nsUrlClassifierDBService 442 crashes:
https://crash-stats.mozilla.com/search/?signature=%3Dshutdownhang%20%7C%20mozilla%3A%3ACondVar%3A%3AWait%20%7C%20nsEventQueue%3A%3AGetEvent%20%7C%20nsThread%3A%3AnsChainedEventQueue%3A%3AGetEvent&proto_signature=nsUrlClassifierDBService&product=Firefox&version=53.0a2&version=53.0a1&version=52.0b&version=52.0b4&version=52.0b3&version=52.0b2&version=52.0b1&version=52.0a2&date=%3E%3D2017-02-02T10%3A36%3A00.000Z&date=%3C2017-02-09T10%3A36%3A00.000Z&_sort=-date&_facets=signature&_columns=date&_columns=signature&_columns=product&_columns=version&_columns=build_id&_columns=platform#facet-signature

QuotaManager 225 crashes:
https://crash-stats.mozilla.com/search/?signature=%3Dshutdownhang%20%7C%20mozilla%3A%3ACondVar%3A%3AWait%20%7C%20nsEventQueue%3A%3AGetEvent%20%7C%20nsThread%3A%3AnsChainedEventQueue%3A%3AGetEvent&proto_signature=QuotaManager&product=Firefox&version=53.0a2&version=53.0a1&version=52.0b&version=52.0b4&version=52.0b3&version=52.0b2&version=52.0b1&version=52.0a2&date=%3E%3D2017-02-02T10%3A34%3A00.000Z&date=%3C2017-02-09T10%3A34%3A00.000Z&_sort=-date&_facets=signature&_columns=date&_columns=signature&_columns=product&_columns=version&_columns=build_id&_columns=platform#facet-signature

js crashes:

RunScript 458 crashes:
https://crash-stats.mozilla.com/search/?signature=%3Dshutdownhang%20%7C%20mozilla%3A%3ACondVar%3A%3AWait%20%7C%20nsEventQueue%3A%3AGetEvent%20%7C%20nsThread%3A%3AnsChainedEventQueue%3A%3AGetEvent&proto_signature=RunScript&product=Firefox&version=53.0a2&version=53.0a1&version=52.0b&version=52.0b4&version=52.0b3&version=52.0b2&version=52.0b1&version=52.0a2&date=%3E%3D2017-02-02T11%3A17%3A00.000Z&date=%3C2017-02-09T11%3A17%3A00.000Z&_sort=-date&_facets=signature&_columns=date&_columns=signature&_columns=product&_columns=version&_columns=build_id&_columns=platform#facet-signature

EnterBaselineAtBranch 330 crashes:
https://crash-stats.mozilla.com/search/?signature=%3Dshutdownhang%20%7C%20mozilla%3A%3ACondVar%3A%3AWait%20%7C%20nsEventQueue%3A%3AGetEvent%20%7C%20nsThread%3A%3AnsChainedEventQueue%3A%3AGetEvent&proto_signature=EnterBaselineAtBranch&product=Firefox&version=53.0a2&version=53.0a1&version=52.0b&version=52.0b4&version=52.0b3&version=52.0b2&version=52.0b1&version=52.0a2&date=%3E%3D2017-02-02T10%3A35%3A00.000Z&date=%3C2017-02-09T10%3A35%3A00.000Z&_sort=-date&_facets=signature&_columns=date&_columns=signature&_columns=product&_columns=version&_columns=build_id&_columns=platform#facet-signature

XMLHttpRequestMainThread 135 crashes (this is partially resolved, but I think not completely):
https://crash-stats.mozilla.com/search/?signature=%3Dshutdownhang%20%7C%20mozilla%3A%3ACondVar%3A%3AWait%20%7C%20nsEventQueue%3A%3AGetEvent%20%7C%20nsThread%3A%3AnsChainedEventQueue%3A%3AGetEvent&proto_signature=XMLHttpRequestMainThread&product=Firefox&version=53.0a2&version=53.0a1&version=52.0b&version=52.0b4&version=52.0b3&version=52.0b2&version=52.0b1&version=52.0a2&date=%3E%3D2017-02-02T11%3A17%3A00.000Z&date=%3C2017-02-09T11%3A17%3A00.000Z&_sort=-date&_facets=signature&_columns=date&_columns=signature&_columns=product&_columns=version&_columns=build_id&_columns=platform#facet-signature

nsThreadManager 95 crashes:
https://crash-stats.mozilla.com/search/?signature=%3Dshutdownhang%20%7C%20mozilla%3A%3ACondVar%3A%3AWait%20%7C%20nsEventQueue%3A%3AGetEvent%20%7C%20nsThread%3A%3AnsChainedEventQueue%3A%3AGetEvent&proto_signature=nsThreadManager&product=Firefox&version=53.0a2&version=53.0a1&version=52.0b&version=52.0b4&version=52.0b3&version=52.0b2&version=52.0b1&version=52.0a2&date=%3E%3D2017-02-02T11%3A17%3A00.000Z&date=%3C2017-02-09T11%3A17%3A00.000Z&_sort=-date&_facets=signature&_columns=date&_columns=signature&_columns=product&_columns=version&_columns=build_id&_columns=platform#facet-signature

WaitForThreadShutdown 142 crashes:
https://crash-stats.mozilla.com/search/?signature=%3Dshutdownhang%20%7C%20mozilla%3A%3ACondVar%3A%3AWait%20%7C%20nsEventQueue%3A%3AGetEvent%20%7C%20nsThread%3A%3AnsChainedEventQueue%3A%3AGetEvent&proto_signature=WaitForThreadShutdown&product=Firefox&version=53.0a2&version=53.0a1&version=52.0b&version=52.0b4&version=52.0b3&version=52.0b2&version=52.0b1&version=52.0a2&date=%3E%3D2017-02-02T11%3A17%3A00.000Z&date=%3C2017-02-09T11%3A17%3A00.000Z&_sort=-date&_facets=signature&_columns=date&_columns=signature&_columns=product&_columns=version&_columns=build_id&_columns=platform#facet-signature

GeckoMediaPluginServiceParent 26 crashes:
https://crash-stats.mozilla.com/search/?signature=%3Dshutdownhang%20%7C%20mozilla%3A%3ACondVar%3A%3AWait%20%7C%20nsEventQueue%3A%3AGetEvent%20%7C%20nsThread%3A%3AnsChainedEventQueue%3A%3AGetEvent&proto_signature=GeckoMediaPluginServiceParent&product=Firefox&version=53.0a2&version=53.0a1&version=52.0b&version=52.0b4&version=52.0b3&version=52.0b2&version=52.0b1&version=52.0a2&date=%3E%3D2017-02-02T11%3A15%3A00.000Z&date=%3C2017-02-09T11%3A15%3A00.000Z&_sort=-date&_facets=signature&_columns=date&_columns=signature&_columns=product&_columns=version&_columns=build_id&_columns=platform#facet-signature

nsStreamTransportService 204 crashes:
https://crash-stats.mozilla.com/search/?signature=%3Dshutdownhang%20%7C%20mozilla%3A%3ACondVar%3A%3AWait%20%7C%20nsEventQueue%3A%3AGetEvent%20%7C%20nsThread%3A%3AnsChainedEventQueue%3A%3AGetEvent&proto_signature=nsStreamTransportService&product=Firefox&version=53.0a2&version=53.0a1&version=52.0b&version=52.0b4&version=52.0b3&version=52.0b2&version=52.0b1&version=52.0a2&date=%3E%3D2017-02-02T11%3A06%3A00.000Z&date=%3C2017-02-09T11%3A06%3A00.000Z&_sort=-date&_facets=signature&_columns=date&_columns=signature&_columns=product&_columns=version&_columns=build_id&_columns=platform#facet-signature

nsSocketTransportService 320 crashes:
https://crash-stats.mozilla.com/search/?signature=%3Dshutdownhang%20%7C%20mozilla%3A%3ACondVar%3A%3AWait%20%7C%20nsEventQueue%3A%3AGetEvent%20%7C%20nsThread%3A%3AnsChainedEventQueue%3A%3AGetEvent&proto_signature=nsSocketTransportService&product=Firefox&version=53.0a2&version=53.0a1&version=52.0b&version=52.0b4&version=52.0b3&version=52.0b2&version=52.0b1&version=52.0a2&date=%3E%3D2017-02-02T10%3A40%3A00.000Z&date=%3C2017-02-09T10%3A40%3A00.000Z&_sort=-date&_facets=signature&_columns=date&_columns=signature&_columns=product&_columns=version&_columns=build_id&_columns=platform#facet-signature

nsHttpConnectionMgr 2385 crashes:
https://crash-stats.mozilla.com/search/?signature=%3Dshutdownhang%20%7C%20mozilla%3A%3ACondVar%3A%3AWait%20%7C%20nsEventQueue%3A%3AGetEvent%20%7C%20nsThread%3A%3AnsChainedEventQueue%3A%3AGetEvent&proto_signature=nsHttpConnectionMgr&product=Firefox&version=53.0a2&version=53.0a1&version=52.0b&version=52.0b4&version=52.0b3&version=52.0b2&version=52.0b1&version=52.0a2&date=%3E%3D2017-02-02T10%3A44%3A00.000Z&date=%3C2017-02-09T10%3A44%3A00.000Z&_sort=-date&_facets=signature&_columns=date&_columns=signature&_columns=product&_columns=version&_columns=build_id&_columns=platform#facet-signature

without Windows XP 855 crashes:
https://crash-stats.mozilla.com/search/?signature=%3Dshutdownhang%20%7C%20mozilla%3A%3ACondVar%3A%3AWait%20%7C%20nsEventQueue%3A%3AGetEvent%20%7C%20nsThread%3A%3AnsChainedEventQueue%3A%3AGetEvent&proto_signature=nsHttpConnectionMgr&platform_pretty_version=%21XP&product=Firefox&version=53.0a2&version=53.0a1&version=52.0b&version=52.0b4&version=52.0b3&version=52.0b2&version=52.0b1&version=52.0a2&date=%3E%3D2017-02-02T10%3A44%3A00.000Z&date=%3C2017-02-09T10%3A44%3A00.000Z&_sort=-date&_facets=signature&_columns=date&_columns=signature&_columns=product&_columns=version&_columns=build_id&_columns=platform#facet-signature

The rest 288 crashes (these are different; some are js but not all):
https://crash-stats.mozilla.com/search/?signature=%3Dshutdownhang%20%7C%20mozilla%3A%3ACondVar%3A%3AWait%20%7C%20nsEventQueue%3A%3AGetEvent%20%7C%20nsThread%3A%3AnsChainedEventQueue%3A%3AGetEvent&proto_signature=%21RuntimeService&proto_signature=%21ServiceWorkerRegistrar&proto_signature=%21SharedThreadPool&proto_signature=%21nsUrlClassifierDBService&proto_signature=%21QuotaManager&proto_signature=%21nsStreamTransportService&proto_signature=%21nsSocketTransportService&proto_signature=%21nsHttpConnectionmgr&proto_signature=%21GeckoMediaPluginServiceParent&proto_signature=%21nsThreadManager&proto_signature=%21XMLHttpRequestMainThread&proto_signature=%21RunScript&proto_signature=%21EnterBaselineAtBranch&product=Firefox&version=52.0b&version=54.0a1&version=53.0a2&version=53.0a1&version=52.0b4&version=52.0b3&version=52.0b2&version=52.0b1&version=52.0a2&date=%3E%3D2017-02-02T10%3A29%3A00.000Z&date=%3C2017-02-09T10%3A29%3A00.000Z&_sort=-date&_facets=signature&_columns=date&_columns=signature&_columns=product&_columns=version&_columns=build_id&_columns=platform#facet-signature

nsHttpConnectionMgr crashes increased a bit; they used to be around 1800 (I think on beta, and around 50 on aurora). These are really hard to resolve. I notice a lot of hangs in PR_Connect -> WSABind on XP. These are not caused by us.
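The per-component searches in the comment above differ only in the proto_signature filter and the time window, so they can be generated rather than hand-edited. A minimal sketch in Python, assuming the query parameters behave as they appear in those URLs (signature, proto_signature, product, version, date, _sort, _facets); the helper name search_url is hypothetical, and the _columns parameters and the #facet-signature fragment are left out for brevity.

```python
from urllib.parse import quote, urlencode

# Signature and version list copied from the search URLs in the comment above.
SIGNATURE = (
    "shutdownhang | mozilla::CondVar::Wait | nsEventQueue::GetEvent"
    " | nsThread::nsChainedEventQueue::GetEvent"
)
VERSIONS = ["53.0a2", "53.0a1", "52.0b", "52.0b4",
            "52.0b3", "52.0b2", "52.0b1", "52.0a2"]


def search_url(proto_signature, start, end, exclude=()):
    """Build a search URL for one bucket (hypothetical helper).

    proto_signature: frame name the stack must contain, or None.
    exclude: frame names the stack must not contain (the "!" prefix).
    """
    params = [("signature", "=" + SIGNATURE)]
    if proto_signature:
        params.append(("proto_signature", proto_signature))
    params += [("proto_signature", "!" + name) for name in exclude]
    params.append(("product", "Firefox"))
    params += [("version", v) for v in VERSIONS]
    params += [
        ("date", ">=" + start),
        ("date", "<" + end),
        ("_sort", "-date"),
        ("_facets", "signature"),
    ]
    # quote_via=quote encodes spaces as %20, matching the URLs above.
    return ("https://crash-stats.mozilla.com/search/?"
            + urlencode(params, quote_via=quote))


url = search_url("RuntimeService",
                 "2017-02-02T10:53:00.000Z", "2017-02-09T10:53:00.000Z")
rest = search_url(None,
                  "2017-02-02T10:29:00.000Z", "2017-02-09T10:29:00.000Z",
                  exclude=["RuntimeService", "QuotaManager"])
print(url)
print(rest)
```

Using one time window for every bucket would also make the counts directly comparable; the URLs above were generated a few minutes apart.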
Thanks for your analysis, dragana. So it looks like this signature is now just a catch-all for all shutdownhangs... Here is a comparison between shutdown crash signatures on 51.0b vs 52.0b: https://mozilla.github.io/stab-crashes/scomp.html?common=product%3DFirefox%26date%3D%3E2016-10-01%26submitted_from_infobar%3D!__true__%26process_type%3Dbrowser%26shutdown_progress%3D!__null__&p1=version%3D51.0b&p2=version%3D52.0b
Keywords: regression, topcrash
There is another report with a long stack, bug 1323515; maybe that helps. It is a hang in JS.
(In reply to Ben Kelly [:bkelly] from comment #2)
> I don't think we can safely nuke the thread without orphaning a bunch of c++
> objects and breaking their invariants. We need to figure out why the worker
> thread is not completing.
>
> Andrea, do you think it could be related to that infinite GC loop we saw in
> a previous bug?

Not claiming to know anything here, but I recall recently (within the last 2 months) seeing C++ updates from Microsoft on Windows 7 machines. Maybe Windows 8 has something as well. Hope this helps somehow.
Crash Signature: [@ shutdownhang | mozilla::CondVar::Wait | nsEventQueue::GetEvent | nsThread::nsChainedEventQueue::GetEvent] → [@ shutdownhang | mozilla::CondVar::Wait | nsEventQueue::GetEvent | nsThread::nsChainedEventQueue::GetEvent] [@ shutdownhang | mozilla::CondVar::Wait | nsEventQueue::GetEvent | nsThread::nsChainedEventQueue::GetEvent | nsThread::GetEvent]
This catch-all signature for shutdownhangs has since been split up in bug 1338288, so it no longer shows up in crash stats.
Status: NEW → RESOLVED
Closed: 8 years ago
Flags: needinfo?(amarchesini)
Resolution: --- → WORKSFORME