Closed
Bug 1380530
Opened 7 years ago
Closed 7 years ago
Intermittent devtools/client/canvasdebugger/test/browser_profiling-canvas.js | application crashed [@ MessageLoop::DeletePendingTasks()]
Categories
(Core :: IPC, defect)
Core
IPC
Tracking
()
RESOLVED
FIXED
mozilla56
Tracking | Status | |
---|---|---|
firefox-esr52 | --- | unaffected |
firefox54 | --- | unaffected |
firefox55 | --- | unaffected |
firefox56 | --- | fixed |
People
(Reporter: intermittent-bug-filer, Assigned: lsalzman)
References
Details
(Keywords: crash, intermittent-failure)
Crash Data
Attachments
(1 file)
(deleted),
patch
|
billm
:
review+
bobowen
:
feedback+
|
Details | Diff | Splinter Review |
This appears to have started when bug 1376026 landed:
https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&fromchange=23814434a6bc690b51bddd1ddd2b86382c32fe47&noautoclassify&filter-searchStr=b684e13e95340ff6010a5dcb9d887238b7b83181&group_state=expanded&tochange=633c98dc992ff13168710725ea94a8ac49f6d48a
Flags: needinfo?(lsalzman)
Assignee | ||
Comment 2•7 years ago
|
||
Initial investigation points at the runnable stuck on the work queue to be of ObjectWatcher::Watch, which following the chain of users seems to point to ProcessWatcher::EnsureProcessTerminated. The most probable user of that to be causing that seems like GeckoChildProcessHost::~GeckoChildProcessHost. Why this would in any way be related to a change to use DirectWrite for fonts on Windows 7 is as yet undetermined, and I am still looking into it. Superficially, it seems unrelated and more likely to be a result of changed timing patterns in the tests exposing a pre-existing problem.
Flags: needinfo?(lsalzman)
Comment hidden (Intermittent Failures Robot) |
Comment 4•7 years ago
|
||
Note that the oldest open report of this assertion failure to date was from 2016-01-03 (bug 1236350).
Lee, are you willing to work on this bug? Looks like you went deeper with hunting this down than me in bug 1355196.
Flags: needinfo?(lsalzman)
Updated•7 years ago
|
Component: Developer Tools: Canvas Debugger → IPC
Product: Firefox → Core
Assignee | ||
Comment 5•7 years ago
|
||
(In reply to Honza Bambas (:mayhemer) from comment #4)
> Note that the oldest open report of this assertion failure to date was from
> 2016-01-03 (bug 1236350).
>
> Lee, are you willing to work on this bug? Looks like you went deeper with
> hunting this down than me in bug 1355196.
I am still in the process of investigating this.
Flags: needinfo?(lsalzman)
Assignee | ||
Comment 7•7 years ago
|
||
The source of this particular instance of the intermittent in DeletePendingTasks was coming from ObjectWatcher queuing a task after the message loop is already in its death throes. We let it queue the task even though it would never actually be processed and merely trigger this assertion.
This adds tracking for whether the MessageLoop is shutting down. Callers sometimes adds tasks before the Run() loop is actually started, and usually during the Run() loop itself, both of which are safe times to add tasks since they will still actually get processed. However, after the Run() loop returns, it is never invoked again, so it is thus not safe to add tasks.
The added assertion in PostTask_Helper here will help us pinpoint the intermittents at the actual time they are triggered by calls to PostTask, rather than waiting all the way till DeletePendingTasks is called.
And in particular with ObjectWatcher::DoneWaiting, we simply skip adding the pending task if the MessageLoop is shutting down. ObjectWatcher only gets used by ChildReaper, and ChildReaper will always call KillProcess() in its destructor in the worst case, to ensure the process actually gets cleaned up.
Attachment #8887624 -
Flags: review?(bobowencode)
Comment 8•7 years ago
|
||
Comment on attachment 8887624 [details] [diff] [review]
don't allow new tasks to be posted to MessageLoop when it is shutting down
Review of attachment 8887624 [details] [diff] [review]:
-----------------------------------------------------------------
I'm not a peer for this code, so redirecting to billm.
(In reply to Lee Salzman [:lsalzman] from comment #7)
...
> And in particular with ObjectWatcher::DoneWaiting, we simply skip adding the
> pending task if the MessageLoop is shutting down. ObjectWatcher only gets
> used by ChildReaper, and ChildReaper will always call KillProcess() in its
> destructor in the worst case, to ensure the process actually gets cleaned up.
Well in the non-forced case, we really have to rely on ChildReaper::WillDestroyCurrentMessageLoop to do the clean-up, because I think we won't get deleted unless that happens anyway.
As far as I can tell in the non-forced case we will never kill a hung/slow-to-shutdown child process, but that is no change from how it works now.
::: ipc/chromium/src/base/message_loop.h
@@ +120,4 @@
> // NOTE: These methods may be called on any thread. The Task will be invoked
> // on the thread that executes MessageLoop::Run().
>
> + bool IsProcessingTasks() const { return !shutting_down_; }
I think this name is slightly misleading, perhaps IsAcceptingTasks would be better?
::: ipc/chromium/src/base/object_watcher.cc
@@ +128,5 @@
> // provided, which in turn ensures our change to did_signal can be observed
> // on the target thread.
> + if (watch->origin_loop->IsProcessingTasks()) {
> + watch->origin_loop->PostTask(addrefedWatch.forget());
> + }
I think that this is OK, because any remaining clean-up should still happen as a result of ChildReaper::WillDestroyCurrentMessageLoop and ObjectWatcher::WillDestroyCurrentMessageLoop in the non-forced case.
In the forced/ref-counted case ~ChildReaper calling KillProcess will do it as a last resort.
Attachment #8887624 -
Flags: review?(wmccloskey)
Attachment #8887624 -
Flags: review?(bobowencode)
Attachment #8887624 -
Flags: feedback+
Attachment #8887624 -
Flags: review?(wmccloskey) → review+
Updated•7 years ago
|
Blocks: work_queue_.empty
Pushed by lsalzman@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/04e3649ca8fc
don't allow new tasks to be posted to MessageLoop when it is shutting down. r=billm
Assignee | ||
Comment 10•7 years ago
|
||
It should be noted again that, provided this landing sticks, that we're going to essentially be trading the assertion inside MessageLoop::DeletePendingTasks() for assertions inside MessageLoop::PostTask_Helper coming from different callers. This will be nothing to be alarmed about, because in general it will point the finger at the thread that decided to queue the bad task where we actually have enough context information to track down things, as opposed to the receiver, which has lost that context.
Comment 11•7 years ago
|
||
bugherder |
Status: NEW → RESOLVED
Closed: 7 years ago
status-firefox56:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla56
Updated•7 years ago
|
status-firefox54:
--- → unaffected
status-firefox55:
--- → unaffected
status-firefox-esr52:
--- → unaffected
Comment hidden (Intermittent Failures Robot) |
You need to log in
before you can comment on or make changes to this bug.
Description
•