Closed
Bug 1371728
Opened 7 years ago
Closed 7 years ago
Intermittent leakcheck | default process: 557416 bytes leaked (CancelableRunnable, CondVar, IdlePeriod, IdleRunnable, Mutex, ...)
Categories
(Core :: XPCOM, defect)
Core
XPCOM
Tracking
()
RESOLVED
FIXED
mozilla56
Tracking | Status | |
---|---|---|
firefox-esr52 | --- | unaffected |
firefox54 | --- | unaffected |
firefox55 | + | fixed |
firefox56 | --- | fixed |
People
(Reporter: intermittent-bug-filer, Assigned: ehsan.akhgari)
References
(Blocks 1 open bug)
Details
(Keywords: intermittent-failure, memory-leak, regressionwindow-wanted, Whiteboard: [MemShrink:P2][stockwell fixed:product])
Attachments
(1 file)
(deleted),
patch
|
smaug
:
review+
jcristau
:
approval-mozilla-beta+
|
Details | Diff | Splinter Review |
Filed by: wkocher [at] mozilla.com
https://treeherder.mozilla.org/logviewer.html#?job_id=105814624&repo=autoland
https://queue.taskcluster.net/v1/task/fWEoJ2tIRJ-G22YJJqwG-A/runs/0/artifacts/public/logs/live_backing.log
https://hg.mozilla.org/mozilla-central/raw-file/tip/layout/tools/reftest/reftest-analyzer.xhtml#logurl=https://queue.taskcluster.net/v1/task/fWEoJ2tIRJ-G22YJJqwG-A/runs/0/artifacts/public/logs/live_backing.log&only_show_unexpected=1
Comment 1•7 years ago
|
||
We're leaking 2017 timers and 2017 IdleRunnables.
Component: DOM: Content Processes → XPCOM
Comment 2•7 years ago
|
||
Possibly a regression from bug 1361461, which I expect affected how a lot of idle runnables were handled.
Blocks: 1361461
Comment 3•7 years ago
|
||
[Tracking Requested - why for this release]: It would be really bad if we've started leaking thousands of runnables sometimes. We should either make sure this goes away, or bisect down to figure out what has caused it. Or confirm that it isn't a new issue.
status-firefox55:
--- → affected
tracking-firefox55:
--- → ?
Keywords: mlk,
regressionwindow-wanted
Whiteboard: [MemShrink]
Updated•7 years ago
|
Assignee | ||
Comment 4•7 years ago
|
||
It would be nice to confirm if this is a real leak or a leak reported by the reftest harness. IIRC I was seeing intermittent leaks like this before I added the 1 second timeout, so it could be that there are a number of these runnables that aren't processed at the time we shut down the reftest harness.
Comment 5•7 years ago
|
||
2000 is a lot! Could it that the idle queue is starved out by the absurd number of pages opened by reftests? Could we flush it on shutdown (at least in debug builds) before shutdown GCs happen?
Comment hidden (Intermittent Failures Robot) |
Assignee | ||
Comment 7•7 years ago
|
||
(In reply to Andrew McCreight [:mccr8] from comment #5)
> 2000 is a lot! Could it that the idle queue is starved out by the absurd
> number of pages opened by reftests?
Yeah, that could certainly happen.
(Not sure where the 2000 number comes from, FWIW. I'm assuming you meant 1000ms!)
> Could we flush it on shutdown (at least
> in debug builds) before shutdown GCs happen?
Can you point me to some place in the code where this needs to happen? I don't know exactly where the right place before which all of these need to run is...
Flags: needinfo?(continuation)
Comment 8•7 years ago
|
||
(In reply to :Ehsan Akhgari (needinfo please, extremely long backlog) from comment #7)
> (Not sure where the 2000 number comes from, FWIW. I'm assuming you meant
> 1000ms!)
I don't mean 1000ms. I mean 2000 different runnables are leaking.
If you look through the log, you'll see a little ASCII art chart of what is leaking:
|<----------------Class--------------->|<-----Bytes------>|<----Objects---->|
| | Per-Inst Leaked| Total Rem|
...
349 |IdleRunnable | 56 101360| 12743 1810|
This means that we are leaking 1810 different runnables.
> Can you point me to some place in the code where this needs to happen? I
> don't know exactly where the right place before which all of these need to
> run is...
You could probably register an observer for "xpcom-shutdown". I'm not sure how shutting down of the idle queue is handled right now.
Flags: needinfo?(continuation)
Comment 9•7 years ago
|
||
Though really, if we're really letting the idle queue get that backed up in reftests, maybe something should be added to the reftest harness to let the idle queue run a bit in between tests? We do something like that for GC/CC already.
Assignee | ||
Comment 10•7 years ago
|
||
(In reply to Andrew McCreight [:mccr8] from comment #8)
> (In reply to :Ehsan Akhgari (needinfo please, extremely long backlog) from
> comment #7)
> > (Not sure where the 2000 number comes from, FWIW. I'm assuming you meant
> > 1000ms!)
>
> I don't mean 1000ms. I mean 2000 different runnables are leaking.
Oh, I see, thanks for the explanation!
(In reply to Andrew McCreight [:mccr8] from comment #9)
> Though really, if we're really letting the idle queue get that backed up in
> reftests, maybe something should be added to the reftest harness to let the
> idle queue run a bit in between tests? We do something like that for GC/CC
> already.
I was asking about the place in the reftest harness where this is already done for GC/CC...
We could just choose to process all of the idle queue before dispatching xpcom-shutdown in debug builds, for example. Or we can add a hack specific to these WindowDestroyEvent runnables. I'm not really sure which one would be preferrable...
Olli, do you have any preferences which way to go?
Flags: needinfo?(bugs)
Comment 11•7 years ago
|
||
Why aren't we processing the idle runnables the same way as other runnables during shutdown?
That is how they should work, I think.
Flags: needinfo?(bugs)
Assignee | ||
Comment 12•7 years ago
|
||
(In reply to Olli Pettay [:smaug] from comment #11)
> Why aren't we processing the idle runnables the same way as other runnables
> during shutdown?
> That is how they should work, I think.
As far as I can tell, this is the last place where we process any runnables at shutdown time: <https://searchfox.org/mozilla-central/rev/61054508641ee76f9c49bcf7303ef3cfb6b410d2/xpcom/build/XPCOMInit.cpp#913>
And a bit later we call NS_LogTerm() here <https://searchfox.org/mozilla-central/rev/61054508641ee76f9c49bcf7303ef3cfb6b410d2/xpcom/build/XPCOMInit.cpp#1077> which is the code that triggers leak checking.
NS_ProcessPendingEvents() just calls ProcessNextEvent() in a loop, so if we end up being in an idle period within this time, eventually GetEvent() may return null while there are runnables in our idle queue. Then NS_ProcessPendingEvents() would return, and we finish without having exhausted everything in our idle queue (and of course we leak them all.)
Is there a specific mechanism that should be processing these runnables at shutdown besides the above?
Flags: needinfo?(bugs)
Comment 13•7 years ago
|
||
So sounds like we should just skip idle detection if we're shutting down a thread and process all idle events.
nsThread::GetIdleEvent could check ShuttingDown() and if that returns null, bypass idleness checks.
Flags: needinfo?(bugs)
Comment hidden (Intermittent Failures Robot) |
Assignee | ||
Comment 15•7 years ago
|
||
(In reply to Olli Pettay [:smaug] from comment #13)
> So sounds like we should just skip idle detection if we're shutting down a
> thread and process all idle events.
> nsThread::GetIdleEvent could check ShuttingDown() and if that returns null,
> bypass idleness checks.
That sounds like a good idea. It should be possible to implement that by making GetIdleDeadline() return TimeStamp::Now() when we're shutting down, I think. I'll give that a shot on the try server to see what happens.
Comment hidden (Intermittent Failures Robot) |
Assignee | ||
Comment 17•7 years ago
|
||
(In reply to :Ehsan Akhgari (needinfo please, extremely long backlog) from comment #15)
> (In reply to Olli Pettay [:smaug] from comment #13)
> > So sounds like we should just skip idle detection if we're shutting down a
> > thread and process all idle events.
> > nsThread::GetIdleEvent could check ShuttingDown() and if that returns null,
> > bypass idleness checks.
>
> That sounds like a good idea. It should be possible to implement that by
> making GetIdleDeadline() return TimeStamp::Now() when we're shutting down, I
> think. I'll give that a shot on the try server to see what happens.
That approach seems promising! https://treeherder.mozilla.org/#/jobs?repo=try&revision=de54d4b9e4d60d50d55241f3914930f3f06598a1
Assignee: nobody → ehsan
Assignee | ||
Comment 18•7 years ago
|
||
Attachment #8877385 -
Flags: review?(bugs)
Comment 19•7 years ago
|
||
this seems worth tracking and making the leakcheck complaints go away for 55
Updated•7 years ago
|
Attachment #8877385 -
Flags: review?(bugs) → review+
Assignee | ||
Comment 20•7 years ago
|
||
(In reply to Julien Cristau [:jcristau] from comment #19)
> this seems worth tracking and making the leakcheck complaints go away for 55
Yes, we should also backport this to ensure that we do in fact run all of these idle runnables when shutting down even more importantly.
Comment 21•7 years ago
|
||
Pushed by eakhgari@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/f65461f10e0a
Don't honor the idle period during shutdown; r=smaug
Assignee | ||
Comment 22•7 years ago
|
||
Comment on attachment 8877385 [details] [diff] [review]
Don't honor the idle period during shutdown
Approval Request Comment
[Feature/Bug causing the regression]: The idle queue splitting work, spread across many bugs.
[User impact if declined]: We may not run some idle runnables at shutdown time.
[Is this code covered by automated tests?]: Yes, we see intermittent leaks when this happens.
[Has the fix been verified in Nightly?]: It has been fixed on the try server.
[Needs manual test from QE? If yes, steps to reproduce]: No.
[List of other uplifts needed for the feature/fix]: None.
[Is the change risky?]: Not very risky.
[Why is the change risky/not risky?]: It's well understood.
[String changes made/needed]: None.
Attachment #8877385 -
Flags: approval-mozilla-beta?
Comment 23•7 years ago
|
||
Bug 1358898 is related to this, though I think the patch here won't help that, because the issue there is we're spending too much time on idle runnables already.
Updated•7 years ago
|
Whiteboard: [MemShrink] → [MemShrink:P2]
Comment 24•7 years ago
|
||
bugherder |
Status: NEW → RESOLVED
Closed: 7 years ago
status-firefox56:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla56
Comment 25•7 years ago
|
||
Comment on attachment 8877385 [details] [diff] [review]
Don't honor the idle period during shutdown
process idle queue on shutdown, beta55+
Attachment #8877385 -
Flags: approval-mozilla-beta? → approval-mozilla-beta+
Comment 26•7 years ago
|
||
bugherder uplift |
Updated•7 years ago
|
status-firefox54:
--- → unaffected
status-firefox-esr52:
--- → unaffected
Assignee | ||
Comment 27•7 years ago
|
||
Per <https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1371728&tree=mozilla-central&startday=2017-06-01&endday=2017-06-15>, this indeed seems to be fixed.
Updated•7 years ago
|
Flags: qe-verify-
Updated•7 years ago
|
Whiteboard: [MemShrink:P2] → [MemShrink:P2][stockwell fixed:product]
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
You need to log in
before you can comment on or make changes to this bug.
Description
•