Closed
Bug 932898
Opened 11 years ago
Closed 11 years ago
Bring back the shutdown leak detector
Categories
(Testing :: General, defect)
Testing
General
Tracking
(Not tracked)
RESOLVED
FIXED
mozilla28
People
(Reporter: emorley, Assigned: ttaubert)
References
(Blocks 1 open bug)
Details
(Whiteboard: [MemShrink:P1])
Attachments
(3 files, 6 obsolete files)
(deleted),
patch
|
smaug
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
jaws
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
ted
:
review+
|
Details | Diff | Splinter Review |
The cycle collection analysis added by bug 728294 should have caught the recent issues in bug 932781 and dependents, but didn't.
My money is on bug 863447 regressing it:
https://hg.mozilla.org/mozilla-central/rev/bed3081376ca
Either way, we must fix it & then the inevitable hoards of leaks we have accumulated in the meantime, before the trees can reopen.
Reporter | ||
Comment 1•11 years ago
|
||
(This bug is one of three holding the trees closed - bug 932781 comment 11)
Severity: critical → blocker
Updated•11 years ago
|
Whiteboard: [MemShrink]
Comment 3•11 years ago
|
||
(In reply to Ed Morley [:edmorley UTC+1] from comment #3)
> The cycle collection analysis added by bug 728294 should have caught the
> recent issues in bug 932781 and dependents, but didn't.
>
> My money is on bug 863447 regressing it:
> https://hg.mozilla.org/mozilla-central/rev/bed3081376ca
Try push w/ backout: https://tbpl.mozilla.org/?tree=Try&rev=b86779999d73
I'll try this locally as well but am not sure that'll actually be faster than this try push...
Comment 4•11 years ago
|
||
(In reply to :Gijs Kruitbosch from comment #3)
> (In reply to Ed Morley [:edmorley UTC+1] from comment #3)
> > The cycle collection analysis added by bug 728294 should have caught the
> > recent issues in bug 932781 and dependents, but didn't.
> >
> > My money is on bug 863447 regressing it:
> > https://hg.mozilla.org/mozilla-central/rev/bed3081376ca
>
> Try push w/ backout: https://tbpl.mozilla.org/?tree=Try&rev=b86779999d73
>
> I'll try this locally as well but am not sure that'll actually be faster
> than this try push...
FWIW, locally at least this doesn't help
Comment 5•11 years ago
|
||
Tim and I are looking at this
Assignee: nobody → ttaubert
Status: NEW → ASSIGNED
Assignee | ||
Comment 6•11 years ago
|
||
I wonder, isn't bug 932867 a different case of leak than what the CC graph analyzer is supposed to catch? I thought the analyzer catches when there is some JS property that holds a reference to a DOM Element or Window that has been removed from the DOM Tree.
How ever the FrameWorker stuff inserts an iframe into the hiddenDOMWindow and just never uses it again. So it's a valid DOM Element kept alive but the hiddenWindow, we just never use it again.
Is that a leak we should catch?
Comment 7•11 years ago
|
||
That is not a "leak". We just end up creating more and more iframes, but never remove them from
DOM.
We could and should perhaps count window objects which are alive, and make sure that count doesn't
increase much over the time.
Assignee | ||
Comment 8•11 years ago
|
||
Bug 839193 (which turns up in khuey's log [1] as well) seems to have introduced a test leak by never removing observers, right? Shouldn't that be detected by our leak detection? It also doesn't show up in about:cc.
[1] https://khuey.pastebin.mozilla.org/3378934
Comment 9•11 years ago
|
||
(In reply to Tim Taubert [:ttaubert] from comment #8)
> Bug 839193 (which turns up in khuey's log [1] as well) seems to have
> introduced a test leak by never removing observers, right? Shouldn't that be
> detected by our leak detection? It also doesn't show up in about:cc.
>
> [1] https://khuey.pastebin.mozilla.org/3378934
Status update: Tim and I found that enabling allTraces() did make the test suite report a bunch of stuff as leaking, but that is likely overreporting (ie reporting things which aren't actually leaks). Just blanket using that doesn't seem like it'd help us here.
Tim also noticed that when http://mxr.mozilla.org/mozilla-central/source/dom/base/nsGlobalWindow.cpp#1629 runs, if (MOZ_UNLIKELY(cb.WantDebugInfo())) is false. This by itself may not affect things too much, but it's strange that that flag is not present, as AFAICT it shouldn't be.
Comment 10•11 years ago
|
||
(In reply to Tim Taubert [:ttaubert] from comment #8)
> Bug 839193 (which turns up in khuey's log [1] as well) seems to have
> introduced a test leak by never removing observers, right? Shouldn't that be
> detected by our leak detection? It also doesn't show up in about:cc.
What kind of observers? We try to mark strong observers black "certainly alive", since we have
tons of stuff using observer service and we need to get the normal CC graphs small.
It is a thin line between a leak and valid observer service use.
Comment 11•11 years ago
|
||
I guess we should put back part 1 from bug 728294. That would catch some leaks.
But we need something where we check the number of windows more often.
Comment 12•11 years ago
|
||
fwiw, I don't think bug 863447 has blame here, it is not faking or hiding anything, we care to detect things that are indefinitely leaked, not things that are "leaked" for some milliseconds while async work completes.
(In reply to Marco Bonardo [:mak] from comment #12)
> fwiw, I don't think bug 863447 has blame here, it is not faking or hiding
> anything, we care to detect things that are indefinitely leaked, not things
> that are "leaked" for some milliseconds while async work completes.
I agree. This appears to be a different "type" of leak that we weren't checking before. Based on that I don't think we should block reopening the tree on this.
Assignee | ||
Comment 14•11 years ago
|
||
+1. This is an old feature we should bring back but this could take a little while to get right with all the try runs etc. I don't think we should block trees any further.
Assignee | ||
Comment 15•11 years ago
|
||
This seems to work fine and is mostly code that we removed in bug 728294. I'm not sure how to detect if we're in a debug build (which isn't actually necessary as we just don't have any lines to parse) and we should not run with webapprtChrome=true.
Attachment #825175 -
Flags: feedback?(ted)
Assignee | ||
Comment 16•11 years ago
|
||
Both tests are leaking according to the shutdown leak detector. They're not anymore with these fixes applied.
Attachment #825177 -
Flags: review?(bugs)
Updated•11 years ago
|
Attachment #825177 -
Flags: review?(bugs) → review+
Reporter | ||
Comment 17•11 years ago
|
||
Comment on attachment 825177 [details] [diff] [review]
Fix leaks in dom/ mochitests
remote: https://hg.mozilla.org/integration/mozilla-inbound/rev/acd7840da834
Reporter | ||
Updated•11 years ago
|
Whiteboard: [MemShrink] → [MemShrink] [leave open]
Assignee | ||
Comment 18•11 years ago
|
||
Here's a try run with most of the recent leak fixes and the leak detection patch:
https://tbpl.mozilla.org/?tree=Try&rev=2a58c0f29f88
Assignee | ||
Comment 19•11 years ago
|
||
Added some code that removes the false positives caused by the sidebar and the BrowserNewTabPreloader.
Attachment #825175 -
Attachment is obsolete: true
Attachment #825175 -
Flags: feedback?(ted)
Attachment #825296 -
Flags: feedback?(ted)
Comment 20•11 years ago
|
||
Comment on attachment 825296 [details] [diff] [review]
Bring back the shutdown leak detector, v2
Review of attachment 825296 [details] [diff] [review]:
-----------------------------------------------------------------
::: build/automationutils.py
@@ +558,5 @@
> + if line[2:11] == "DOMWINDOW":
> + self._logWindow(line)
> + elif line[2:10] == "DOCSHELL":
> + self._logDocShell(line)
> + elif line.startswith("TEST-START"):
Kinda wish this was using the existing currentTest stuff, but it's not that big of a deal.
::: testing/mochitest/runtests.py
@@ +833,5 @@
> args.append('-foreground')
> if testUrl:
> args.append(testUrl)
>
> + #if debug build and not webapprtChrome
You can say if mozinfo.info["debug"] and not options.webapprtChrome:
@@ +1052,5 @@
> ### output processing
>
> class OutputHandler(object):
> """line output handler for mozrunner"""
> + def __init__(self, harness, utilityPath, symbolsPath=None, dump_screen_on_timeout=True, shutdownLeaks = None):
nit: style here seems to be no spaces around =
Attachment #825296 -
Flags: feedback?(ted) → feedback+
Assignee | ||
Comment 21•11 years ago
|
||
Here's a new patch that fixes the false positive for the TabView window. It doesn't address ted's comments, yet.
Attachment #825296 -
Attachment is obsolete: true
Comment 22•11 years ago
|
||
Updating the summary to what seems to be the current goal.
Summary: Work out why the cycle collection leak analysis from bug 728294 is no longer working → Bring back the shutdown leak detector
Reporter | ||
Updated•11 years ago
|
Severity: blocker → critical
Assignee | ||
Comment 24•11 years ago
|
||
(In reply to Ted Mielczarek [:ted.mielczarek] from comment #20)
> > + #if debug build and not webapprtChrome
>
> You can say if mozinfo.info["debug"] and not options.webapprtChrome:
'options' isn't available in runApp() so I think I'll just pass a 'webapprtChrome' argument?
Comment 25•11 years ago
|
||
Ah. Yeah, do that.
Assignee | ||
Comment 26•11 years ago
|
||
Patch addresses Ted's comments and should fix some more false positives. Needs fx-team tip (until that is merged).
Attachment #825366 -
Attachment is obsolete: true
Assignee | ||
Comment 27•11 years ago
|
||
Latest patch that needs the patch from bug 934091 to work.
Attachment #826089 -
Attachment is obsolete: true
Comment 29•11 years ago
|
||
Comment on attachment 826307 [details] [diff] [review]
Fix leaks in /browser/base/content/test/general/
>+ function windowClosed(subject) {
>+ if (subject == win) {
>+ Services.obs.removeObserver(newDocumentShown, "document-shown");
>+ Services.obs.removeObserver(windowClosed, "domwindowclosed");
>+ }
>+ }
>+
>+ // Make sure to remove the 'document-shown' observer in case the window
>+ // is being closed right after it was opened to avoid leaking.
> Services.obs.addObserver(newDocumentShown, "document-shown", false);
>+ Services.obs.addObserver(windowClosed, "domwindowclosed", false);
Can you use the unload event?
Assignee | ||
Comment 30•11 years ago
|
||
(In reply to Dão Gottwald [:dao] from comment #29)
> >+ // Make sure to remove the 'document-shown' observer in case the window
> >+ // is being closed right after it was opened to avoid leaking.
> > Services.obs.addObserver(newDocumentShown, "document-shown", false);
> >+ Services.obs.addObserver(windowClosed, "domwindowclosed", false);
>
> Can you use the unload event?
I just tried and it doesn't work. Looks like we're closing the window way too early to even receive an unload event.
Updated•11 years ago
|
Attachment #826307 -
Flags: review?(jaws) → review+
Comment 31•11 years ago
|
||
Comment on attachment 826283 [details] [diff] [review]
Bring back the shutdown leak detector, v5
Review of attachment 826283 [details] [diff] [review]:
-----------------------------------------------------------------
::: build/automationutils.py
@@ +603,5 @@
> + self.leakedWindows[id] = self._parseValue(line, "url")
> +
> + def _logDocShell(self, line):
> + created = line[:2] == "++"
> + id = self._parseValue(line, "pid") + "." + self._parseValue(line, "id")
FWIW, I was using this and found a problem here - self._parseValue may return None, causing a TypeError trying to add a string and None. This was probably caused by me applying just this patch - there's no "pid" in docShell lines in my output.
(The symptoms were also really bizarre - exceptions weren't written anywhere and it seemed to cause the Fx process to simply hang writing to stdout - I doubt they are related to this patch though, but it sure confused me and took me quite some time to work out the problem was here...)
Assignee | ||
Comment 32•11 years ago
|
||
Comment 33•11 years ago
|
||
Landed the leak fixes on Aurora/Beta.
https://hg.mozilla.org/releases/mozilla-aurora/rev/dee754949ca0
https://hg.mozilla.org/releases/mozilla-beta/rev/65fd507af2cc
Assignee | ||
Comment 35•11 years ago
|
||
(In reply to Mark Hammond [:markh] from comment #31)
> FWIW, I was using this and found a problem here - self._parseValue may
> return None, causing a TypeError trying to add a string and None. This was
> probably caused by me applying just this patch - there's no "pid" in
> docShell lines in my output.
Good point, I will address this in the next version of the patch and add some better error handling.
Comment 36•11 years ago
|
||
Assignee | ||
Comment 37•11 years ago
|
||
Together with the patches from bug 934091 and bug 934206 we have a green try run without any leaks \o/. I modified the error handling to explicitly print an error message should we somehow ever get interleaved log lines again. I added some cleanup to browser-test.js to avoid false positives for features that explicitly may keep DOMWindows and DocShells alive until shutdown.
https://tbpl.mozilla.org/?tree=Try&rev=571ad96c3957
Attachment #826283 -
Attachment is obsolete: true
Attachment #827536 -
Flags: review?(ted)
Assignee | ||
Comment 38•11 years ago
|
||
Small fix for Android mochitests: added a default value for the new |webapprtChrome| argument in build/automation.py.in. All running fine now:
https://tbpl.mozilla.org/?tree=Try&rev=6dc945b3212e
Attachment #827536 -
Attachment is obsolete: true
Attachment #827536 -
Flags: review?(ted)
Attachment #829885 -
Flags: review?(ted)
Assignee | ||
Comment 39•11 years ago
|
||
Latest try push with the leak detector:
https://tbpl.mozilla.org/?tree=Try&rev=21a5d107d49f
Updated•11 years ago
|
Whiteboard: [MemShrink] [leave open] → [MemShrink:P1] [leave open]
Comment 40•11 years ago
|
||
ttaubert, how close is this to landing?
Assignee | ||
Comment 41•11 years ago
|
||
(In reply to Nicholas Nethercote [:njn] from comment #40)
> ttaubert, how close is this to landing?
This is waiting for review and should be good to go then.
Comment 42•11 years ago
|
||
ted: eight day review ping. This is an important patch. Thanks.
Comment 43•11 years ago
|
||
Comment on attachment 829885 [details] [diff] [review]
Bring back the shutdown leak detector, v7
Review of attachment 829885 [details] [diff] [review]:
-----------------------------------------------------------------
Sorry, I was travelling last week and have been generally behind on reviews.
Attachment #829885 -
Flags: review?(ted) → review+
Assignee | ||
Comment 44•11 years ago
|
||
Whiteboard: [MemShrink:P1] [leave open] → [MemShrink:P1]
Comment 45•11 years ago
|
||
\o/
Comment 46•11 years ago
|
||
Backed out for mochitest-mc perma-purple.
https://hg.mozilla.org/integration/fx-team/rev/e29cc01043fb
https://tbpl.mozilla.org/php/getParsedLog.php?id=30817704&tree=Fx-Team
Assignee | ||
Comment 47•11 years ago
|
||
Sorry about that. I pushed to try before but ran only mochitest-bc tests.
Assignee | ||
Comment 48•11 years ago
|
||
Pushed to try again with some minor fixes to account for metro:
https://tbpl.mozilla.org/?tree=Try&rev=04ab4ad2f6ba
Assignee | ||
Comment 49•11 years ago
|
||
Pushed to try again with a smaller and better fix:
https://tbpl.mozilla.org/?tree=Try&rev=db126ca7ee5a
Seems all green, waiting a little more to make sure.
Assignee | ||
Comment 50•11 years ago
|
||
Try was green-ish. Landed again:
https://hg.mozilla.org/integration/fx-team/rev/45dda1081309
Comment 51•11 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/45dda1081309
\m/ frlz this time :)
Status: ASSIGNED → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla28
You need to log in
before you can comment on or make changes to this bug.
Description
•