Closed
Bug 497808
Opened 16 years ago
Closed 12 years ago
GMail window leak
Categories
(Core :: DOM: Core & HTML, defect)
Tracking
()
RESOLVED
WORKSFORME
People
(Reporter: roc, Assigned: peterv)
References
Details
(Keywords: memory-leak, Whiteboard: [softblocker])
Attachments
(2 files, 6 obsolete files)
I don't know if this is really our problem or GMail's. But for a while now on trunk, I've been seeing sluggish performance that gradually gets worse over time --- very frequent short pauses that get longer over time. Restart fixes it. Sometimes I've attached a debugger and collected some stack traces; sometimes we're in GMail, with JSON message data on the JS stack, and often we're in the cycle collector.
Just now performance was getting bad again so I tried closing the GMail window. I got the following console output:
JavaScript error: https://mail.google.com/mail/?ui=2&view=js&name=js&ver=aPCJgb7NS-E.en.L&am=b1EopeS3cCHoAb0i0bA2xsRfYkZ2Tw&random=1244409062175, line 22: Fb is not defined
--DOMWINDOW == 231 (0x8780500) [serial = 711] [outer = 0x2529a580] [url = about:blank]
--DOMWINDOW == 230 (0x36fcb9b0) [serial = 7656] [outer = 0x3cade7f0] [url = about:blank]
--DOMWINDOW == 229 (0x23ca4dc0) [serial = 706] [outer = 0xf38eca0] [url = https://mail.google.com/mail/?ui=2&view=bsp&ver=1qygpcgurkovy]
--DOMWINDOW == 228 (0x225d2ad0) [serial = 707] [outer = 0xf38d620] [url = https://mail.google.com/mail/?ui=2&view=bsp&ver=1qygpcgurkovy]
--DOMWINDOW == 227 (0x3cd75860) [serial = 7653] [outer = 0x30036c10] [url = about:blank]
--DOMWINDOW == 226 (0x30036c40) [serial = 7651] [outer = 0x0] [url = https://mail.google.com/mail/?ui=2&ik=17e10128ca&at=xn3j310qfv6pa5kvv7boxakigulx5l&view=up&act=sm&jsid=9ym2eb-79ye1s&cmid=71&rt=h&search=all&zx=s5pe6o-3tocvq]
--DOMWINDOW == 225 (0x367dd970) [serial = 7635] [outer = 0x3e594b50] [url = about:blank]
--DOMWINDOW == 224 (0x3e594b80) [serial = 7634] [outer = 0x0] [url = about:blank]
--DOMWINDOW == 223 (0x3f379390) [serial = 7614] [outer = 0x38eca1f0] [url = about:blank]
--DOMWINDOW == 222 (0x38eca220) [serial = 7612] [outer = 0x0] [url = https://mail.google.com/mail/?ui=2&ik=17e10128ca&at=xn3j310qfv6pa5kvv7boxakigulx5l&view=up&act=sm&jsid=9ym2eb-79ye1s&cmid=70&rt=h&search=inbox&zx=33bdgt-idotca]
... etc etc etc ...
--DOMWINDOW == 97 (0x23dd7230) [serial = 785] [outer = 0x2c8ba870] [url = about:blank]
--DOMWINDOW == 96 (0x2c8ba8a0) [serial = 783] [outer = 0x0] [url = https://mail.google.com/mail/?ui=2&ik=17e10128ca&at=xn3j310qfv6pa5kvv7boxakigulx5l&view=up&act=sm&jsid=9ym2eb-79ye1s&cmid=2&rt=h&cat=WHATWG&search=cat&zx=t4svdl-7bzy3y]
--DOMWINDOW == 95 (0xf3372e0) [serial = 755] [outer = 0xf4569d0] [url = about:blank]
--DOMWINDOW == 94 (0xf456a00) [serial = 753] [outer = 0x0] [url = https://mail.google.com/mail/?ui=2&ik=17e10128ca&at=xn3j310qfv6pa5kvv7boxakigulx5l&view=up&act=sm&jsid=9ym2eb-79ye1s&cmid=1&rt=h&cat=WHATWG&search=cat&zx=oectlf-6bmnp9]
--DOCSHELL 0xcbbab00 == 30
--DOMWINDOW == 93 (0xf38ecd0) [serial = 703] [outer = 0x0] [url = https://mail.google.com/mail/?ui=2&view=bsp&ver=1qygpcgurkovy]
--DOMWINDOW == 92 (0xf38d650) [serial = 704] [outer = 0x0] [url = https://mail.google.com/mail/?ui=2&view=bsp&ver=1qygpcgurkovy]
--DOMWINDOW == 91 (0x2529a5b0) [serial = 710] [outer = 0x0] [url = about:blank]
--DOMWINDOW == 90 (0x3cade820) [serial = 7655] [outer = 0x0] [url = about:blank]
++DOMWINDOW == 91 (0x334842e0) [serial = 7659] [outer = 0x3f64f690]
--DOMWINDOW == 90 (0xf5ac670) [serial = 708] [outer = 0x249ad1c0] [url = https://mail.google.com/mail/?ui=2&view=bsp&ver=1qygpcgurkovy]
--DOMWINDOW == 89 (0x2ad64250) [serial = 702] [outer = 0x0] [url = https://mail.google.com/mail/?ui=2&view=js&name=js&ver=aPCJgb7NS-E.en.L&am=b1EopeS3cCHoAb0i0bA2xsRfYkZ2Tw&random=1244409062175]
--DOMWINDOW == 88 (0xcbbb2c0) [serial = 7] [outer = 0x0] [url = https://mail.google.com/mail/?ui=2&zx=2s9zdrkhzj2w&shva=1]
--DOMWINDOW == 87 (0x3e72ee10) [serial = 7657] [outer = 0x0] [url = https://mail.google.com/mail/?ui=2&ik=17e10128ca&view=tl&start=0&num=120&rt=h&cat=Bugzilla&search=cat]
--DOMWINDOW == 86 (0x2f96d4e0) [serial = 7641] [outer = 0x3f64f690] [url = http://tinderbox.mozilla.org/showbuilds.cgi?tree=Firefox&panel=1]
--DOMWINDOW == 85 (0x226513e0) [serial = 709] [outer = 0x0] [url = https://mail.google.com/mail/?ui=2&view=js&name=js&ver=aPCJgb7NS-E.en.L&am=b1EopeS3cCHoAb0i0bA2xsRfYkZ2Tw&random=1244409062175]
--DOMWINDOW == 84 (0x2ed16a60) [serial = 7658] [outer = 0x0] [url = about:blank]
--DOMWINDOW == 83 (0x832df20) [serial = 701] [outer = 0x0] [url = https://mail.google.com/mail/?ui=2&zx=2s9zdrkhzj2w&shva=1]
So that's about 150-odd DOM windows hanging around associated with GMail ... frequent cycle collection traversal of those windows (perhaps triggered by GMail's polling logic?) might explain the pauses and other symptoms I've been seeing.
This problem is really, really bad. It's been bothering me for a while now, it's hard to pin down, and I haven't had enough information to really file a bug until now. But it's a disastrous user experience.
Reporter | ||
Comment 1•16 years ago
|
||
So next time I notice this happening, what should I do to help figure it out?
I don't understand nsGlobalWindow very well, but it looks like we have a whole lot of outer windows whose URIs correspond to GMail actions, and each one of those has an inner window whose URI is about:blank, but no other inners. That seems weird.
Comment 2•16 years ago
|
||
Do you know what the steps-to-reproduce for getting the window count to increase are?
If you want to generate a lot of maybe-useful data, you might try doing a DEBUG_CC build, but moving this line in nsGlobalWindow:
nsCycleCollector_DEBUG_shouldBeFreed(static_cast<nsIScriptGlobalObject*>(this));
from nsGlobalWindow::CleanUp and FreeInnerObjects and SetDocShell to the constructor.
I think that should give you the explanations of why windows are being kept around (along with lots of other noise).
Reporter | ||
Comment 3•16 years ago
|
||
Right now the only steps to reproduce I have are "use GMail for a few days" :-(
I wonder if we could mozmill our way to STR here...
Er... Don't we (or didn't we?!) have code to prevent CC'ing over windows and associated documents that we know are alive? I would think that would prevent this sort of thing unless the documents in question really are orphaned.
(In reply to comment #5)
> code to prevent CC'ing over windows and associated documents that we
> know are alive
Note that we disable that optimization for DEBUG_CC builds.
Comment 7•16 years ago
|
||
That should only apply if the window is still being presented in a document viewer, etc.
Reporter | ||
Comment 8•16 years ago
|
||
(In reply to comment #2)
> Do you know what the steps-to-reproduce for getting the window count to
> increase are?
>
> If you want to generate a lot of maybe-useful data, you might try doing a
> DEBUG_CC build, but moving this line in nsGlobalWindow:
>
> nsCycleCollector_DEBUG_shouldBeFreed(static_cast<nsIScriptGlobalObject*>(this));
> from nsGlobalWindow::CleanUp and FreeInnerObjects and SetDocShell to the
> constructor.
To the nsGlobalWindow constructor? Are you sure? That sounds weird
Assignee | ||
Comment 9•16 years ago
|
||
(In reply to comment #8)
> To the nsGlobalWindow constructor? Are you sure? That sounds weird
I think so, yes. What it will do is assume that the window should have gone away as soon as it is constructed. So you will get data on who holds it alive every time we cycle collect, starting from when it is created. That way you get the data even if the window is being held alive in a document viewer (which could also be causing this memory growth).
Reporter | ||
Comment 10•16 years ago
|
||
I've got a 4GB log file which contains this when I closed my Gmail window:
--DOMWINDOW == 68 (0x3004c930) [serial = 500] [outer = 0x30be6990] [url = about:blank]
--DOMWINDOW == 67 (0x30be69c0) [serial = 499] [outer = 0x0] [url = https://mail.google.com/mail/?ui=2&ik=17e10128ca&view=tl&star
t=0&num=120&rt=h&search=inbox]
--DOMWINDOW == 66 (0x31f06960) [serial = 497] [outer = 0x0] [url = about:blank]
--DOMWINDOW == 65 (0x8083540) [serial = 19] [outer = 0xc7e8ce0] [url = https://mail.google.com/mail/?nocheckbrowser#inbox]
--DOMWINDOW == 64 (0x9e5e040) [serial = 30] [outer = 0x9e59120] [url = about:blank]
--DOMWINDOW == 63 (0x80d92e0) [serial = 24] [outer = 0x80b7a90] [url = https://mail.google.com/mail/?ui=2&view=bsp&ver=1qygpcgur
kovy]
--DOMWINDOW == 62 (0x80e6990) [serial = 25] [outer = 0x80bbe10] [url = https://mail.google.com/mail/?ui=2&view=bsp&ver=1qygpcgur
kovy]
--DOMWINDOW == 61 (0x33250290) [serial = 498] [outer = 0x0] [url = about:blank]
--DOMWINDOW == 60 (0xc9e8880) [serial = 485] [outer = 0x2590c0d0] [url = about:blank]
--DOMWINDOW == 59 (0x2590c100) [serial = 484] [outer = 0x0] [url = about:blank]
--DOMWINDOW == 58 (0x300c2eb0) [serial = 471] [outer = 0x230371e0] [url = about:blank]
--DOMWINDOW == 57 (0x23037210) [serial = 470] [outer = 0x0] [url = about:blank]
--DOMWINDOW == 56 (0x25340760) [serial = 451] [outer = 0x21d66880] [url = about:blank]
--DOMWINDOW == 55 (0x21d668b0) [serial = 450] [outer = 0x0] [url = about:blank]
--DOMWINDOW == 54 (0x21a578c0) [serial = 421] [outer = 0x9fde470] [url = about:blank]
--DOMWINDOW == 53 (0x9fde4a0) [serial = 420] [outer = 0x0] [url = https://mail.google.com/mail/?ui=2&ik=17e10128ca&view=tl&start
=0&num=120&rt=h&cat=Bugzilla&search=cat]
--DOMWINDOW == 52 (0x31f3dc00) [serial = 330] [outer = 0x27f92cd0] [url = about:blank]
--DOMWINDOW == 51 (0x27f92d00) [serial = 329] [outer = 0x0] [url = https://mail.google.com/mail/?ui=2&ik=17e10128ca&view=cv&th=1
2188cb7e2b04f12&cw=700&rt=h&q=alan+chang+telstra&search=query]
--DOMWINDOW == 50 (0x33226650) [serial = 309] [outer = 0x24888010] [url = about:blank]
--DOMWINDOW == 49 (0x24888040) [serial = 307] [outer = 0x0] [url = https://mail.google.com/mail/?ui=2&ik=17e10128ca&at=xn3j310qf
v6pa5kvv7boxakigulx5l&view=up&act=sm&jsid=kfy400-84ugq4&cmid=5&rt=h&search=inbox&zx=lpr9ss-yallte]
--DOMWINDOW == 48 (0x258331b0) [serial = 218] [outer = 0x235b0930] [url = about:blank]
--DOMWINDOW == 47 (0x235b0960) [serial = 217] [outer = 0x0] [url = https://mail.google.com/mail/?ui=2&ik=17e10128ca&view=tl&star
t=0&num=120&rt=h&cat=WHATWG&search=cat]
--DOMWINDOW == 46 (0x2306d500) [serial = 209] [outer = 0x23af91a0] [url = about:blank]
--DOMWINDOW == 45 (0x23af91d0) [serial = 207] [outer = 0x0] [url = https://mail.google.com/mail/?ui=2&ik=17e10128ca&at=xn3j310qf
v6pa5kvv7boxakigulx5l&view=up&act=sm&jsid=kfy400-84ugq4&cmid=4&rt=h&search=inbox&zx=esmeof-6ga91z]
--DOMWINDOW == 44 (0x224a9c80) [serial = 198] [outer = 0x22655730] [url = about:blank]
--DOMWINDOW == 43 (0x22655760) [serial = 196] [outer = 0x0] [url = https://mail.google.com/mail/?ui=2&ik=17e10128ca&at=xn3j310qf
v6pa5kvv7boxakigulx5l&view=up&act=sm&jsid=kfy400-84ugq4&cmid=3&rt=h&search=inbox&zx=du7bil-jld2ok]
--DOMWINDOW == 42 (0x1060ce80) [serial = 176] [outer = 0x21d18b30] [url = about:blank]
--DOMWINDOW == 41 (0x21d18b60) [serial = 174] [outer = 0x0] [url = https://mail.google.com/mail/?ui=2&ik=17e10128ca&at=xn3j310qf
v6pa5kvv7boxakigulx5l&view=up&act=sm&jsid=kfy400-84ugq4&cmid=2&rt=h&search=inbox&zx=inh54-agp8le]
--DOCSHELL 0xc7e8560 == 18
--DOMWINDOW == 40 (0x80b35a0) [serial = 20] [outer = 0x0] [url = https://mail.google.com/mail/?ui=2&view=js&name=js&ver=J-4BWlKS
S54.en.L&am=b1EopdTncAEFBfnukfAWRlMTj7bl1A]
--DOMWINDOW == 39 (0x80b7ac0) [serial = 21] [outer = 0x0] [url = https://mail.google.com/mail/?ui=2&view=bsp&ver=1qygpcgurkovy]
--DOMWINDOW == 38 (0x80bbe40) [serial = 22] [outer = 0x0] [url = https://mail.google.com/mail/?ui=2&view=bsp&ver=1qygpcgurkovy]
--DOMWINDOW == 37 (0x7d24fe0) [serial = 28] [outer = 0x0] [url = https://mail.google.com/mail/?ui=2&view=js&name=js&ver=J-4BWlKS
S54.en.L&am=b1EopdTncAEFBfnukfAWRlMTj7bl1A]
--DOMWINDOW == 36 (0xc7e8d10) [serial = 7] [outer = 0x0] [url = https://mail.google.com/mail/?nocheckbrowser#inbox]
--DOMWINDOW == 35 (0x9e59150) [serial = 29] [outer = 0x0] [url = about:blank]
What should I do with it? :-)
Reporter | ||
Comment 11•16 years ago
|
||
(In reply to comment #9)
> (In reply to comment #8)
> > To the nsGlobalWindow constructor? Are you sure? That sounds weird
>
> I think so, yes. What it will do is assume that the window should have gone
> away as soon as it is constructed. So you will get data on who holds it alive
> every time we cycle collect, starting from when it is created. That way you get
> the data even if the window is being held alive in a document viewer (which
> could also be causing this memory growth).
So what should I see in the output?
I have lots of occurrences of things like
nsCycleCollector: XPCWrappedNative (XULElement) 0x10297160 was not collected due to 1
external references (2 total - 1 known)
But I can't find any occurrences of things like
nsCycleCollector: nsGlobalWindow ...
Reporter | ||
Comment 12•16 years ago
|
||
So can someone tell me what to do with this DEBUG_CC data?
Assignee | ||
Comment 13•16 years ago
|
||
Are there any lines in the log about nsGlobalWindow?
Reporter | ||
Comment 14•16 years ago
|
||
There are reports which mention nsGlobalWindow like this:
nsCycleCollector: XPCWrappedNative (XULElement) 0x24f3d130 was not collected due to
1
external references (2 total - 1 known)
An object expected to be garbage could be reached from it by the path:
XPCWrappedNative (XULElement) 0x24f3d130
via <unknown edge>
JS Object (XULElement) (global=72d9200) 0x82eb360
via _contentWindow
XPCNativeWrapper (Window) (global=781ae80) 0x7c35e20
via focus
JS Object (Function - XPCNativeWrapper function wrapper) (global=781ae80) 0x7139
e00
via __proto__
JS Object (Function) (global=6ead4c0) 0x7bed508
via __parent__
JS Object (ChromeWindow) (global=6ead4c0) 0x6ead4c0
via xpc_GetJSPrivate(obj)
XPCWrappedNative (ChromeWindow) 0x23277370
via <unknown edge>
nsGlobalWindow 0x21815ff0
via mFocusedNode
nsGenericElement 0x9ebbb80
via mWrapper
XPCWrappedNative (XULElement) 0x23c1a590
via <unknown edge>
JS Object (XULElement) (global=6ead4c0) 0x8862b40
via __parent__
JS Object (XULElement) (global=6ead4c0) 0x88629e0
via __parent__
JS Object (XULElement) (global=6ead4c0) 0x88658e0
via __parent__
JS Object (XULElement) (global=6ead4c0) 0x8865480
via xpc_GetJSPrivate(obj)
XPCWrappedNative (XULElement) 0x22135da0
via <unknown edge>
nsGenericElement 0x8929180
via [via hash] mListenerManager
nsEventListenerManager 0xa08c7d0
via mListeners[i] mListener
nsJSEventListener 0x233f1180
via mContext
nsJSContext 0x2284b590
The 1 known references to it were from:
JS Object (XULElement) (global=72d9200) 0x82eb360 via xpc_GetJSPrivate(obj)
Is that relevant?
Reporter | ||
Comment 15•16 years ago
|
||
This is the patch I used to generate the debug output.
I've uploaded the gzipped log file here:
http://people.mozilla.org/~roc/gmail-foo-bar.gz
It compressed really well ... 4GB to about 70MB.
If you go to the end of the file and search backwards for mail.google.com you'll find the lines that show several DOM windows being released when I closed the tab.
Reporter | ||
Comment 16•15 years ago
|
||
I think I'm still seeing this bug. I'm seeing it in my current session; I've got about 700 DOMWindows with only a few tabs open. Closing the GMail tab does not release the DOMWindows. The browser hangs every few seconds doing cycle collection (I can tell since I have a debugger attached). Anything I can do to help here?
Comment 17•15 years ago
|
||
You could try with https://addons.mozilla.org/en-US/firefox/addon/2490 (also see http://dbaron.org/mozilla/leak-monitor/ ), assuming it still works on trunk.
Reporter | ||
Comment 18•15 years ago
|
||
OK, I'll try that.
After I shut down that browser session, it turned out that the leaked windows were not GMail windows (I guess that's good). They're mostly Firebug windows:
--DOMWINDOW == 147 (0x32c9a960) [serial = 990] [outer = 0x0] [url = chrome://browser/content/browser.xul]
--DOMWINDOW == 146 (0x2e747a00) [serial = 973] [outer = 0x2b84fe70] [url = chrome://firebug/content/panel.html]
--DOMWINDOW == 145 (0x221b47a0) [serial = 972] [outer = 0x288a01a0] [url = chrome://firebug/content/panel.html]
--DOMWINDOW == 144 (0x2b84fea0) [serial = 968] [outer = 0x0] [url = chrome://firebug/content/panel.html]
--DOMWINDOW == 143 (0x288a01d0) [serial = 967] [outer = 0x0] [url = chrome://firebug/content/panel.html]
So maybe this bug is fixed...
Reporter | ||
Comment 19•15 years ago
|
||
Leak monitor runs on trunk, but it pops up multiple windows almost every time I close a chrome window. I'm not sure if these are real issues or not.
Reporter | ||
Comment 20•15 years ago
|
||
I seem to be in a state where closing a leakmonitor window causes a new leak report :-(. I'm going to have to turn this off.
Comment 21•15 years ago
|
||
bug 563826 might help here.
Comment 22•15 years ago
|
||
Roc, bug 563826 landed ~5 days ago, any idea yet as to whether it made things better here or not?
Reporter | ||
Comment 23•15 years ago
|
||
I had a build with this patch running for a few days but before I could close the GMail window to check, my machine crashed :-(
Reporter | ||
Comment 24•15 years ago
|
||
OK, I just shut down and it looks like the leak is still here.
Reporter | ||
Comment 25•15 years ago
|
||
Could we get this debugging patch on trunk? I'd like to stop having to maintain it in my queue, and it could be useful for other people and other problems. I've made it so that you can dump the graph whenever DEBUG or DEBUG_CC is defined.
Attachment #444484 -
Flags: review?(peterv)
Assignee | ||
Comment 26•14 years ago
|
||
(In reply to comment #25)
> I've made it so that you can dump the graph whenever DEBUG or DEBUG_CC is
> defined.
Actually, the patch seems to enable dumping the graph for all builds?
Reporter | ||
Comment 27•14 years ago
|
||
Yes, actually it does, but the cost for non-debug builds when you're not collecting a graph is minimal, I think.
Comment 28•14 years ago
|
||
I've been debugging GMail under DEBUG_CC, mostly concentrating on messages of this type:
> nsCycleCollector: nsGenericElement 0x11924bb70 was not collected due to 2
> external references (5 total - 3 known)
I'm sure my technique is unsophisticated, but here's what I've found.
I made a little function called PrintElement that supplies some default arguments to nsGenericElement::List. It prints out the element name, attributes, and children.
Then, I set a breakpoint on that cycle collector message. Once I catch one, I pull out the pi->mPointer value and cast it to whatever it's supposed to be in gdb.
So far, it's looking like accessibility code is probably the culprit in the elements I've examined so far.
(gdb) call $5.PrintElement()
<div tabindex="-1" role="menu" class="b7 J-M" style="-moz-user-select: none; display: none;" aria-haspopup="true">
</div>
(gdb) call $2.PrintElement()
<input type="text" class="dI dG" label="Search, add, or invite" title="Search, add, or invite" ignoreesc="true" autocomplete="off" id=":vd" aria-haspopup="true">
</input>
(gdb) call $8.PrintElement()
<div class="nH qp" role="navigation">
<div id="gbar">
<nobr>
...
Comment 29•14 years ago
|
||
looking over the /accessible/ directory, I see basically none of it is cycle collected. We should probably write a Dehydra script that prints out nsCOMPtr and other sorts of members that we would expect to be included in a traverse and unlink implementation, but aren't. That might be better than tracking these down in web pages.
Comment 30•14 years ago
|
||
Alright, a11y is not my problem, but it does look fishy that it's not CCed.
Reporter | ||
Comment 31•14 years ago
|
||
(In reply to comment #27)
> Yes, actually it does, but the cost for non-debug builds when you're not
> collecting a graph is minimal, I think.
Peter? Can we get this reviewed and landed please?
Assignee | ||
Comment 32•14 years ago
|
||
Comment on attachment 444484 [details] [diff] [review]
debug patch
We should still always output graphs if DEBUG_CC is set. It looks like this patch disables that.
>diff --git a/dom/base/nsDOMWindowUtils.cpp b/dom/base/nsDOMWindowUtils.cpp
>+nsDOMWindowUtils::GarbageCollect(PRBool aDrawGraph)
>- nsJSContext::CC();
>- nsJSContext::CC();
>+ nsJSContext::CC(aDrawGraph);
>+ nsJSContext::CC(aDrawGraph);
Just drop one of these calls, callers can call twice if needed.
>diff --git a/dom/base/nsJSEnvironment.h b/dom/base/nsJSEnvironment.h
>+ static void CC(PRBool aDrawGraph = PR_FALSE);
I'd rather not use a default argument.
>diff --git a/dom/interfaces/base/nsIDOMWindowUtils.idl b/dom/interfaces/base/nsIDOMWindowUtils.idl
>@@ -254,17 +254,17 @@ interface nsIDOMWindowUtils : nsISupport
>
> /**
> * Force a garbage collection. This will run the cycle-collector twice to
> * make sure all garbage is collected.
> *
> * Will throw a DOM security error if called without UniversalXPConnect
> * privileges in non-debug builds. Available to all callers in debug builds.
> */
>- void garbageCollect();
>+ void garbageCollect(in boolean aDrawGraph);
Add a comment describing the argument.
>diff --git a/xpcom/base/nsCycleCollector.cpp b/xpcom/base/nsCycleCollector.cpp
> #ifdef DEBUG_CC
>+#define DEBUG
This seems weird, I think you should use a new define (DEBUG_DESCRIBE_EDGE_NAME?) that's defined if either DEBUG_CC or DEBUG are defined.
> GCGraphBuilder::GCGraphBuilder(GCGraph &aGraph,
>- nsCycleCollectionLanguageRuntime **aRuntimes)
>+ nsCycleCollectionLanguageRuntime **aRuntimes,
>+ PRBool aDrawGraph)
> : mNodeBuilder(aGraph.mNodes),
> mEdgeBuilder(aGraph.mEdges),
> mRuntimes(aRuntimes)
> {
> if (!PL_DHashTableInit(&mPtrToNodeMap, &PtrNodeOps, nsnull,
> sizeof(PtrToNodeEntry), 32768))
> mPtrToNodeMap.ops = nsnull;
>-#ifdef DEBUG_CC
>- // Do we need to set these all the time?
>- mFlags |= nsCycleCollectionTraversalCallback::WANT_DEBUG_INFO |
>- nsCycleCollectionTraversalCallback::WANT_ALL_TRACES;
>-#endif
>+ if (aDrawGraph) {
>+ mFlags |= nsCycleCollectionTraversalCallback::WANT_DEBUG_INFO |
>+ nsCycleCollectionTraversalCallback::WANT_ALL_TRACES;
Those flags still need to be always set if DEBUG_CC is defined.
> GCGraphBuilder::NoteXPCOMChild(nsISupports *child)
> {
>-#ifdef DEBUG_CC
> nsCString edgeName(mNextEdgeName);
> mNextEdgeName.Truncate();
>-#endif
Shouldn't this be:
nsCString edgeName;
if (WantDebugInfo()) {
edgeName.Assign(mNextEdgeName);
mNextEdgeName.Truncate();
}
(here and below).
> nsCycleCollector::MaybeDrawGraphs()
>+ // We can't just use _popen here because graphviz-for-windows
>+ // doesn't set up its stdin stream properly, sigh.
Just drop this comment.
Minusing because we should still always have the logging in DEBUG_CC builds.
Attachment #444484 -
Flags: review?(peterv) → review-
Reporter | ||
Comment 33•14 years ago
|
||
All comments addressed
Attachment #462562 -
Flags: review?(peterv)
Reporter | ||
Updated•14 years ago
|
Attachment #444484 -
Attachment is obsolete: true
Assignee | ||
Comment 34•14 years ago
|
||
Comment on attachment 462562 [details] [diff] [review]
debugging patch v2
>diff --git a/xpcom/base/nsCycleCollector.cpp b/xpcom/base/nsCycleCollector.cpp
>--- a/xpcom/base/nsCycleCollector.cpp
>+++ b/xpcom/base/nsCycleCollector.cpp
> NS_IMETHODIMP_(void)
> GCGraphBuilder::DescribeNode(CCNodeType type, nsrefcnt refCount,
> size_t objSz, const char *objName)
> {
> #ifdef DEBUG_CC
> mCurrPi->mBytes = objSz;
>- mCurrPi->mName = PL_strdup(objName);
>+#endif
>+#ifdef DEBUG_CC_GRAPHS
>+ if (WantDebugInfo()) {
>+ mCurrPi->mName = PL_strdup(objName);
>+ }
> #endif
I'm ok with doing this, but the graphs will not be very useful without the name I think. I think I'd be equally ok with dropping DEBUG_CC_GRAPHS altogether.
> GCGraphBuilder::NoteScriptChild(PRUint32 langID, void *child)
> {
>-#ifdef DEBUG_CC
> nsCString edgeName(mNextEdgeName);
> mNextEdgeName.Truncate();
nsCString edgeName;
if (WantDebugInfo()) {
edgeName.Assign(mNextEdgeName);
mNextEdgeName.Truncate();
}
>-#endif
>@@ -2429,16 +2422,22 @@ nsCycleCollector::Collect(PRUint32 aTryC
>+ PRBool drawGraphs = mParams.mDrawGraphs;
>+ mParams.mDrawGraphs |= aDrawGraph;
>+#ifdef DEBUG_CC
>+ mParams.mDrawGraphs = PR_TRUE;
>+#endif
After looking at the file before and after I have to say I was wrong. We don't always draw in DEBUG_CC build, so drop these three lines again. Sorry.
Attachment #462562 -
Flags: review?(peterv) → review+
Reporter | ||
Updated•14 years ago
|
Whiteboard: [needs landing]
Reporter | ||
Comment 35•14 years ago
|
||
(In reply to comment #34)
> I think I'd be equally ok with dropping DEBUG_CC_GRAPHS altogether.
If we drop DEBUG_CC_GRAPHS altogether then release builds will have to copy edge-name strings around. That sounds like a perf hit we don't want. Instead I will wrap DEBUG_CC_GRAPHS around all the graph drawing stuff so that release builds can never draw graphs, since the graphs would be useless with all unknown edge labels
Reporter | ||
Comment 36•14 years ago
|
||
Attachment #383846 -
Attachment is obsolete: true
Attachment #462562 -
Attachment is obsolete: true
Comment 37•14 years ago
|
||
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Updated•14 years ago
|
Assignee: nobody → roc
Whiteboard: [needs landing]
Updated•14 years ago
|
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Reporter | ||
Comment 38•14 years ago
|
||
This was landed but was backed out. There were a few problems:
1) build bustage where in some configurations dlldeps.cpp calls nsCycleCollector_collect(), and needed to pass a value for aDrawGraphs
2) we were only calling the destructor for mEdgeNames #ifdef DEBUG_CC, so we appeared to leak a lot (although the leak wasn't real, since the array was always empty)
3) a number of tests call DOMWindowUtils.garbageCollect, and failed because they weren't passing a parameter.
Reporter | ||
Comment 39•14 years ago
|
||
1) fixed call in dlldeps.cpp
2) made parameter to DOMWindowUtils.garbageCollect optional
3) made mEdgeNames only defined #ifdef DEBUG_CC_GRAPHS, made us iterate through and call Destroy on all nodes #ifdef DEBUG_CC_GRAPHS. This fixes the leak.
Attachment #465040 -
Attachment is obsolete: true
Attachment #465141 -
Flags: review?(peterv)
Reporter | ||
Updated•14 years ago
|
blocking2.0: --- → ?
Whiteboard: [needs review]
Assignee | ||
Comment 40•14 years ago
|
||
(In reply to comment #35)
> If we drop DEBUG_CC_GRAPHS altogether then release builds will have to copy
> edge-name strings around. That sounds like a perf hit we don't want.
So your previous patch already did that, my comment was about object names. Given that we wrap everything in |WantDebugInfo()|, I don't see why we care about the performance hit. Graphs without descriptions of nodes and edges sound pretty useless, no?
Reporter | ||
Comment 41•14 years ago
|
||
You're right. But there's also a cost for conditional branches (including where we destroy empty nsTArrays etc), and a space cost in PtrInfo (which is called out as being something that needs to be kept small) --- we're adding two pointers to it.
I really don't want to hurt cycle collector performance. Wrapping everything in DEBUG_CC_GRAPHS means I don't have to worry about it, otherwise I think we're going to have to measure it, and I don't want to do that right now.
Reporter | ||
Comment 42•14 years ago
|
||
Oh, the other performance cost is that in DEBUG_CC_GRAPHS we have to iterate through all PtrInfos to destroy them all. I bet that's measurable. I suppose we could optimize that away though.
I suppose it would be pretty useful to be able to dump graphs from release builds. Maybe we could even expose the graph in an extension or something.
Reporter | ||
Comment 43•14 years ago
|
||
OK, here's the version which enables labelled graphs in release builds.
With this patch, you can get a graph in any build by opening the Error Console and evaluating
window.QueryInterface(Components.interfaces.nsIInterfaceRequestor).getInterface(Components.interfaces.nsIDOMWindowUtils).garbageCollect(true)
Attachment #465533 -
Flags: review?(peterv)
Reporter | ||
Comment 44•14 years ago
|
||
It would be cool if we could have an nsICycleCollectorGraphSink interface that we could pass in as a parameter to garbageCollect(), then extensions could capture the graph and dump it or do other things with it.
Assignee | ||
Comment 45•14 years ago
|
||
Comment on attachment 465533 [details] [diff] [review]
debugging patch, with graphs always enabled
> GCGraphBuilder::NoteScriptChild(PRUint32 langID, void *child)
> {
>-#ifdef DEBUG_CC
> nsCString edgeName(mNextEdgeName);
> mNextEdgeName.Truncate();
>-#endif
nsCString edgeName;
if (WantDebugInfo()) {
edgeName.Assign(mNextEdgeName);
mNextEdgeName.Truncate();
}
I'm ok with this. Adding two pointers to PtrInfo is unfortunate, but if we do want graphs in all builds then we need to bite the bullet. The only other option I see is moving everything behind |#ifdef DEBUG|.
Attachment #465533 -
Flags: review?(peterv) → review+
Assignee | ||
Comment 46•14 years ago
|
||
Comment on attachment 465141 [details] [diff] [review]
debugging patch v4
This is wrong in any case, as it doesn't put all of the edgename code behind DEBUG_CC_GRAPHS.
Attachment #465141 -
Flags: review?(peterv) → review-
Comment 47•14 years ago
|
||
We could avoid adding to PtrInfo if we dump some of the information as we're building the graph. It would require a little more postprocessing, though.
Comment 48•14 years ago
|
||
(For example, one could dump all the nodes (with names), edges (with names), and reference counts while building the graph, and then the internal-refs and colors at the end.)
Reporter | ||
Comment 49•14 years ago
|
||
Should I do that?
Reporter | ||
Comment 50•14 years ago
|
||
I landed the patch, but it caused leaks on all debug boxes so was backed out again.
I'm really too busy and frustrated to work on this more right now. But we really need the GMail bug fixed. This needs to land ASAP so we can get the data to do that. I hope someone who understands the cycle collector will be able to do this.
Assignee: roc → nobody
Whiteboard: [needs review]
Updated•14 years ago
|
Assignee: nobody → peterv
Assignee | ||
Comment 52•14 years ago
|
||
I've put a better logging patch in bug 466157. It already had a reviewed patch for logging, but I've incorporated some of the ideas from this bug (added an interface to listen to the building of the graph, implemented a logging component in all builds without too much overhead).
Updated•14 years ago
|
Whiteboard: softblocker
Updated•14 years ago
|
Whiteboard: softblocker → [softblocker]
Updated•14 years ago
|
Whiteboard: [softblocker]
Updated•14 years ago
|
Whiteboard: [softblocker]
Reporter | ||
Comment 54•14 years ago
|
||
Alex, are you still seeing this? We found one leak related to FAYT and fixed it. I'm not sure I'm seeing this anymore. (I'm not sure I'm NOT seeing it, either.)
Comment 55•14 years ago
|
||
** PRODUCT DRIVERS PLEASE NOTE **
This bug is one of 7 automatically changed from blocking2.0:final+ to blocking2.0:.x during the endgame of Firefox 4 for the following reasons:
- it was marked as a soft blocking issue without a requirement for beta coverage
blocking2.0: final+ → .x+
Comment 56•14 years ago
|
||
(In reply to comment #54)
> Alex, are you still seeing this? We found one leak related to FAYT and fixed
> it. I'm not sure I'm seeing this anymore. (I'm not sure I'm NOT seeing it,
> either.)
As best I can tell (from having the browser open all day with GMail open as an app tab), this bug is alive and well.
Blocks: mlk-fx5+
Reporter | ||
Comment 57•14 years ago
|
||
I agree there is some kind of GMail-related memory issue. Right now I have a session where with nothing but GMail open, I have a 1GB working set.
Reporter | ||
Comment 58•14 years ago
|
||
about:memory:
win32/privatebytes930,041,856
win32/workingset950,538,240
js/gc-heap112,197,632
js/string-data5,179,774
js/mjit-code15,819,988
storage/sqlite/pagecache52,659,808
storage/sqlite/other1,674,216
gfx/d2d/surfacecache778,092
gfx/d2d/surfacevram17,437,844
images/chrome/used/raw0
images/chrome/used/uncompressed588,916
images/chrome/unused/raw0
images/chrome/unused/uncompressed0
images/content/used/raw230,721
images/content/used/uncompressed199,740
images/content/unused/raw6,082
images/content/unused/uncompressed9,648
layout/all5,427,542
layout/bidi652
gfx/surface/image876,292
gfx/surface/win320
content/canvas/2d_pixel_bytes230,400
shmem/allocated1,765,376
shmem/mapped1,765,376
CC dump:
http://people.mozilla.org/~roc/gmail-cc-edges.log.bz2
Peterv, are the tools you use to analyze that log available anywhere? I'm not sure what to do with it.
I grabbed a full crash dump as well.
Comment 59•14 years ago
|
||
I've been running gmail in a new profile in a single tab for a few days, and I'm trying to see if I can use the OS X leaks tool to determine what's happening here. The first real leak that I've found in this effort is bug 644070.
Reporter | ||
Comment 60•14 years ago
|
||
Good on you!
I'm writing a tool to analyze Windows crashdumps and figure out as much as possible about how the memory is being used.
Comment 61•14 years ago
|
||
Also, whatever is the cause of the major leak here is not reported in about:memory. Currently, Firefox is using 1.05GB memory on my system, and about:memory shows ~150MB committed memory. The number 150MB has been stable for me for the past few days.
Comment 62•14 years ago
|
||
I think I've exhausted all of the things that the leaks tool could help me find here. I ended up excluding quite a bit of stuff because of masked pointers, and such:
leaks -exclude GetPropertyTreeChild -exclude js_NewRegExp -exclude InitExnPrivate -exclude 'CSSParserImpl::ParseStyleAttribute(nsAString_internal const&, nsIURI*, nsIURI*, nsIPrincipal*, nsICSSStyleRule**)' -exclude 'nsAttrValue::ParseAtomArray(nsAString_internal const&)' -exclude define_JavaPackage -exclude 'nsAttrValue::EnsureEmptyMiscContainer()' -exclude 'nsRuleNode::ConvertChildrenToHash()' -exclude 'nsRuleNode::Sweep()' -exclude 'nsXULDocument::AddBroadcastListenerFor(nsIDOMElement*, nsIDOMElement*, nsAString_internal const&)' -exclude 'nsRuleNode::Transition(nsIStyleRule*, unsigned char, unsigned char)' -exclude ProcessCharSet -exclude 'js::PropertyTree::insertChild(JSContext*, js::Shape*, js::Shape*)' -exclude 'nsXBLProtoImplMethod::AppendBodyText(nsAString_internal const&)' -exclude 'nsCSSExpandedDataBlock::Compress(nsCSSCompressedDataBlock**, nsCSSCompressedDataBlock**)' -exclude 'nsHtml5Portability::newStringFromBuffer(unsigned short*, int, int)' -exclude 'gfxMacFont::CreatePlatformShaper()' -exclude 'PICLinker::init(JSContext*)' -exclude 'js::PropertyTree::removeChild(js::Shape*)' -exclude 'NewArguments(JSContext*, JSObject*, unsigned int, JSObject&)' -exclude 'nsAlertsService::Init()' -exclude 'nsXULPrototypeElement::SetAttrAt(unsigned int, nsAString_internal const&, nsIURI*)' -exclude 'nsStyledElement::ParseStyleAttribute(nsAString_internal const&, nsAttrValue&, int)' -exclude 'CSSStyleRuleImpl::DeclarationChanged(mozilla::css::Declaration*, int)' -exclude 'nsIDOMCSS2Properties_Set(JSContext*, JSObject*, long, int, nsCSSProperty, unsigned long long*)' -exclude 'TryToStartImageLoad(nsCSSValue const&, nsIDocument*, nsCSSProperty)' -exclude 'nsIDOMCSSStyleDeclaration_SetCssText(JSContext*, JSObject*, long, int, unsigned long long*)' -exclude 'nsAttrValue::ToString(nsAString_internal&) const' -exclude 'nsAttrValue::ToString(nsAString_internal&) const' -exclude 'js_NewFlatClosure(JSContext*, JSFunction*, JSOp, unsigned long)' -exclude 'nsXBLProtoImplMethod::SetLineNumber(unsigned int)' -exclude 'nsXBLProtoImplMethod::AddParameter(nsAString_internal const&)' -exclude TSMCurrentKeyboardInputSourceRefCreate -exclude 'nsCheapStringSet::Put(nsAString_internal const&)' 18569 > mlk
Process 18569: 17707 leaks for 1042896 total leaked bytes.
We're dealing with a cycle here for sure. That's my conclusion for the day.
Assignee | ||
Comment 63•14 years ago
|
||
(In reply to comment #58)
> Peterv, are the tools you use to analyze that log available anywhere?
Yup, in bug 466157. We've got someone to work on better tools, so we should soon have documentation (and hopefully even better tools).
What I've done so far is first reverse the edges using attachment 468816 [details]:
./reverse-edges.pl gmail-cc-edges.log > out-gmail-cc-edges.log
and then look for interesting stuff using attachment 468818 [details]:
./find-roots.pl out-gmail-cc-edges.log google
which gives:
0x2cb77708 nsDocument (xhtml) https://mail.google.com/mail/?ui=2&view=bsp&ver=ohhl
0x10ece780 nsDocument (xhtml) https://mail.google.com/mail/?ui=2&view=btop&ver=1s4
0x34a78880 nsDocument (xhtml) https://mail.google.com/mail/?ui=2&view=bsp&ver=ohhl
...
and
./find-roots.pl out-gmail-cc-edges.log 0x2cb77708
which gives:
0x2cb77708 [nsDocument (xhtml) https://mail.google.com/mail/?ui=2&view=bsp&ver=ohhl]
0x192ecdc8 [JS Object (Range) (global=305d2ed8)]
--[xpc_GetJSPrivate(obj)]-> 0x33fadea0 [XPCWrappedNative (Range)]
--[]-> 0x27ec6f28 [nsRange]
--[mRoot]-> 0x2cb77708 [nsDocument (xhtml) https://mail.google.com/mail/?ui=2&view=bsp&ver=ohhl]
known edges:
0x0f0189d8 JS Object (Object) (global=fdb58c0) --[]--> 0x192ecdc8
so looks like more problems with ranges.
(One can also do |./find-roots.pl out-gmail-cc-edges.log all google| which will look at all those nsDocuments, but that'll take forever).
We are missing some information though, the problem is that for black JS objects there's no good way to know if they are a root or just marked by a root. So for object 0x0f0189d8 the tool thinks it's a root, which might be true or it might just have a root as an ancestor in the graph. I'm trying to change the tool so it'll at least keep going backwards if there are edges into the black JS object.
Assignee | ||
Comment 64•14 years ago
|
||
This is a log for 0x2cb77708 taking into account all incoming edges into JS objects. It is a bit overwhelming though. Some interesting stuff, a number of live timeouts keeping that object alive. Also:
0x11c73c08 [JS Object (XMLHttpRequest) (global=fdb58c0)]
--[]-> 0x0fdb58c0 [JS Object (Window) (global=fdb58c0)]
...
--[mRoot]-> 0x2cb77708 [nsDocument (xhtml) https://mail.google.com/mail/?ui=2&view=bsp&ver=ohhl]
known edges:
JS objects with no known incoming edges are either on the stack or kept alive by traditional JS rooting.
Attachment #465141 -
Attachment is obsolete: true
Attachment #465533 -
Attachment is obsolete: true
Reporter | ||
Comment 65•14 years ago
|
||
The range's start and end nodes are the body element of the 0x2cb77708 document, so we can't just cut that edge somehow.
Reporter | ||
Comment 66•14 years ago
|
||
The window 0x0fdb58c0 is https://mail.google.com/mail/?ui=2&view=js&name=main, so I think that definitely should be alive. The question is, why is it keeping that Range alive?
All paths to the Range go through the object 19D45C60. Too bad the CC-log doesn't label the edges coming out of JS objects... (can we fix that?) I'll see what I can figure out.
Reporter | ||
Comment 67•14 years ago
|
||
The edge labels are
--[]-> 0x19d45c60 [JS Object (Object) (global=fdb58c0)]
--["Ca"]-> 0x0f018a20 [JS Object (Object) (global=fdb58c0)]
--["xa"]-> 0x0f0189d8 [JS Object (Object) (global=fdb58c0)]
--["aa"]-> 0x192ecdc8 [JS Object (Range) (global=305d2ed8)]
Nasty obfuscated names :-(.
I wonder whether, say, Chrome keeps the same set of documents alive. Right now I have a Chrome session with nothing but GMail, and the Chrome dev tools report 23 extant HTMLDocuments, and 11 Range objects. I wonder if in fact GMail just leaks, but each leaked document costs us more for some reason.
Comment 68•14 years ago
|
||
I can reproduce this in GMail with a google chat window open and the cursor blinking inside of it.
blocking2.0: .x+ → ---
OS: Mac OS X → Windows 7
Comment 69•14 years ago
|
||
Did you mean to remove the blocking flag? :-)
Reporter | ||
Comment 70•14 years ago
|
||
Google people agree that it looks like a GMail bug. I'm trying to help them narrow it down.
Reporter | ||
Comment 71•14 years ago
|
||
We narrowed it down to a bug in GMail; when an email editing widget is dismissed, some stuff isn't unlinked from a global object that should be. It's up to them to take it from here.
I'd still like to figure out why we leak so much more memory than Chrome though in the same situation.
Comment 72•14 years ago
|
||
As to the additional memory usage, there are probably other bugs, like Bug 645633
Comment 73•14 years ago
|
||
Maybe in chrome, keeping a range alive doesn't keep the whole document alive? (Fixing bug 335998 will make this sort of problem even more common in Gecko.)
Reporter | ||
Comment 74•14 years ago
|
||
No, I confirmed that Chrome was leaking documents. The documents that are leaked are small too.
Comment 75•14 years ago
|
||
roc, should we close this or leave it open until Google's fix is seen and checked?
Reporter | ||
Comment 76•14 years ago
|
||
The latter.
Reporter | ||
Comment 78•12 years ago
|
||
Not currently.
Status: ASSIGNED → RESOLVED
Closed: 14 years ago → 12 years ago
Flags: needinfo?(roc)
Resolution: --- → WORKSFORME
Updated•6 years ago
|
Component: DOM → DOM: Core & HTML
You need to log in
before you can comment on or make changes to this bug.
Description
•