Closed
Bug 391318
Opened 17 years ago
Closed 16 years ago
Leak EM-related stuff after bug 382963
Categories
(SeaMonkey :: UI Design, defect)
Tracking
(Not tracked)
VERIFIED
FIXED
seamonkey2.0a3
People
(Reporter: ajschult784, Assigned: kairo)
References
(Blocks 1 open bug)
Details
(Keywords: memory-leak, regression, Whiteboard: [Workaround: comment 8])
Attachments
(3 files, 2 obsolete files)
(deleted),
text/plain
|
Details | |
(deleted),
text/plain
|
Details | |
(deleted),
patch
|
neil
:
review+
neil
:
superreview+
|
Details | Diff | Splinter Review |
From bug 382963,
comment 14:
with this landed, [I] see 5616 bytes leaked on bloatcycle.html
comment 16:
We now load EM on startup, so if it leaks, then so do we. File a bug on EM?
--------------------------------
Further investigation shows that
Before bug 382963, loading the EM (Tools->Add-on Manager) did not cause a leak.
Firefox leaks nothing whether I opened the Add-on Manager or not.
If I load the Add-on Manager in SeaMonkey now, I leak ~240K.
It seems that the change from 382963 is keeping something alive.
Comment 1•17 years ago
|
||
> It seems that the change from 382963 is keeping something alive.
From looking at the leak stats, could this be the RDF Service/Datasources?
> Firefox leaks nothing whether I opened the Add-on Manager or not.
Did you try leaving closing the Add-on manager until last? That way the RDF datasource will hopefully be freed/closed in a similar way as when we shut the browser down.
Reporter | ||
Comment 2•17 years ago
|
||
> Did you try leaving closing the Add-on manager until last? That way the RDF
> datasource will hopefully be freed/closed in a similar way as when we shut the
> browser down.
That also does not leak in firefox.
Comment 3•17 years ago
|
||
If you start with a non-browser window, open EM, open browser, does that leak?
Assignee | ||
Comment 4•17 years ago
|
||
After the followup fix for bug 383116, we seem to leak even more, at least from what the nye test results tell us.
Reporter | ||
Comment 5•17 years ago
|
||
> If you start with a non-browser window, open EM, open browser, does that leak?
No. Opening addressbook, then EM, then browser, I don't get a leak.
Comment 6•17 years ago
|
||
Please can you try to see which combinations (if any) of these statements leak:
(using ./seamonkey -silent -jsconsole)
top.em=Components.classes['@mozilla.org/extensions/manager;1'].getService(Components.interfaces.nsIExtensionManager);
top.ds=Components.classes['@mozilla.org/rdf/rdf-service;1'].getService(Components.interfaces.nsIRDFService).GetDataSource('rdf:extensions');
If they leak, do they leak in Firefox and/or Thunderbird too? Thanks.
Reporter | ||
Comment 7•17 years ago
|
||
> Please can you try to see which combinations (if any) of these
> statements leak:
no leak in seamonkey or firefox
Comment 8•17 years ago
|
||
Curiouser and curiouser.
If you open the browser, then open the JS console, then evaluate this, does that still leak?
this.ds=top.opener.document.getElementsByAttribute('datasources','rdf:extensions')[0].database;while(this.ds.GetDataSources().hasMoreElements())this.ds.RemoveDataSource(this.ds.GetDataSources().getNext())
If you change datasources="rdf:extensions" to datasources="rdf:local-store" in navigatorOverlay.xul does that still leak? (Unlikely)
Comment 9•17 years ago
|
||
I saw a random comment on another bug that mentioned datasources, so I don't know if its anything to do with this problem or not but just in case:
https://bugzilla.mozilla.org/show_bug.cgi?id=392959#c1
Reporter | ||
Comment 10•17 years ago
|
||
> If you open the browser, then open the JS console, then evaluate
> this, does that still leak?
No
> If you change datasources="rdf:extensions" to datasources="rdf:local-
> store" in navigatorOverlay.xul does that still leak? (Unlikely)
No
Comment 11•17 years ago
|
||
(In reply to comment #10)
>>If you open the browser, then open the JS console, then evaluate
>>this, does that still leak?
>No
Interesting that manually removing the EM datasource from the database fixes the leak; I thought that the composite datasource specifically participates in cycle collection to avoid leaks!
Comment 12•17 years ago
|
||
(In reply to comment #11)
> Interesting that manually removing the EM datasource from the database fixes
> the leak; I thought that the composite datasource specifically participates in
> cycle collection to avoid leaks!
Just because an object participates in cycle collection doesn't mean that any cycle involving it will be collected. A cycle will only be collected if all objects that (directly or indirectly) hold a strong reference to an object in the cycle participate in cycle collection themselves. You can always manually break a cycle that the cycle collector can't collect, but you want to look for references that the cycle collector doesn't know about to really fix this.
Comment 13•17 years ago
|
||
(In reply to comment #12)
>Just because an object participates in cycle collection doesn't mean that any
>cycle involving it will be collected. A cycle will only be collected if all
>objects that (directly or indirectly) hold a strong reference to an object in
>the cycle participate in cycle collection themselves. You can always manually
>break a cycle that the cycle collector can't collect, but you want to look for
>references that the cycle collector doesn't know about to really fix this.
Then why should breaking a reference between two cycle-collected objects work?
Comment 14•17 years ago
|
||
(In reply to comment #13)
> Then why should breaking a reference between two cycle-collected objects work?
Not really sure what you mean, breaking a cycle usually makes all the objects in the cycle go away, whether they're participating in cycle collection or not.
Comment 15•17 years ago
|
||
(In reply to comment #14)
>(In reply to comment #13)
>>Then why should breaking a reference between two cycle-collected objects work?
>Not really sure what you mean, breaking a cycle usually makes all the objects
>in the cycle go away, whether they're participating in cycle collection or not.
Well, we have an RDF template that uses the rdf:extensions data source. This apparently leaks 5616 bytes on on shutdown, however if we remove the data source from the template database and quit then we don't leak those bytes.
Comment 16•17 years ago
|
||
(In reply to comment #15)
> Well, we have an RDF template that uses the rdf:extensions data source. This
> apparently leaks 5616 bytes on on shutdown, however if we remove the data
> source from the template database and quit then we don't leak those bytes.
So something else is holding on to an object in that cycle. Could be a second cycle through some objects unknown to the GC, or just an edge into the cycle that is not part of a cycle itself.
Comment 17•17 years ago
|
||
OK, so I spent some fruitless time trying to track down this leak.
When starting up with the browser window and shutting down, the data source has a refcount of 2 (including 1 in case of weak references) before shutdown.
When starting up with the browser window, disconnecting the composite DS, and shutting down, the data source is deleted on shutdown.
When starting up with the JS console, opening EM, opening browser, the data source has a refcount of 4 before shutdown!
Comment 18•17 years ago
|
||
So, during a straight browser startup/shutdown, there are three references to the extensions datasource:
* one from the JS component loader
* one from the service manager
* one from the composite data source
The first two references get removed during shutdown so there are no uncollectable references to the extensions datasource. I haven't looked for uncollectable references to the composite data source as yet.
Comment 19•17 years ago
|
||
OK, so on shudown the RDF composite data source has two outstanding references. I'm assuming one here is the EM datasource; the other appears to be the local store, which unfortunately isn't cycle collected yet...
Comment 20•17 years ago
|
||
OK, so it's not the local store; that doesn't support RDF observers.
Comment 21•17 years ago
|
||
Having been unable to find any non-JS references to the EM DS, I tried looking at the composite DS instead.
The composite DS is created as part of creating the template builder. When the composite DS is created, it obviously only has the one reference (nsXULTemplateQueryProcessorRDF::GetDatasource::compDB). We then call compDB->AddDataSource(ds); (where ds is the EM DS). compDB calls aDataSource->AddObserver(this); which passes the compDB info xpconnect (since the EM DS is a JS object). It gets QI'd into XPCWrappedNative::GetNewOrUsed::identity (1+1=2). A new XPCWrappedNative is created (2+1=3). It gets QI'd by XPCWrappedNative::InitTearOff (3+1=4). XPCWrappedNative::GetNewOrUsed::identity gets destroyed (4-1=3). The EM DS then calls this._inner.AddObserver(observer), which makes XPCConvert::JSObject2NativeInterface QI it (3+1=4). _inner is an XML DS, which passes the compDS to its mInner, which is a memory DS, which adds compDS to its mObservers nsCOMArray (4+1=5). XPCWrappedNative::CallMethod then releases a ref to balance XPCConvert::JSObject2NativeInterface (5-1=4). nsXULTemplateQueryProcessorRDF::GetDatasource then copies compDS to its db local, since there's no infer data source (4+1=5), then QI's it to its out parameter (5+1=6), then releases its locals (-2=4). nsXULTemplateBuilder::LoadDataSourceUrls (which owns that out parameter as its mDataSource nsCOMPtr) then copies that to its mCompDB and mDB nsCOMPtrs (4+2=6). The nsXULDocument then checks the hookup a couple of times, but that doesn't effectively alter the reference count.
After the document load, frames are constructed and script is evaluated (the fields evaluate at an unsafe time bug). This triggers a GC, which sweeps the XPCWrappedNativeTearOff created earlier (-1=5).
When the window is closed, the cycle collector collects the nsXULTemplateBuilder object thus releasing its three references (-3=2).
Summary of references:
1. nsXULTemplateQueryProcessorRDF::GetDatasource::compDB
Local nsCOMPtr, added by do_CreateInstance, relased on return
2. XPCWrappedNative::GetNewOrUsed::identity, local nsCOMPtr, added by
do_QueryInterface, released on return
3. XPCWrappedNative::mIdentity, member raw ptr, added by
XPCWrappedNative::XPCWrappedNative
4. XPCWrappedNativeTearOff::mNative, member raw ptr, previously added by
QueryInterface in XPCWrappedNative::InitTearOff, released by GC
5. dummy reference, added by XPCConvert::JSObject2NativeInterface, released by
XPCWrappedNative::CallMethod
6. InMemoryDataSource::mObservers, member nsCOMArray, added by AddObserver
7. nsXULTemplateQueryProcessorRDF::GetDatasource::db, local nsCOMPtr, added by
assignment, relased on return
8. nsXULTemplateBuilder::mDataSource, member nsCOMPtr, added by outparam,
released by Unlink
9. nsXULTemplateBuilder::mCompDB, member nsCOMPtr, added by do_QueryInterface,
released by Unlink
10.nsXULTemplateBuilder::mDB, member nsCOMPtr, added by do_QueryInterface,
released by Unlink
Note that references 3 and 6 are still outstanding on shutdown.
Cycle for reference 3:
The composite DS's mObservers[1] refers to a (wrapped) JS object that contains an array that contains a wrapped native for the composite DS.
Cycle for reference 6:
The composite DS's mObservers[2] refers to a (wrapped) JS object that refers to an RDFXMLDataSource that refers to an InMemoryDataSource that contains an nsCOMArray that contains a reference to the composite DS.
Comment 22•17 years ago
|
||
Do you know why we don't collect those cycles? I think all those objects participate and the edges you mention are known to the cycle collector, so it must be something else holding onto one of the objects in one of those 2 cycles. Debugging this will be painful because by definition we're looking for an edge, into one of a number of objects, that the cycle collector doesn't know about. I think you could try calling nsCycleCollector_DEBUG_shouldBeFreed on the composite DS after the cycle collector has unlinked the nsXULTemplateBuilder and look at the debugging output.
Comment 23•17 years ago
|
||
(In reply to comment #22)
>Do you know why we don't collect those cycles?
ajschult, are you quite sure the EM itself doesn't leak? It's the only thing holding on to the composite DS.
Reporter | ||
Comment 24•17 years ago
|
||
> ajschult, are you quite sure the EM itself doesn't leak? It's the only thing
> holding on to the composite DS.
Do you mean, "Does just opening the EM (without opening a navigator window) cause leaks" or "Is the EM one of the things that gets leaked when just opening a navigator window"?
I'm certain that the answer to the first is "no". I don't know how to diagnose the second. I don't know what it would show up as in the leak log.
Reporter | ||
Comment 25•17 years ago
|
||
OK, I still don't know whether the EM itself is leaked, but I verified that ExtensionManager._shutdown() is called and I added |this._ds = null;| so that it would explicitly drop its reference to the data source (this is what you're referring to as the "composite DS", right?). This did not fix the leak.
Comment 26•17 years ago
|
||
(In reply to comment #25)
>OK, I still don't know whether the EM itself is leaked, but I verified that
>ExtensionManager._shutdown() is called and I added |this._ds = null;| so that
>it would explicitly drop its reference to the data source (this is what you're
>referring to as the "composite DS", right?). This did not fix the leak.
No, that's what I'm referring to as the EM DS.
Comment 27•17 years ago
|
||
So I'm seeing several leaks of in-memory and xml data sources caused by the reference added in RDFServiceImpl::GetDataSourceBlocking never being released. So it's looking to me like some of the JS stuff here leaks and entrains things...
Comment 28•17 years ago
|
||
In particular, doing:
this._ds._inner = null;
this._ds = null;
in _shutdown cleans up some but not all of the leaks. So the ExtensionManager JSObject must be leaking somehow.
This part of the JS heap dump is rather suspicious:
0xafdec800 global_for_XPCJSContextStack_SafeJSContext 8a14fc8 via global object
Comment 29•17 years ago
|
||
I added CTOR/DTOR logging to XPCJSContextStack and we don't seem to leak those. So I'm not sure hwo we leaked a global_for_XPCJSContextStack_SafeJSContext here...
Comment 30•17 years ago
|
||
Is it possible that the last CC run happens before ~XPCJSContextStack?
Comment 31•17 years ago
|
||
And in fact that is EXACTLY what happens. I added printfs to both the mOwnsJSContext case in ~XPCJSContextStack and
nsCycleCollector::Shutdown and one of the former happens after the latter.
And it looks like we kill that stack from ~nsXPConnect, which doesn't happen until CapsModuleDtor calls nsScriptSecurityManager::Shutdown. I wonder whether we leak things through this safe context somehow...
Comment 32•17 years ago
|
||
Changing caps doesn't help, because then ~nsXPConnect happens under nsXPConnect::ReleaseXPConnectSingleton called from xpcModuleDtor.
That's where the GC dump happens too (before the release). But I don't see anything in the JS heap dump that would obviously point to thigns we care about being leaked through the global_for_XPCJSContextStack_SafeJSContext...
Comment 33•17 years ago
|
||
We are leaking an XPCSafeJSObjectWrapper function wrapper through that. Not sure whether that matters.... For certain, but the time we do the shutdown GC heap dump the only JS GC roots left are the nsXPCWrappedJS[nsIRDFDataSource,0x86fbbf0].mJSObj and the global_for_XPCJSContextStack_SafeJSContext.
Assignee | ||
Comment 34•17 years ago
|
||
Do we have any progress on that? It would be nice to have tinderbox leaks numbers return to something reasonable...
Comment 35•17 years ago
|
||
This should probably get moved to a core component, or at least depend on some core bugs. Until we fix the issue in comment 31, not much else we can do here...
Assignee | ||
Comment 36•17 years ago
|
||
I know too little about the details here, but I guess it's best to file comment 31 as a new core bug against CC or so and make this depend on that one. Could you do that?
Comment 37•17 years ago
|
||
Filed bug 406914.
Updated•16 years ago
|
OS: Linux → All
Whiteboard: [Workaround: comment 8]
Comment 38•16 years ago
|
||
[Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.9.1a2pre) Gecko/20080823140037 SeaMonkey/2.0a1pre] (home, optim default) (W2Ksp4)
Ftr, leaks 8557 bytes from 383 objects.
Updated•16 years ago
|
Blocks: SmTestLeak
Comment 39•16 years ago
|
||
[Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.9.1a2pre) Gecko/20080823180458 SeaMonkey/2.0a1pre] (home, debug default) (W2Ksp4)
Ftr, leaks 9369 bytes from 386 objects.
Fwiw,
this is a little worse than the Windows optim build:
+ 1 nsBaseAppShell, + 1 nsRunnable, + 1 nsThread;
(and 'Per-Inst' size is increased a little on some of the objects).
NB: I removed the additional debug lines from the report.
Updated•16 years ago
|
Blocks: SmTestFail
Updated•16 years ago
|
No longer blocks: SmTestFail
Updated•16 years ago
|
Assignee: jag → nobody
QA Contact: ui-design
Comment 40•16 years ago
|
||
Maybe trivial cases from bug 457440 could give a hint for this bug ?
Assignee | ||
Comment 41•16 years ago
|
||
I don't think so, as that's just a symptom. We already know we're leaking the full EM RDF, so anything that changes that info stored by EM in size changes that size of the leak, but we need to break the cycle or whatever it is that causes it in the first place.
Assignee | ||
Comment 43•16 years ago
|
||
This bug is so disturbing all our leak testing :(
There must be a solution how we can break this cycle when shutting down, i.e. when we don't need to use the EM RDF any more. We can put a listener for app shutdown into nsSuiteGlue and remove the connection to that datasource in some way from there, would that help?
The other possibility is to rework the menu and build it with JS instead of RDF templates, which avoids the problem altogether.
Who would be able to work on either of those solutions, given they work?
Comment 44•16 years ago
|
||
I attached a patch and tool in bug 466157 that helps me in finding the objects that have missing edges for cycle collection. Need to apply the patch and build with DEBUG_CC. Don't know if it'll help.
Comment 45•16 years ago
|
||
[Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.9.2a1pre) Gecko/20081217 SeaMonkey/2.0a3pre] (home, optim default) (W2Ksp4)
(http://hg.mozilla.org/mozilla-central/rev/4a4a42520901
+http://hg.mozilla.org/comm-central/rev/877154dd96a2 + bug 469606 patch)
I just noticed that
bug 465556 reduced this leak by 7 kB on all 3 platforms :-)
mochitest: 22.5 -> 15.5 kB
chrome : 8.5 -> 1.5 kB
browser : 8.5 -> 1.5 kB
(See bug 450983 where I will add more detailed figures.)
Attachment #335181 -
Attachment is obsolete: true
Comment 46•16 years ago
|
||
(In reply to comment #43)
> This bug is so disturbing all our leak testing :(
>
> There must be a solution how we can break this cycle when shutting down, i.e.
> when we don't need to use the EM RDF any more. We can put a listener for app
> shutdown into nsSuiteGlue and remove the connection to that datasource in some
> way from there, would that help?
> The other possibility is to rework the menu and build it with JS instead of RDF
> templates, which avoids the problem altogether.
>
> Who would be able to work on either of those solutions, given they work?
Is there a reason you can't use the code from comment 8 in an unload handler for any window using the theme menu?
Assignee | ||
Comment 47•16 years ago
|
||
This patch reduces leaks from a mochitest chrome run by 1460 bytes on my machine (Linux debug, self-built), and no RDF stuff is left over, so I guess it really fixes that one (what I have left locally seems to be related to some plugin things, at least to a part).
This is basically what comment #8 had, but giving the element an explicit ID so that calling it more more straight-forward.
Assignee: nobody → kairo
Status: NEW → ASSIGNED
Attachment #353909 -
Flags: superreview?(neil)
Attachment #353909 -
Flags: review?(neil)
Comment 48•16 years ago
|
||
Comment on attachment 353909 [details] [diff] [review]
remove datasource on unload
>+ while(extDS.GetDataSources().hasMoreElements())
>+ extDS.RemoveDataSource(extDS.GetDataSources().getNext())
My quick-and-dirty code wasn't well written - you should of course only call GetDataSources once and there should be a space after the while.
Another possibility that comes to mind is to set the datasources attribute to "rdf:null" which should also clear out the database.
Assignee | ||
Comment 49•16 years ago
|
||
This patch should fix Neil's comments. The alternative variant seems not to work for me.
Attachment #353909 -
Attachment is obsolete: true
Attachment #353917 -
Flags: superreview?(neil)
Attachment #353917 -
Flags: review?(neil)
Attachment #353909 -
Flags: superreview?(neil)
Attachment #353909 -
Flags: review?(neil)
Updated•16 years ago
|
Attachment #353917 -
Flags: superreview?(neil)
Attachment #353917 -
Flags: superreview+
Attachment #353917 -
Flags: review?(neil)
Attachment #353917 -
Flags: review+
Comment 50•16 years ago
|
||
Comment on attachment 353917 [details] [diff] [review]
remove datasource on unload, v1.1
>+ extDB.RemoveDataSource(extDS.getNext())
Nit: missing trailing semicolon
Assignee | ||
Comment 51•16 years ago
|
||
Pushed as http://hg.mozilla.org/comm-central/rev/a36eb2e47cb2 - we should not report this leak any more!
Status: ASSIGNED → RESOLVED
Closed: 16 years ago
Flags: wanted-seamonkey2?
Resolution: --- → FIXED
Comment 52•16 years ago
|
||
I filed bug 470709 and bug 470710 about the leaks that comment 8 workaround hides but that the patch did not fix.
V.Fixed.
Status: RESOLVED → VERIFIED
Target Milestone: --- → seamonkey2.0a3
Comment 53•16 years ago
|
||
(In reply to comment #52)
> I filed bug 470709 and bug 470710 about the leaks that comment 8 workaround
> hides but that the patch did not fix.
Er, actually I don't know if that's true for MacOSX bug 470710...
You need to log in
before you can comment on or make changes to this bug.
Description
•