Closed Bug 60729 Opened 24 years ago Closed 14 years ago

Investigate odd cycle involving js objects causing leak of nsBookmarksService and InternetSearchService

Categories

(SeaMonkey :: Bookmarks & History, defect, P4)

defect

Tracking

(Not tracked)

RESOLVED FIXED
Future

People

(Reporter: alecf, Assigned: jag+mozilla)

Details

(Keywords: memory-leak, regression)

Attachments

(3 files)

When fixing the orange build on friday, I had to change the mTimer members of these classes to be member variables, since they are COMPtrs...in doing so, I somehow seem to have introduced a leak. I've done some playing with the refcount balancer, but I have had no luck figuring out how to fix this. it looks like there may be a number of objects holding references to the bookmarks service. the most obvious one is the "mInner" member, which is an in-memory datasource... we do some crazy stuff in nsBookmarksService::Release() to make sure to release the inner early, but it doesn't look like that's ever being hit. I'm adding waterson to the CC because the output from find-leaks.pl lists nsBookmarkService as having one extra reference, but make-tree.pl always has nsBookmarksService with 4 refs... I'm not really sure what's going on..
I wrote on n.p.m.builds this weekend: I don't understand how changing the static nsCOMPtr to a member made the leak stats go up. Static nsCOMPtrs aren't released until after the leak stats are printed, so I don't see how it would cause fewer releases (before the leak stats). What other change made this an issue? Or am I missing something?
Keywords: mlk, regression
bug 60754 was just filed. Do bookmarks work at all anymore?
that other bug sounds like some bad JS, which probably has nothing to do with this.. dbaron: I didn't know that. Maybe it's just coincidence that the 2 leaks started that day? or maybe we didn't previously instantiate these services, but we are today?
alecf: What was the purpose of your changes on Friday? What was the issue with timers that required the changes? FWIW: The previously static nsCOMPtr<> mTimer is cancelled/cleared in nsBookmarkService's destructor.
The reason for the changes was that Linux went orange -- it was crashing on shutdown. (One possible explanation for such a crash could be that someone started leaking the bookmark service so that the static nsCOMPtr wasn't cleared by the dtor, but was instead cleared on the library unload after something it pointed to was no longer valid. But I'm just guessing...)
dbaron: That sounds like a fair guess... which means that the checkins right before the orange began should be examined (for things like XUL document leaks.)
I think this is what was happening before: - the bookmarks service was never getting destroyed, because something was holding a reference to it. - the static nsCOMPtr<> was getting destroyed, and thus releasing the timer, which might have caused the bookmarks service to get destroyed. The only thing that's odd about that was that the bookmarks service should have been reported as a leak, anyway....:(
Hmmm... the timer doesn't hold an owning reference to the bookmarks service.
No, I don't think the timer was holding the bookmarks service, and there's no need for the release of the time to trigger the release of the bookmarks service to do something that shouldn't happen. (What exactly was it that crashed?) So what's causing this nsXULDocument not to be garbage collected? Which leaked JS root is the root of the problem? (And when did the problem start?) It's interetsing to note that the leak stats *were* working while tinderbox was orange, since the crash was after they were printed. (I wish it showed them in spite of being orange!) And the leak was occurring during the orange. So probably the checkin that caused the leak and the one that caused the orange were the same.
Spot the cycle :-)
r=dbaron
And have a=alecf, will check in as soon as the tree opens
reassign to jag so he gets credit :)
Assignee: rjc → disttsc
Can someone diagram the cycle? /be
alecf: hey, some credit vs. most of the blame ;-) brendan: I thought there was a cycle here, but now I'm not so sure anymore. It definitely shows up when storing the |bar| element in the Browser object's |barEl| attribute. So what I think was going on: BrowserInstance has a reference to the XULBrowser object created at onload, this object has a reference to the urlbar element and thus(?) the document, which holds the global JS var appCore which is the BrowserInstance we started with. But like I said, I'm not so sure anymore.
Here's a guess at part of the cycle: The piece jag created is that the XULBrowser object has a pointer in JS to an element of the XUL document (the document in which the script is executing), which keeps the document's script object, and thus the document alive. I'm not sure of the rest, but I imagine something in the document is pointing to the XULBrowser object. Are there good ways to debug such things (to figure out if it's something we could clean up in GlobalWindowImpl::CleanUp)?
Fix was checked in btw. Keeping this bug open pending open questions.
Netscape Nav Triage Team: marking fixed
Status: NEW → RESOLVED
Closed: 24 years ago
Resolution: --- → FIXED
Reopening, since there are unresolved questions. Why is the Nav triage team touching the bugs of non-Netscape employees, anyway?
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
We should test if this still creates a leak if you insert the same code now (after hyatt's XBL fix).
Yeah, it still leaks. I've reduced it to the following two attachments. Removing the last rule from xul.css stops the leaking. Replace chrome/toolkit/content/global/xul.css with this one and put nav.xul in chrome/comm/content/navigator/, then start with mozilla -chrome chrome://navigator/content/nav.xul Please make sure you're not working with a JAR build, or otherwise edit installed-chrome.txt and remove all "jar:" and ".jar!".
Attached file [testcase] nav.xul (deleted) —
Attached file [testcase] xul.css (deleted) —
So if I take this nav.xul, and the original xul.css, and remove these two bits, the leaking stops: Line 27: *[collapsed="true"], Line: 497: -moz-binding: url("chrome://global/content/xulBindings.xml#slider"); It doesn't stop the leaking for the complete navigator.xul + rest though.
How on earth did you decide to remove THOSE? I can't figure out how they would be relevant :(
You just reduce your testcase to where it's so small that removing anything else will stop the problem from appearing. That led me to these things.
Also see related bug 65377.
Assignee: jag → jaggernaut
Status: REOPENED → NEW
->moz1.0
Target Milestone: --- → mozilla1.0
Blocks: 92580
No longer blocks: 92580
p2/097
Priority: P3 → P2
Target Milestone: mozilla1.0 → mozilla0.9.7
Since we're not leaking currently, and know what link was completing the cycle (and thus how to avoid it), I'm moving this to future, since there is still a theoretical interest (mine) to see if this cycle has been broken elsewhere in the mean time.
Severity: critical → normal
Summary: Leaking nsBookmarksService and InternetSearchService → Investigate odd cycle involving js objects causing leak of nsBookmarksService and InternetSearchService
Target Milestone: mozilla0.9.7 → Future
Priority: P2 → P4
filter keyword: nothingnessless
Product: Browser → Seamonkey
Attachment #19525 - Attachment description: [patch] Oops, introduced a cycle... Fix → [patch] Oops, introduced a cycle... Fix [Checkin: Comment 19]
QA Contact: claudius → bookmarks
This bug is being marked EXPIRED as it has seen no activity in a very long time. If you think that the issue reported might still be relevant, please test with a recent release of SeaMonkey and if the problem persists feel free to re-open the report. Thank you. http://www.seamonkey-project.org/
Status: NEW → RESOLVED
Closed: 24 years ago16 years ago
Resolution: --- → EXPIRED
Bulk reopening incorrectly expired bugs - no activity does not constitute no bug - these need proper checking.
Status: RESOLVED → REOPENED
Resolution: EXPIRED → ---
I'm going to close this since the original issue was fixed. Any additional leaks should be filed as new bugs.
Status: REOPENED → RESOLVED
Closed: 16 years ago14 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: