Closed Bug 368773 Opened 18 years ago Closed 18 years ago

Add a bunch of DOM objects to cycle collection

Categories

(Core :: DOM: Core & HTML, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED

People

(Reporter: peterv, Assigned: peterv)

References

Details

Attachments

(5 files, 4 obsolete files)

I focused mainly on objects that keep nodes, JS contexts and global objects alive. Still need to figure out unlinking for a lot of these, and probably provide a macro for derived cycle collector helper classes.
Attached patch v1 (obsolete) (deleted) — Splinter Review
Attached patch v1.1 (obsolete) (deleted) — Splinter Review
Applies on top of the patch for bug 368549.
Attachment #253427 - Attachment is obsolete: true
Blocks: 348156
Attached patch v1.2 (obsolete) (deleted) — Splinter Review
Especially interested in jst's review of the changes in nsGlobalWindow. Note that I often didn't implement unlink in the element classes because we will remove children in nsGenericElement::Unlink which will end up removing the descendants from relevant arrays/lists in the derived element class. In other cases I'm being a bit conservative.
Attachment #254039 - Attachment is obsolete: true
Attachment #255717 - Flags: superreview?(jst)
Attachment #255717 - Flags: review?(jonas)
Comment on attachment 255717 [details] [diff] [review]
v1.2

sr=jst for the dom/ changes, and I skimmed over the rest too and that looks fine to me as well.
Attachment #255717 - Flags: superreview?(jst) → superreview+
How close does this get us to having the leak stats go back down to where they were before the cycle collector landed?
(In reply to comment #5)
> How close does this get us to having the leak stats go back down to where they
> were before the cycle collector landed?
> 

Doesn't make a dent. Or rather: doesn't make *the* dent that closes the cycle and causes all the leaks to go away. Certainly traverses more of the graph.
(In reply to comment #6)
> Doesn't make a dent. Or rather: doesn't make *the* dent that closes the cycle
> and causes all the leaks to go away. Certainly traverses more of the graph.

What do you call a dent? Here's what I see with just this patch in a debug build on OS X, starting up and quitting:

          |<----------------Class--------------->|<-----Bytes------>|
                                              Per-Inst   Leaked 
Without patch:
   0 TOTAL                                          24   672744 
With patch:
   0 TOTAL                                          24   273335 

There will be no magic bullets until we've added most of the DOM objects to the cycle collector and that's what this patch tries to do.
Status: NEW → ASSIGNED
Apologies. I guess my feeling is that we're missing something fundamental, because we're still leaking full windows. I am surprised and a little skeptical that so many edges in the graph would actually be cycle-members; but I guess it's possible. It's very hard to get a perfect picture for why it's not collecting things. I didn't mean to discourage this patch from landing. It's definitely worthwhile.
Given that you need to find all the edges of all the nodes that participate in a cycle, wouldn't you expect many edges? For example, many elements hold an eventlistener manager that holds a JS event listener that holds the element. Simple cycle, but to collect it you need to also find all the edges into the element, which can be a lot. If you fail to release that cycle the JS event listener holds its global object alive, so we leak a window.
I am seeing windows being collected in a build that has this patch and a couple of other changes, but I'm also seeing windows leaking where I have found 430 of the 431 edges. I'm really not sure what fundamental issue we would be missing.
(In reply to comment #9)
> but I'm also seeing windows leaking where I have found 430 of
> the 431 edges. I'm really not sure what fundamental issue we would be missing.

Brute force logging, if not the existing refcnt-balancer stuff, should be able to smoke out the missing edge, shouldn't it?

/be 

Combining refcnt-balancer logs with xpconnect GC logs makes it possible to identify cycle culprits, but it is a pretty arduous task. Dbaron has done it successfully and taught me how once, and I managed to do it but I forget the details.
Yes, that's how I got some of the 430 others.
BTW, that's 431 JS edges into the JS global object for a chrome window, which holds the nsGlobalWindow alive.
For what it's worth (probably not much), the above patch needs very slight merging since bug 371417 landed; the above patch contains the same change that already landed.
Comment on attachment 255717 [details] [diff] [review]
v1.2

>Index: dom/src/base/nsGlobalWindow.cpp

>   // FIXME: somewhere in these commented lines lies a bug that causes
>   // a segfault. So we have disabled them, even though it seems wrong
>   // to do so. Other matters are more pressing at the moment.
> 
>   // Traverse stuff from nsPIDOMWindow
>-  // NS_IMPL_CYCLE_COLLECTION_TRAVERSE_NSCOMPTR(mChromeEventHandler)
>-  // NS_IMPL_CYCLE_COLLECTION_TRAVERSE_NSCOMPTR(mDocument)
>+  NS_IMPL_CYCLE_COLLECTION_TRAVERSE_NSCOMPTR(mChromeEventHandler)
>+  NS_IMPL_CYCLE_COLLECTION_TRAVERSE_NSCOMPTR(mDocument)

You should probably remove the "FIXME" comment given that you fixed it.
Attached patch v1.3 (deleted) — Splinter Review
Here's a patch updated to trunk. I also made a small change to the patch, I removed this chunk:

+      void *global = tmp->mScriptGlobals[NS_STID_INDEX(lang_id)];
+      if (global) {
+        cb.NoteScriptChild(lang_id, global);
+      }

from the traversal code for nsXULPDGlobalObject. We already traverse the context, which will traverse the global object and nsXULPDGlobalObject doesn't keep the global object alive itself, so I think that chunk is wrong.
Attachment #255717 - Attachment is obsolete: true
Attachment #256980 - Flags: superreview+
Attachment #256980 - Flags: review?(jonas)
Attachment #255717 - Flags: review?(jonas)
Comment on attachment 256980 [details] [diff] [review]
v1.3

>Index: content/base/src/nsDocument.cpp
>@@ -948,14 +1018,23 @@ NS_IMPL_CYCLE_COLLECTION_TRAVERSE_BEGIN(
> 
>   // Traverse all nsIDocument pointer members.
>   NS_IMPL_CYCLE_COLLECTION_TRAVERSE_RAWPTR(mBindingManager)
>-  NS_IMPL_CYCLE_COLLECTION_TRAVERSE_NSCOMPTR(mSecurityInfo)
> 
>   // Traverse all nsDocument nsCOMPtrs.
>+  NS_IMPL_CYCLE_COLLECTION_TRAVERSE_NSCOMPTR(mSecurityInfo)

mSecurityInfo still lives in nsIDocument, no?

>Index: content/html/content/src/nsHTMLTextAreaElement.cpp
>+NS_IMPL_CYCLE_COLLECTION_CLASS(nsHTMLTextAreaElement)
>+NS_IMPL_CYCLE_COLLECTION_TRAVERSE_BEGIN_INHERITED(nsHTMLTextAreaElement,
>+                                                  nsGenericHTMLFormElement)
>+  NS_IMPL_CYCLE_COLLECTION_TRAVERSE_NSCOMPTR(mControllers)
>+NS_IMPL_CYCLE_COLLECTION_TRAVERSE_END

Why not unlink mControllers too?

>Index: content/xbl/src/nsBindingManager.cpp
>+nsBindingManager::Traverse(nsIContent *aContent,
>+                           nsCycleCollectionTraversalCallback &cb)
>+{
>+  nsXBLBinding *binding = GetBinding(aContent);
>+  if (binding) {
>+    // XXX nsXBLBinding isn't nsISupports but it is refcounted, so we can't
>+    //     traverse it.
>+    cb.NoteXPCOMChild(aContent);

You could here traverse all objects pointed to by the nsXBLBinding. However we might as well wait until we can traverse native objects.


>+  nsISupports *value;
>+  if (mContentListTable.ops &&
>+      (value = LookupObject(mContentListTable, aContent))) {
>+    cb.NoteXPCOMChild(aContent);
>+    cb.NoteXPCOMChild(value);
>+  }
>+  if (mAnonymousNodesTable.ops &&
>+      (value = LookupObject(mAnonymousNodesTable, aContent))) {
>+    cb.NoteXPCOMChild(aContent);
>+    cb.NoteXPCOMChild(value);
>+  }

To make this effective you also need to add CC code to the nsAnonymousContentList class.


at nsXMLDocument.cpp
(In reply to comment #17)
> mSecurityInfo still lives in nsIDocument, no?

Yes, mistake. I'll undo this.

> Why not unlink mControllers too?

Yes, that looks safe enough. Added.

> You could here traverse all objects pointed to by the nsXBLBinding. However we
> might as well wait until we can traverse native objects.

Since nsXBLBinding is refcounted itself we must traverse it explicitly, not as part of other objects holding onto it. Otherwise we could unlink too soon.

> To make this effective you also need to add CC code to the
> nsAnonymousContentList class.

Doesn't make much sense to do nsAnonymousContentList here, it holds only non-XPCOM objects so part of bug 368774.
Comment on attachment 256980 [details] [diff] [review]
v1.3

>Index: content/xul/templates/src/nsRDFQuery.cpp
>+NS_IMPL_CYCLE_COLLECTION_CLASS(nsRDFQuery)
>+NS_IMPL_CYCLE_COLLECTION_UNLINK_BEGIN(nsRDFQuery)
>+  NS_IMPL_CYCLE_COLLECTION_UNLINK_NSCOMPTR(mQueryNode)
>+NS_IMPL_CYCLE_COLLECTION_UNLINK_END
>+NS_IMPL_CYCLE_COLLECTION_TRAVERSE_BEGIN(nsRDFQuery)
>+  NS_IMPL_CYCLE_COLLECTION_TRAVERSE_NSCOMPTR(mQueryNode)
>+NS_IMPL_CYCLE_COLLECTION_TRAVERSE_END

NS_IMPL_CYCLE_COLLECTION_CLASS_1(mQueryNode) ?

Same thing in nsXULTemplateResultRDF.cpp

>Index: dom/src/base/nsJSEnvironment.cpp

> // QueryInterface implementation for nsJSContext
>+NS_IMPL_CYCLE_COLLECTION_CLASS(nsJSContext)
>+// XXX Should we call ClearScope here?

I'm sort of tempted to say yes here.

It would be great if the cycle collector would assert if it found a cycle but were unable to collect it. That is, if it identified a cycle, but calling unlink doesn't destroy any of the elements.

With that, and assuming jst checked that the DOM parts were ok, r=me
Attachment #256980 - Flags: review?(jonas) → review+
Lets land this asap (would be great if you could do it tomorrow CET). We can look into the ClearScope thing separately.
Attached patch Patch that was checked in (deleted) — Splinter Review
Here's the patch that I checked in (with bustage fixes). Two notable changes in addition to the ones from the review comments:

* Made nsXBLDocGlobalObject not traverse its mJSObject, same as the change for
  nsXULPDGlobalObject (see comment 16)
* nsDOMScriptObjectHolder::traverse was not needed anymore,
  nsDOMScriptObjectHolder is only used as stack-based objects (result of fixing
  bug 370265)

Didn't do the ClearScope change.
This seems to have had no effect on Rlk on fxdbug-linux-tbox, but it did cause a drop in Rlk (366KB->199KB) and Lk (4.57MB->4.00MB) on nye. Not sure what could explain that.
> Not sure what could explain that.

One's Seamonkey, one is Firefox. The Firefox UI has a lot more interdependent and complicated JS, with C++ holding more pointers to JS objects and vice versa.

On other words, Firefox generally has more cycles.  ;)
Though earlier versions of the patch helped Firefox leaks a lot.  Maybe one of the revisions broke something?
Depends on: 373201
Depends on: 373219
In nsXBLResourceLoader.cpp:

    for (i = 0; i < count; ++i) {
      cb.NoteXPCOMChild(tmp->mBoundElements->ElementAt(i));
    }

Leaks everything in the array, no?  ElementAt() addrefs.
Depends on: 373480
Attached patch Fix mBoundElements leak (obsolete) (deleted) — Splinter Review
Sigh :-(. Thanks for catching that.
Attachment #258187 - Flags: superreview?(bzbarsky)
Attachment #258187 - Flags: review?(bzbarsky)
Could we just use nsCOMArray?
Attachment #258195 - Flags: superreview?(peterv)
Attachment #258195 - Flags: review?(peterv)
Comment on attachment 258195 [details] [diff] [review]
nsISupportsArray -> nsCOMArray<nsIContent>

Sure, we can do that too.

>Index: content/xbl/src/nsXBLResourceLoader.cpp
>===================================================================

>+  PRInt32 i, count = tmp->mBoundElements.Count();
>+  for (i = 0; i < count; ++i) {
>+    cb.NoteXPCOMChild(tmp->mBoundElements.ObjectAt(i));
>   }

NS_IMPL_CYCLE_COLLECTION_TRAVERSE_NSCOMARRAY
Attachment #258195 - Flags: superreview?(peterv)
Attachment #258195 - Flags: superreview+
Attachment #258195 - Flags: review?(peterv)
Attachment #258195 - Flags: review+
Attachment #258187 - Attachment is obsolete: true
Attachment #258187 - Flags: superreview?(bzbarsky)
Attachment #258187 - Flags: review?(bzbarsky)
I'll fix mCommandObserversTable in nsCommandManager.
Comment on attachment 258195 [details] [diff] [review]
nsISupportsArray -> nsCOMArray<nsIContent>

Checked in (using
the macro).
Comment on attachment 257817 [details] [diff] [review]
Patch that was checked in

>Index: content/html/content/src/nsHTMLObjectElement.cpp
>===================================================================
>RCS file: /cvsroot/mozilla/content/html/content/src/nsHTMLObjectElement.cpp,v
>retrieving revision 1.102
>diff -u -p -r1.102 nsHTMLObjectElement.cpp
>--- content/html/content/src/nsHTMLObjectElement.cpp	30 Jan 2007 00:06:35 -0000	1.102
>+++ content/html/content/src/nsHTMLObjectElement.cpp	8 Mar 2007 10:36:23 -0000
>@@ -109,6 +109,9 @@ public:
> 
>   virtual nsresult Clone(nsINodeInfo *aNodeInfo, nsINode **aResult) const;
> 
>+  NS_DECL_CYCLE_COLLECTION_CLASS_INHERITED_NO_UNLINK(nsHTMLObjectElement,
>+                                                     nsGenericHTMLElement)
>+
> private:
>   /**
>    * Calls LoadObject with the correct arguments to start the plugin load.
>@@ -153,11 +156,17 @@ nsHTMLObjectElement::DoneAddingChildren(
>   return NS_OK;
> }
> 
>+NS_IMPL_CYCLE_COLLECTION_CLASS(nsHTMLObjectElement)
>+NS_IMPL_CYCLE_COLLECTION_TRAVERSE_BEGIN_INHERITED(nsHTMLObjectElement,
>+                                                  nsGenericHTMLFormElement)
>+  tmp->Traverse(cb);
>+NS_IMPL_CYCLE_COLLECTION_TRAVERSE_END
>+
> NS_IMPL_ADDREF_INHERITED(nsHTMLObjectElement, nsGenericElement) 
> NS_IMPL_RELEASE_INHERITED(nsHTMLObjectElement, nsGenericElement) 
> 
>-NS_HTML_CONTENT_INTERFACE_MAP_BEGIN(nsHTMLObjectElement,
>-                                    nsGenericHTMLFormElement)
>+NS_HTML_CONTENT_CC_INTERFACE_MAP_BEGIN(nsHTMLObjectElement,
>+                                       nsGenericHTMLFormElement)
>   NS_INTERFACE_MAP_ENTRY(nsIDOMHTMLObjectElement)
>   NS_INTERFACE_MAP_ENTRY(imgIDecoderObserver)
>   NS_INTERFACE_MAP_ENTRY(nsIRequestObserver)
OK, so this confuses me for two reasons: sometimes you used nsGenericHTMLElement and sometimes you used nsGenericHTMLFormElement yet neither one specifically implements their own cycle collection class...
So it'll just forward to the first baseclass that does implement cycle collection. And if we ever need to add cycle collection code to one of those two base classes we don't need to search for every class that inherits from them and potentially forget to change one.
Using nsCOMArray is the way to go.  The only way to make nsSupportsArray play nice with cycle collection is to make it participate in it, since multiple people can have strong refs to the same nsISupportsArray, and none of them really "owns" the things in the array per se...

Not an issue for the hashtable, since those are not refcounted.
Peter, do you want a separate bug on mCommandObserversTable?

Maybe I should just try to get my local changes that make ElementAt() return an already_AddRefed<nsISupports> landed...
Attachment #258521 - Flags: superreview?(bzbarsky)
Attachment #258521 - Flags: review?(bzbarsky)
Comment on attachment 258521 [details] [diff] [review]
nsISupportsArray -> nsCOMArray<nsIObserver>

Looks good!
Attachment #258521 - Flags: superreview?(bzbarsky)
Attachment #258521 - Flags: superreview+
Attachment #258521 - Flags: review?(bzbarsky)
Attachment #258521 - Flags: review+
Status: ASSIGNED → RESOLVED
Closed: 18 years ago
Resolution: --- → FIXED
Depends on: 374415
It seems this checkin caused firefox crash at starup when accessibility enabled.

Please refer to tinderbox Firefox-Ports, where accessibility is enabled.
http://tinderbox.mozilla.org/showbuilds.cgi?tree=Firefox-Ports

core stack:

core '/export/home/mrbld/tinderbox/SunOS_5.11_Depend/mozilla//dist/bin/core' of 13755:	/export/home/mrbld/tinderbox/SunOS_5.11_Depend/mozilla//dist/bin/firef
-----------------  lwp# 1 / thread# 1  --------------------
 fed98567 _lwp_kill (1, b) + 7
 fed520f2 raise    (b) + 22
 fec8a8ea void nsProfileLock::FatalSignalHandler(int) (b, 0, 8044914) + e6
 fed9791f __sighndlr (b, 0, 8044914, fec8a804) + f
 fed8cf9b call_user_handler (b, 0, 8044914) + 2b8
 fed8d142 sigacthandler (b, 0, 8044914) + c2
 --- called from signal handler with signal 11 (SIGSEGV) ---
 feb4fedd PL_DHashTableOperate (814fee4, fa3c51fd, 0) + 1d
 fa7001e0 unsigned nsCommandManager::AddCommandObserver(nsIObserver*,const char*) (814fed0, 814f20c, fa3c51fd) + 44
 fa34149b unsigned nsDocAccessible::AddEventListeners() (814f1b0, 8044c68, fa3e8248, 8044c6c, fa348a96, 8044c6c) + 13f
 fa340db9 unsigned nsDocAccessible::Init() (814f1b0) + 35
 fa348aa4 unsigned nsAccessibilityService::CreateRootAccessible(nsIPresShell*,nsIDocument*,nsIAccessible**) (8c0f448, 8aae148, 8aa3b10, 8044e9c) + 1f4
 fa34b1f4 unsigned nsAccessibilityService::GetAccessible(nsIDOMNode*,nsIPresShell*,nsIWeakReference*,nsIFrame**,int*,nsIAccessible**) (8c0f448, 8aa3ba0, 8aae148, 8aada58, 8044f24, 8044f28) + 21c
 fa34ae2d unsigned nsAccessibilityService::GetAccessibleInShell(nsIDOMNode*,nsIPresShell*,nsIAccessible**) (8c0f448, 8aa3ba0, 8aae148, 80450cc) + 55
 fa362cc5 unsigned nsRootAccessible::HandleEventWithTarget(nsIDOMEvent*,nsIDOMNode*) (82605e0, 81450c8, 8aa3ba0) + 45d
 fa39160d unsigned nsRootAccessibleWrap::HandleEventWithTarget(nsIDOMEvent*,nsIDOMNode*) (82605e0, 81450c8, 8aa3ba0) + 35
 fa362837 unsigned nsRootAccessible::HandleEvent(nsIDOMEvent*) (82605e0, 81450c8) + 63
 fadd7c2a unsigned nsEventListenerManager::HandleEventSubType(nsListenerStruct*,nsIDOMEventListener*,nsIDOMEvent*,nsISupports*,unsigned) (86f60c0, 8c0f468, 8260698, 81450c8, 862dcb0, 4) + 246
Attached patch patch to fix the crashes (deleted) — Splinter Review
init the hashtable before use it.
Comment on attachment 258976 [details] [diff] [review]
patch to fix the crashes


>+  mObserversTable.Init();

Check the return value of the .Init(), then r=me
Attachment #258976 - Flags: review+
committed as NS_ENSURE_TRUE(mObserversTable.Init(), NS_ERROR_OUT_OF_MEMORY);
Thanks.
Seems to me that adding traversal of nsXULDocument::mElementMap wasn't actually necessary, since it doesn't add refs for the elements it references.
(In reply to comment #41)
> Seems to me that adding traversal of nsXULDocument::mElementMap wasn't actually
> necessary, since it doesn't add refs for the elements it references.

Looking at http://bonsai.mozilla.org/cvsblame.cgi?file=/mozilla/content/xul/document/src/nsElementMap.h&rev=1.15&mark=83#80 I think it does.
(In reply to comment #31)
> So it'll just forward to the first baseclass that does implement cycle
> collection. And if we ever need to add cycle collection code to one of those
> two base classes we don't need to search for every class that inherits from
> them and potentially forget to change one.
Surely if cycle collection is added to nsGenericHTMLFormElement collection then the forwarder from nsHTMLObjectElement to nsGenericHTMLElement will be wrong?
nsHTMLObjectElement forwards to nsGenericHTMLFormElement.
Ah, hmm, in the declaration it forwards to nsGenericHTMLElement. That looks wrong, yes.
(In reply to comment #42)
> Looking at
> http://bonsai.mozilla.org/cvsblame.cgi?file=/mozilla/content/xul/document/src/nsElementMap.h&rev=1.15&mark=83#80
> I think it does.

Oh right.

It shouldn't really need to hold strong refs though, since nsHTMLDocument's id table doesn't.
Yeah, though I would double-check that we're reliable in removing nodes from that table when they are removed out of the doc. Hisorically the XUL DOM has had a lot of issues.

One thing to check is that cloned nodes can't result nodes not being in the Doc but being in that hash.
if getElementById can return nodes that aren't in the document, isn't that a bug?
getElementById in XUL documents is very very buggy, for what it's worth.  It'll return all sorts of stuff that doesn't even have those IDs.
Roc: Yes, but I expect the XUL DOM to have lots of bugs :( It's more ok if those bugs just cause behavioral quirks, but it's not ok if those bugs cause exploitable crashes.

That said, i think we should make the refs weak and make sure that we don't put out-of-doc nodes in the hash.

Then there's the issue that bz brings up, that nodes without an given ID ends up in the hash. This is due to us treating 'ref' exactly as 'id'. I think this is more ok as long as we do it correctly.

Looking at the current code it looks like you can end up with a "dangling" pointer in the following situations:

1. <xul:hbox ref="foo"> is created and inserted into the DOM
2. Someone calls hbox.setAttribute("ref", "bar");
3. The hbox is removed from the DOM.

I'm sure there are other situations like this unfortunately :(
I'm reworking the XUL element map in a way that should fix that particular bug. What I'm more worried about is situations where XUL does weird things so the normal notifications don't fire. We should probably defer this discussion until my patch is ready for review.
Actually, never mind, it does look like that scenario should be ok. But I'm sure there are other cases where it's not :)
Component: DOM → DOM: Core & HTML
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: