153519 - [FIX]XML elements lost when no references left to the (top XML) document

Reporter

Description

•

22 years ago

From Bugzilla Helper: User-Agent: Mozilla/5.0 (Windows; U; Win 9x 4.90; en-US; rv:1.0.0) Gecko/20020530 BuildID: 2002053012 The JavaScript in the above example loads an XML file of employees. The script keeps references to all the "person" elements in the XML file, but does not keep a reference the the top document. It then adds the ids of the "person" elements to a select list, and when you select an id from the window, the name gets written into the "text" input area. Reproducible: Always Steps to Reproduce: 1. Open the above URL. 2. Wait for the id's "1", "2", and "3" to appear in the select menu. 3. Select ids. When an id is selected, Mozilla should display a name in the "text" field. Actual Results: Sometimes, Mozilla works the first few times, but then fails. Often, it files right off the start. Each new id selection should show a different name (1="Monty Burns", 2="Wayland Smithers", and 3="Carl Carlson".) Expected Results: When an id is selected, Mozilla should display a name in the text field. There is a relatively simple work-around - keep a global reference to the document that was loaded. In this example, uncomment the lines: //var parsedDoc; and //parsedDoc=xmldoc; and the file works. It's not clear to me at all which part of the product is at fault here - is it JavaScript that is doing the wrong thing, or the DOM handlers? Obviously, there are no further references to the document, so JavaScript thinks it can delete the document, even though it still has references to the *internal* elements of the document.

Boris Zbarsky [:bzbarsky]

Assignee

Comment 1

•

22 years ago

The problem is your use of getElementsByTagName... this returns an HTMLCollection object which is _live_. That is, removing nodes from the document should change the contents of the |employees| collection. Detroying the whole document should leave the collection empty. If it does not do so in IE, that seems to be a bug in IE. What you want here is to copy over the |employees| collection into an array that will hold references to the nodes...

Thomas Andrews

Reporter

Comment 2

•

22 years ago

Hmmm, I agree that changes to the document are supposed to be represented by the collection (or NodeList - my JavaScript book says it is a NodeList, not an HTMLCollection, but NodeLists have the same "live" feature.) But I don't consider it a "change" to the document that you've dropped one reference. In other words, I consider the HTMLCollection and NodeList to implicitly include a reference to the document. Dropping a reference to a document doesn't *change* the document, so the NodeList/HTMLCollection doesn't logically change. But Mozilla's behavior is inconsistent - sometimes it does what I'd want it to do for a while, while other times it does not. When I add "alert()" calls, for some reason it works better. I'm of the opinion that the timing of Garbage Collection shouldn't affect the behavior of a scripting language. So if, as you say, the removal of the last direct reference to the document implicitly calls for the destruction of the document and clearing of the NodeList/HTMLCollection, then it should clear the NodeList/HTMLCollection immediately. As it is, the behavior is quite confusing.

Boris Zbarsky [:bzbarsky]

Assignee

Comment 3

•

22 years ago

Well... Garbage collection in a scripting language is no different from memory deallocation in something like C. You're basically holding a stale pointer, and when it starts to point to something bogus is not exactly determined.... Confirming to get a decision on whether nodelists should force nodes or documents to stay in memory, though.

Status: UNCONFIRMED → NEW

Ever confirmed: true

Keywords: qawanted

OS: Windows 2000 → All

Hardware: PC → All

Thomas Andrews

Reporter

Comment 4

•

22 years ago

I disagree. Garbage Collection should *avoid* the oddities of malloc() and free(). If I have an interator through a Hash, and the iterator is meant to reflect changes in the hash, and the last direct reference to the Hash is removed, I still consider the iterator to be a reference to the original Hash. [ To clarify my last comment, of course it is a NodeList and not an HTMLCollection - it is a list of XML nodes, not HTML. Not sure how I missed that. ] I've added two further examples to the same directory, with links in the original file: bugex2.html In this example, an "alert()" is called, and for some reason the Javascript "works," at least for a bit. workaround.html The simple workaround. I certainly don't think the IE behavior is "wrong" - at worst it is a different reading of the same standard. In reality, the IE behavior is more consistent and predictable. The DOM Standard is language independent, so I doubt it addresses garbage collection issues, but I could be wrong. Clearly, when a document is deleted, the related NodeList(s) are cleared, as well, but in garbage-collected languages, deletion occurs when the last reference is removed. Does the standard imply that the NodeList object does or does not include a reference to the Document? Certainly, I can imagine implementations which don't, and I can imagine implementations which do. Implementations which do include references behave more consistently, though.

Robert Ginda

Comment 5

•

22 years ago

> Well... Garbage collection in a scripting language is no different from memory > deallocation in something like C. You're basically holding a stale pointer, > and when it starts to point to something bogus is not exactly determined.... eh? Script shouldn't get into a state that "is not exactly determined". What kind of exposure are we talking about here? Could we crash because we held a node list longer than the source document? I think it's a no brainer that holding on to a node list should also hold on to whatever objects the nodelist needs to function properly. At least we should throw an exception if the list is no longer valid. Scripters shouldn't be expected to have to deal with the garbage collecter at all. Also, I'm a little confused as to why the anonymous lambda assigned to xmldoc.onload doesn't root xmldoc. Seems to me that the scope capture should keep it alive. Does the assignment of a lambda through XPConnect involve some trickery?

Johnny Stenback (:jst)

Comment 6

•

22 years ago

We should make nodelists (i.e. the result of calling x.getElementByTagName() or some such function) always be parented at x, that way a reference to the node list would keep x alive and as long as x is in a document x would keep the document alive. This would be a JS only solution, but I believe this is good enough for now since C++ callers are hardly ever hit by this, and they do indeed need to worry about lifetime issues anyway, so this somewhat odd behaviour is not that odd after all from a C++/XPCOM point of view. IIRC bz had some ideas on this one too, shouldn't be hard to do, the JS part of this should be to do something like what's done in nsNodeSH::PreCreate() in the scriptable helper for the node lists... As for setting xmldoc.onload to a function not holding the document alive, I can't see how that would cause any references in the function->xmldoc direction (and there shouldn't be any XPConnect hackery involved there, at least not on the JS reference level), so setting xmldoc.onload should IMO not hold xmldoc around for the lifetime of the function that it's set to.

Robert Ginda

Comment 7

•

22 years ago

> As for setting xmldoc.onload to a function not holding the document alive, I > can't see how that would cause any references in the function->xmldoc direction > (and there shouldn't be any XPConnect hackery involved there, at least not on > the JS reference level), so setting xmldoc.onload should IMO not hold xmldoc > around for the lifetime of the function that it's set to. It's not just that xmldoc.onload was set to a function, but that the function in question should have captured the scope it is defined in, and *that scope* holds a reference to the xmldoc.

Boris Zbarsky [:bzbarsky]

Assignee

Comment 8

•

22 years ago

> Could we crash because we held a node list longer than the source document? Not easily. Document destruction notifies the nodelist, which clears itself. You can get around this by creating the list on a root node, removing the node from the doc, then tearing down the doc or removing nodes... I tried that and only managed to get a NOINTERFACE error, but I'm sure it could lead to a crash... ;) My patch in bug 104603 fixes most of those issues (not all, see the XXX comment in that patch).

Johnny Stenback (:jst)

Comment 9

•

22 years ago

Rob, you're right in that the anonymous function would capture the scope where it's declared, but assuming I understand this correctly that doesn't mean that the scope would be kept around if there's nothing that holds on to any of the things that hold on to the scope. I.e. we have these refereces: scope -> xmldoc -> anon_function (xmldoc.onload) -> scope but nothing outside that scope holds on to that scope or anything in that scope (once we exit that scope) so it'll all get collected when the GC runs...

Sivakiran Tummala

Updated

•

22 years ago

QA Contact: desale → stummala

Boris Zbarsky [:bzbarsky]

Assignee

Comment 10

•

22 years ago

Attached patch Proposed fix (deleted) — Details — Splinter Review

Johnny Stenback (:jst)

Comment 11

•

22 years ago

Comment on attachment 90266 [details] [diff] [review] Proposed fix Nitpicks of the day... - In nsContentListSH::PreCreate(): ... + jsval v; + + nsresult rv = WrapNative(cx, ::JS_GetGlobalObject(cx), native_parent, + NS_GET_IID(nsISupports), &v); Loose the empty line above... - In the declaration of the class nsContentListSH: ... + static nsIClassInfo *doCreate(nsDOMClassInfoData* aData) + { + return new nsContentListSH(aData); + } + +}; Loose the empty line before the closing brace... sr=jst

Attachment #90266 - Flags: superreview+

Boris Zbarsky [:bzbarsky]

Assignee

Comment 12

•

22 years ago

taking

Assignee: jst → bzbarsky

Keywords: qawanted

Priority: -- → P1

Summary: XML elements lost when no references left to the (top XML) document → [FIX]XML elements lost when no references left to the (top XML) document

Target Milestone: --- → mozilla1.1beta

Peter Van der Beken [:peterv]

Comment 13

•

22 years ago

Comment on attachment 90266 [details] [diff] [review] Proposed fix What jst said. While you're there, please correct the comment "FomrControlList helper" just below. r=peterv regardless.

Attachment #90266 - Flags: review+

Boris Zbarsky [:bzbarsky]

Assignee

Comment 14

•

22 years ago

checked in on the trunk

Status: NEW → RESOLVED

Closed: 22 years ago

Resolution: --- → FIXED

:Ms2ger (he/him; ⌚ UTC+1/+2)

Updated

•

14 years ago

Component: DOM: Abstract Schemas → DOM

QA Contact: stummala → general

Nobody; OK to take it and work on it

Updated

•

6 years ago

Component: DOM → DOM: Core & HTML

Bugzilla

[FIX]XML elements lost when no references left to the (top XML) document

Categories

(Core :: DOM: Core & HTML, defect, P1)

Tracking

()

People

(Reporter: mozilla, Assigned: bzbarsky)

References

(
URL
)

Details

Crash Data

Security

(public)

User Story

Attachments

(1 file)

Description

Comment 1

Comment 2

Comment 3

Comment 4

Comment 5

Comment 6

Comment 7

Comment 8

Comment 9

Updated

Comment 10

Comment 11

Comment 12

Comment 13

Comment 14

Updated

Updated

Attachment

General

Description

File Name

Content Type