Closed Bug 50999 Opened 24 years ago Closed 24 years ago

Closing a window containing a large page is extremely slow

Categories

(Core :: Layout, defect, P3)

defect

Tracking

()

VERIFIED FIXED

People

(Reporter: sfraser_bugs, Assigned: waterson)

References

()

Details

(Keywords: perf, Whiteboard: [dogfood+])

Attachments

(5 files)

Load the above URL in a browser window. Wait for it to load. Then close the window. Closing the window takes forever; I timed 12 secs on a fast machine (debug build). I'll attach a sample stack from this time -- it looks like calling SetDocument() on all contents nodes is really taking time here.
Attached file Macsbug log of stack (deleted) —
You know, I filed the exact same bug 4 minutes after you did! I *may* know how to fix this. It comes from the fix for bug 45676.
*** Bug 51001 has been marked as a duplicate of this bug. ***
It's the frame stuff that's taking the time.
keyword love
Assignee: waterson → dbaron
Keywords: dogfood, nsbeta3, perf
Severity: normal → major
OS: Mac System 8.5 → All
Priority: P3 → P1
Hardware: Macintosh → All
Putting on [dogfood-] radar.
Whiteboard: [dogfood-]
What? I can't debug with this bug in place. Just try it! removing status whiteboard for reconsideration. Note that in my test case it took 12 (twelve) seconds to close the window. That's sure dogfood for me -- it means I'll never use Mozilla to look at cvs blame pages.
Whiteboard: [dogfood-]
Putting on [dogfood+] radar.
Whiteboard: [dogfood+]
OK... one idea didn't work. I have one other idea.
brendan says that a good chunk of the time closing windows is in GC, we do it multiple times (once for each webshell that closes or something like that). perhaps that is related here. he says jst is working on that, cc'ing him.
A profile I took shows this as the fix for bug 45676 -- time spent in the frame manager. This one only shows up for really big pages.
My profile says: 58% of the time is spent in nsCSSFrameConsctutor::FindFrameWithContent (70% including descendents). Interestingly, this same routine is involved with serious selection performance problems. 2.9% is spent in js_MarkGCThing 2.7% in QI 2.6% in js_Mark 2.0% in nsCOMPtr<nsIContent>::assign_from_helper 1.7% in nsFrame::Destroy ... These data are when destroying an lxr CVS blame page. The numbers for the bugzilla query page (lots of widgets) are different: 5.4% in js_MarkGCThing 5.1% in js_Mark 5.0% in nsFrame::Destroy 4.0% in nsGenericElement::SetDocument ... Alas, these profiles don't allow me to see who's calling nsCSSFrameConsctutor::FindFrameWithContent, but it's probably called by GetPrimaryFrameFor.
Duh, actually the stack in the attachment shows that GetPrimaryFrameFor is being called from nsGenericElement::SetDocument. I wonder why.
Because of the fix for bug 45676.
OK... I'm going to give this to waterson since I really don't know what to do about it. Also cc: brendan. Here's the deal, as I understand it (which may be wrong...). The problem with bug 45676 was that anonymous content created by scrollbars wasn't being unrooted. Whenever content gets a script object that script object is rooted until the content gets a SetDocument(null,...) call. Content within scrollbars gets a script object when it gets a mouseover event because the tooltip JS stuff looks at it. So, we need to propogate the SetDocument(null, ...) call to the anonymous content within the scrollbar. The scrollbars are constructed something like this. Some code somewhere decides that things that have 'overflow' 'auto' or 'scroll' should have a GfxScrollFrame (which implements nsIAnonymousContentCreator). This GfxScrollFrame, which could correspond to any content, creates two scrollbar elements. These scrollbar elements create the rest of the anonymous content using XBL. XBL happily propogates the SetDocument from the scrollbar elements to their content. The problem is getting it *to* the scrollbar elements. There are basically two cases: * the GfxScrollFrame is for the canvas, and doesn't correspond to any content. This is handled in nsDocument::SetScriptGlobalObject. * the GfxScrollFrame is for something else, in which case the frame manager thinks it is the primary frame for that element. This is what's handled in nsGenericElement::SetDocument and where the time is being spent. We look at every element's primary frame to see if it is an nsIAnonymousContentCreator. (Why is it so slow? Does the frame manager not scale well?) Some thoughts on possible ways to fix this: * store a list of anonymous content created by all nsIAnonymousContentCreators (or maybe just the nsGfxScrollFrames) somewhere, and unroot it when we close the document. This has the disadvantage that it could create a lot of bloat if content is being created and destroyed dynamically, since it will all stay around until the document goes away. However, maybe that problem could be solved by unrooting in the Destroy methods of the frames too. I'm not sure... * do something similar to the above for the rooting of the script objects * make the frame manager faster (how bad is it?) * hack the GetFrameForPoint for GfxScrollFrame so the scrollbars never actually get events (and thus script objects) and everything is handled from above.
Assignee: dbaron → waterson
* bite the bullet and convert GFX scroll frames to XBL. :)
Actually, converting it to XBL is just too high-risk. We could possibly enhance the nsIBindingManager to be able to store the anonymous content from C++-generated objects. Then in the teardown (SetDocument to null), the binding manager can hand you back those objects so you can call SetDocument to null on them.
*** Bug 50085 has been marked as a duplicate of this bug. ***
I've been looking into 50085, which makes me think this is a general layout problem, not just a chrome issue. I think Hyatt is right. We need a mechanism for caching the anonymous content nodes in a list somewhere, and we need to walk that list at tear-down time rather than crawl from content to frame back to anonymous content, which is very expensive and rarely needed. We don't need to cache every anonymous content node, just the roots of the anonymous content subtree. The trick will be to keep this cache in synch with frame-level changes. If this turns out to be too much work for the time we have left, another solution would be to remove the offending call to GetPrimaryFrameFor and instead detect the tear-down case and walk the frame tree from the root to find the anonymous content. This would be a single walk over the entire frame tree, which isn't ideal, but is a heck of a lot better than calling GetPrimaryFrameFor on every single content node. I'm available to help.
How hard would it be to get walking the frame tree to work for removal of content and re-creation of frames? (I'm not sure the current code handles the latter.) Storing a list seems like the better solution, and it doesn't seem that hard. (I would think it just requires registering the anonymous content when it is created, unregistering it in frame destruction, and calling SetDocumentForAnonymousContent(nsnull, ...) during frame destruction (we should do this now, I think!) and from SetScriptGlobalObject.)
I still think the binding manager could be used to hold the lists. :)
*** Bug 51137 has been marked as a duplicate of this bug. ***
Hyatt said he's willing to do this.
Assignee: waterson → hyatt
When did he say that? We're trying to get stuff off of his plate...
P3. I don't think this really fits the profile for nsbeta3; it is an obscure worst case scenario that few users will ever encounter (I estimate 3,000 links on that page).
Priority: P1 → P3
Target Milestone: --- → M18
I guess I'll give this back to layout then, so that they can fix it if they want to.
Assignee: hyatt → waterson
It *is* still dogfood...
uh, ok.
Status: NEW → ASSIGNED
Talked to hyatt and evaughan about this and will start working on it.
The only thing in this change that I'm not sure of is the change to pass the root element instead of null in the call to BeginBuildingScrollFrame(). cc'ing evaughan, who may have more clues than I.
I still don't completely understand this, but here are some comments... Have you verified that the leaks (hover over input, hover over scrollbar) are still fixed? I wonder if there are places where you should be calling SetAnonymousContentFor(..., nsnull)? (What happens on re-frame? What does NS_ERROR do?) Some of the comments (and names of functions/variables) seem to imply that anonymous content and XBL are two different things. I thought XBL created some anonymous content, and is the preferred way of doing so. Maybe the comments should make it clearer that we're dealing with nsIAnonymousContentCreator-related anonymous content? Is it better to have the binding manager store the anonymous content, or have the binding manager just store the content nodes that have anonymous content and call SetDocumentForAnonymousContent on the frames for that content? (What happens on re-frame? Could we miss multiple instances of anonymous content for a given content node? What happens now?) Otherwise, should you remove SetDocumentForAnonymousContent? Also, nsCOMPtr<nsIAnonymousContentCreator> is evil because nsIAnonymousContentCreator is used for frames (right??). Should that pattern stop being used? (And should it be relected in a clearer comment in nsIAnonymousContentCreator.h?)
Attached file test case to force a reframe (deleted) —
I've verified that we're not leaking by using XPCOM_MEM_BLOAT_LOG and trying the old "wiggle the scrollbar" and "type into the URL bar" tests. With respect to a reframe (which the above test case will force), sho'nuf, we end up bonking into the NS_ERROR() case in nsBindingManager::SetAnonymousContentFor(). A way to deal would be to just dump the old elements at this point, copying the SetDocument(nsnull, ...) loop from ChangeDocumentFor(). I haven't been able to figure out how to get a script object created on these anonymous elements using just vanilla HTML. Anyone have any ideas on how to do that?
The better fix on a reframe would be to do as you suggest... detect the presence of anonymous content before you set the new anonymous content and make sure you set document to null on the old anonymous content.
I made a fix recently that probably plugged all of these leaks anyway. :) I made anonymous content properly retarget during DOM event bubble, which means that event.target can no longer refer to anonymous content, which means script objects don't get created during normal HTML use anyway. A better way to get the leak now would be to use XUL, since there are plenty of places in XUL where we use event.originalTarget in order to make sure the user didn't click on a scrollbar. Dragging a tree widget scrollbar should cause the leak.
I didn't mean to say "cause the leak". I meant to say "cause a script object to be created for anonymous content."
Incorporated feedback from hyatt and dbaron; verified that no leakage with tree widget scrolling, and verified as best I could that no problems with HTML reframe (cf. bug 53251).
Looks good.
fix checked in.
Status: ASSIGNED → RESOLVED
Closed: 24 years ago
Resolution: --- → FIXED
Marking verfied fixed in the Sept 28th build.
Status: RESOLVED → VERIFIED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: