Closed Bug 25963 Opened 25 years ago Closed 24 years ago

excessive string conversion resolving :visited style

Categories

(Core :: CSS Parsing and Computation, defect, P1)


Tracking


VERIFIED FIXED

People

(Reporter: waterson, Assigned: dbaron)


Details

(Keywords: perf, Whiteboard: [HAVE FIX] [nsbeta3+])

Attachments

(5 files)

The style system code that resolves whether or not a link has the :visited pseudo-class uses a version of NS_MakeAbsoluteURI() that does some pretty hefty string twiddling. Specifically, it allocates temporary storage from the heap, and does a pile of one-to-two-byte conversion (and vice versa). Since the anchor's HREF is gonna be in double-byte, and history expects a single-byte string, we're screwed and have to do this at least once; however, we should make sure that we don't do it -more- than that. I'll post Quantify data on how much time this is taking.
Keywords: perf
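The conversion cost described above can be sketched as follows. This is a minimal illustration with hypothetical names (the real code uses nsString/nsCString and the nsStr allocator, not std::string): convert the double-byte HREF to single-byte once, and reuse that result instead of re-converting on every selector match.

```cpp
#include <cassert>
#include <string>

// Hypothetical sketch: the DOM stores the HREF double-byte, history
// wants single-byte, so do the lossy narrowing exactly once.
std::string NarrowOnce(const std::u16string& href) {
    std::string narrow;
    narrow.reserve(href.size());
    for (char16_t c : href)
        narrow.push_back(static_cast<char>(c));  // lossy, but URLs are ASCII
    return narrow;
}

struct CachedAnchor {
    std::u16string href;     // double-byte, as the DOM stores it
    std::string narrowSpec;  // single-byte, converted exactly once
    explicit CachedAnchor(std::u16string h)
        : href(std::move(h)), narrowSpec(NarrowOnce(href)) {}
};
```

Every :visited check can then read `narrowSpec` directly, with no per-match heap allocation.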
Changing component to Style System
Assignee: troy → pierre
Component: Layout → Style System
QA Contact: petersen → chrisd
The description isn't very clear. NS_MakeAbsoluteURI() is Necko code: changed component and reassigned to owner.
Assignee: pierre → gagan
Component: Style System → Networking
QA Contact: chrisd → tever
pierre: i'll look at this. the problem is that we are converting URLs back and forth between single- and double-byte representations. The Necko APIs that the style system chose are hog-tied and have no choice but to allocate from the heap. So, we need to fix the style system (and possibly the nsILinkHandler interface) to make better API choices.
Assignee: gagan → waterson
Component: Networking → Style System
Thanks Chris. As you guessed, I'd prefer to leave you the initiative on that problem. When a solution is defined, replacing the calls to NS_MakeAbsoluteURI() shouldn't be a big deal.
Status: NEW → ASSIGNED
Target Milestone: M15
Target Milestone: M15 → M16
After checking in changes to use nsIURI objects instead of raw strings, we still see that nsCSSStyleSheet::SelectorMatches() is dominated by NS_NewURI() (~50%) and nsWebShell::GetLinkState() (~25%). I'm going to push this bug to M18, and revisit when we get the new string APIs landed. This will be an obvious candidate for using the new APIs.
Target Milestone: M16 → M18
We used to see a lot of time going to getting the IO service. I presume that's gone now. Another thing to investigate is converting nsIURI over to talk about wstrings. That might help minimize the number of translations.
We're still using the global to avoid hitting the service manager. I didn't dig into why NS_NewURI() was expensive (I assume string operations). The 25% in GetLinkState() is dominated (80%) by nsIURI::GetSpec(); I didn't dig further to see why.
*** Bug 36953 has been marked as a duplicate of this bug. ***
One thought about reducing the need for matching :link and :visited : Some of the rules in ua.css apply to *both* :link and :visited, which makes the check for whether the link is visited somewhat pointless. Perhaps you could create a new pseudo-class called ":-moz-any-link" and change all the rules in ua.css except the ones that need to differentiate to use this new pseudo-class and avoid the :link/:visited tests. If you want to do this, I could figure out the necessary ua.css changes (and how many :link/:visited selectors would be needed). It might also be possible to make this optimization in the style system itself so that it could apply to web pages that use :link and :visited - by looking (during parsing?) at rules with :link and :visited to see if they can be condensed.
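The ua.css condensation proposed above might look like this. A sketch only: `:-moz-any-link` is the suggested name, not an existing pseudo-class at this point, and the rules shown are illustrative rather than actual ua.css contents.

```css
/* before: matching either selector forces a history lookup */
:link img, :visited img { border: none; }

/* after: one selector, no history lookup needed */
:-moz-any-link img { border: none; }

/* only rules that actually differentiate keep the history-dependent check */
:link    { color: blue; }
:visited { color: purple; }
```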
probably can fix with 12493.
Assignee: waterson → pierre
Status: ASSIGNED → NEW
Pushing to M20 like bug 12493
Status: NEW → ASSIGNED
Target Milestone: M18 → M20
This bug is a dogfood bug for me. It causes Mozilla to lock up my machine for minutes at a time when loading CVS blame pages in LXR, like: <http://bonsai.mozilla.org/cvsblame.cgi?file=mozilla/editor/base/nsHTMLEditor.cpp> So I can't use Mozilla for LXR and friends, basically.
Keywords: dogfood
I checked in changes to nsCSSStyleSheet.cpp for bug 29611 that push the canonifying code into the anchor tag, so it is done once per link. This probably goes some of the way towards fixing this bug (see http://bugzilla.mozilla.org/show_bug.cgi?id=29611). We could continue to improve things by creating a special "nsILinkElement" interface (or something) that avoids copying strings altogether. I'll need to re-profile and see how much string conversion is actually going on now. pierre: I'm stealing this back for now.
Assignee: pierre → waterson
Status: ASSIGNED → NEW
sfraser - let us know if the partial fix waterson checked in helps enough to take this off of dogfood for you. thanks!
Sorry, no bananas. Loading <http://bonsai.mozilla.org/cvsblame.cgi?file=mozilla/editor/base/nsHTMLEditor.cpp> still makes my mac unusable for about 2 minutes. Offending stack is:
...
095F0670 PPC 1C296448 CSSRuleProcessor::RulesMatching(nsIPresContext*, nsIAtom*, nsIContent*, nsIStyleContext*, nsISupportsArray*)+00170
095F05D0 PPC 1C28BDD8 RuleHash::EnumerateAllRules(nsIAtom*, nsIAtom*, const nsVoidArray&, void (*)(nsICSSStyleRule*, void*), void*)+00278
095F0530 PPC 1C2961A8 ContentEnumFunc(nsICSSStyleRule*, void*)+0004C
095F04E0 PPC 1C295454 SelectorMatches(nsIPresContext*, nsCSSSelector*, nsIContent*, int)+00A28
095F0110 PPC 1C328720 nsHTMLAnchorElement::GetAttribute(int, nsIAtom*, nsString&) const+00030
095F00D0 PPC 1C31C224 nsGenericHTMLElement::GetAttribute(int, nsIAtom*, nsString&) const+00270
095F0010 PPC 1CBF5E08 nsString::Length() const+014E8
095EFEF0 PPC 1CC1BA9C nsString::SetLength(unsigned int)+00038
095EFEB0 PPC 1CC1BB30 nsString::SetCapacity(unsigned int)+00038
095EFE70 PPC 1CC1142C nsStr::GrowCapacity(nsStr&, unsigned int)+00060
095EFE10 PPC 1CC1132C nsStr::EnsureCapacity(nsStr&, unsigned int)+00038
095EFDC0 PPC 1CC12A9C nsStr::Realloc(nsStr&, unsigned int)+00058
095EFD60 PPC 1CC12930 nsStr::Alloc(nsStr&, unsigned int)+0007C
095EFD10 PPC 1CBED6A8 nsAllocator::Alloc(unsigned int)+00064
095EFCD0 PPC 1CBED4F8 nsAllocatorImpl::Alloc(unsigned int)+00014
095EFC90 PPC 1D7EBDC0 PR_Malloc+00014
095EFC50 PPC 1D852758 malloc+0007C
I think we really need to start sharing strings here. We also need to be yielding to the user event queue much more frequently here, to make the app more responsive. While this page is loading in 4.x, I can keep typing in another app. Loading it in Mozilla just brings my whole machine to its knees.
Putting on [dogfood+] radar.
Whiteboard: [dogfood+]
Well, I guess this had better be M17 then.
Status: NEW → ASSIGNED
Priority: P3 → P1
Target Milestone: M20 → M17
It turns out 12% of the time is spent in nsVoidArray::InsertElementAt(), which should probably use a better allocation algorithm. I'm investigating.
Whiteboard: [dogfood+] → [dogfood+] investigating
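For reference, the usual fix for an insert-heavy array is geometric capacity growth, so N appends cost an amortized O(1) reallocations instead of one reallocation per insert. A minimal sketch (hypothetical shape, not the actual nsVoidArray code):

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

// Sketch: double the capacity whenever the array is full, rather than
// reallocating to "just big enough" on every insert.
struct GrowableArray {
    void** mData = nullptr;
    int mCount = 0, mCapacity = 0;

    ~GrowableArray() { free(mData); }

    void InsertElementAt(void* item, int index) {
        if (mCount == mCapacity) {
            mCapacity = mCapacity ? mCapacity * 2 : 8;  // doubling policy
            mData = static_cast<void**>(
                realloc(mData, mCapacity * sizeof(void*)));
        }
        // shift the tail up to make room (no-op when appending)
        memmove(mData + index + 1, mData + index,
                (mCount - index) * sizeof(void*));
        mCount++;
        mData[index] = item;
    }
};
```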
Talked with sfraser about this: apparently pink's checkin fixed the problem, which was Mac-specific. We *still* are wasteful here with string conversion; moving out to M20, when NEW_STRING_APIS will be better defined and we'll have better tools to deal with this. Marking nsbeta3. sfraser: if this is still dogfood, let me know. Also, there are a bevy of other bugs related to flowed documents that are too long that I'm looking into right now: see 19051, 26028, 26030.
Keywords: dogfood, nsbeta3
Whiteboard: [dogfood+] investigating
Target Milestone: M17 → M20
I just did a jprof profile of loading a huge hiprof profile, and decided to look into this since SelectorMatches took almost a third of the total time spent (I had to kill the process before it finished; I don't know whether it ever would have). I'm not sure any string conversion is needed: nsHTMLAnchorElement::GetHref converts one-byte to two-byte and returns the value, and nsWebShell::GetLinkState then converts back to one-byte. Keeping the string one-byte throughout this whole chain should save a good bit of time (hopefully it won't even need to be duplicated). Almost 20% of the time in SelectorMatches is spent in AppendWithConversion (nsString and nsCString), and about 10% in constructors and (mostly) destructors for the strings created. There are also some things in IsValidLink() that ought to be cleaned up (I don't think the author of the function expected it to be run quite as much as it is, so every little bit counts). I should probably post this profile somewhere...
David, your idea to prevent unnecessary one-two-one byte string conversion is an excellent one. We should try this out and compare the profiles (something you are probably already doing, right?). Also, I am curious what you think should be cleaned up in the IsValidLink method... I indeed expected it to be run pretty often, but it has a lot of work to do when there IS a link. Please advise if you have ideas on how to improve it.
It's hard to tell from the hiprof profile testcase exactly how much this helps. The main problem is that the page won't finish loading (it crashes first). The first profile I ran, I stopped after a few minutes. It seems like SelectorMatches doesn't start getting called until about 3 minutes have passed, and in my baseline profile I stopped the load during the phase where it was being called. In any case, the profile is a good bit cleaner and it looks like a good bit (40% ??) of the time in SelectorMatches has been eliminated. (This figure is based on the assumption that, since I cut off the first run before the end of the styling phase, the time in nsGlobalHistory::GetLastVisitDate would stay constant. If I use other assumptions, I get smaller improvements, but I think that one may be more reliable.) That figure is still only for an obscure edge case - a page where *all* the elements are links (within one huge PRE). For more normal pages, the improvement would be *much* smaller. If I'm in the mood for backing these diffs out of my tree sometime, I'll try nsCSSFrameConstructor.cpp in LXR as a less demanding testcase...
- How about "nsILink"? Since QI() is expressing an "is-a" relationship, it makes sense to say that "an nsHTMLAnchorElement is-a nsILink". nsILinkable makes it sound like you can "link against it" or something weird.
- Use "const char**" rather than "const char*&"; use "PRBool*" rather than "PRBool&". It indicates that these are clearly mutable out parameters in the calling code.
- In nsCSSStyleSheet, since we're touching "nsIAtom* contentTag" anyway, why not make it an nsCOMPtr and avoid the inevitable flow-of-control problems that will ensue later.

Otherwise, this looks good!
One problem I see is that the nsHTMLLinkElement::GetHrefBuffer sets the mCanonicalBuffer to an empty-string if there is no HREF, but in IsValidLink you check for it to be null to determine if there is no HREF. I think that is going to resurface bug 23209. I cannot agree with Chris' second point, but I do agree with his other two points. Also, I agree that the changes look good (excepting the comment above).
Chris:

1) Done. Hopefully nobody else will want nsILink for something more significant... (Should we use nsIHTMLLink??)

2) I also prefer pointers over reference variables (since I agree that reference parameters make code hard to understand), but I was trying to stick to the style already existing in the code. In nsCSSStyleSheet, IsValidLink and IsSimpleXLink were actually the only functions I could see (except maybe one other) that used reference params for out params, so I changed those. However, I left GetHrefBuffer the way it was to match everything else in content nodes.

3) I guess. I'll have to check that the nsCOMPtr stuff doesn't show up in the profile...

Marc: I'm using the empty string as a dummy value to say that we have a cached value but that cached value is null. Even when it's set to an empty string, I still return null. Actually, a bigger change in the patch is to make Area and Link elements return the canonical HREF. I tested 4.x's DOM, and this is what AREA elements do in the 4.x DOM, so I think it's right. I still need to get the test to work in Mozilla. This probably also means there are places where NS_MakeAbsoluteURI could be removed elsewhere...
David, I see what you mean; I misread the statement: if (*mCanonicalHref) { aBuf = mCanonicalHref; } Given that you are checking for a null string here and not a null pointer as I originally thought, is a check for a null mCanonicalHref still necessary, since the preceding call to strdup may return null?
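The sentinel convention under discussion can be sketched like this (hypothetical shape; the real member and method names may differ): null means "not yet computed", the empty string means "computed, but the element has no HREF", and anything else is the canonical URL. A failed strdup is also tolerated.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

struct AnchorSketch {
    char* mCanonicalHref = nullptr;  // null until first computed

    ~AnchorSketch() { free(mCanonicalHref); }

    const char* GetHrefBuffer(const char* rawHref) {
        if (!mCanonicalHref)
            mCanonicalHref = strdup(rawHref ? rawHref : "");  // "" = no HREF;
                                                              // may be null on OOM
        // callers still get null for "no HREF" (or for an OOM'd cache)
        return (mCanonicalHref && *mCanonicalHref) ? mCanonicalHref : nullptr;
    }
};
```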
Taking this bug so that I remember that I ought to check this stuff in when the tree opens for nsbeta3. Marc: I added the additional null check in my tree. I realize that another thing we could be doing (that would speed things up even more) is caching the *visited state* of the link in the link element. If we don't do that, we should definitely work on improving the performance of nsGlobalHistory::GetLastVisitDate (waterson said that there is a new Mork API that allows this to be done faster). What do others think? Caching the visited state could make it harder to implement re-coloring of links while a page is shown, but then again, that seems hard to implement anyway without big performance problems...
Assignee: waterson → dbaron
Status: ASSIGNED → NEW
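The visited-state caching idea above, with the invalidation hook it would need, can be sketched as follows (hypothetical names; only a tri-state cache, not the real nsILinkHandler interface). The first query consults history; later queries hit the cache, which is why clearing history while a page is up needs an explicit reset back to the unknown state.

```cpp
#include <cassert>

enum class LinkState { Unknown, Unvisited, Visited };

// stand-ins for a real history lookup
static bool AlwaysVisited(const char*) { return true; }
static bool NeverVisited(const char*) { return false; }

struct LinkElementSketch {
    LinkState mState = LinkState::Unknown;

    LinkState GetLinkState(bool (*historyLookup)(const char*),
                           const char* spec) {
        if (mState == LinkState::Unknown)  // only the first call is expensive
            mState = historyLookup(spec) ? LinkState::Visited
                                         : LinkState::Unvisited;
        return mState;
    }
    void InvalidateCache() { mState = LinkState::Unknown; }  // e.g. history cleared
};
```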
filed bug 45668 for global history rework.
Target Milestone: M20 → M18
[HAVE FIX] for perf issue. PDT please approve checkin.
Whiteboard: [HAVE FIX]
Keywords: review
dbaron: did you add eLinkState_Unknown to nsILinkHandler, as well?
Attached patch: forgot to include these diffs (deleted)
With these changes, plus the changes from bug 19051, total time to lay out "a long LXR page" (http://lxr.mozilla.org/mozilla/source/rdf/content/src/nsXULTemplateBuilder.cpp) drops from 35s to 20s. (With 19051 alone, it drops from 35s to 25s, so this shaves off another 5s.) For reference, 4.x lays out the page in ~18s; IE4 lays it out in ~16s. I looked at the changes, and feel "familiar enough" to say r= for the whole lot of them. There are changes to embedding, HTML content, and style, so if valeski, jst, attinasi, or pierre wants to double-check, please speak up now! Great work, dbaron!
Since we are caching the link state now, are we at risk of not reverting a visited link to a non-visited link when the user clears their history (while the same page is up)? More generally, it seems that by not going to the history to get the link status we lose the ability to be in sync with the history (out-of-date may act differently too, if the link goes out-of-date while the page is up). That said, I think the performance benefit outweighs the picky correctness issue of clearing history while a page is up, so I'm all for it.
Status: NEW → ASSIGNED
Keywords: review, approval
a=waterson. I've been running with these changes for a while now and they rock. Like Cleveland.
plusing, run forrest, run!
Whiteboard: [HAVE FIX] → [HAVE FIX] [nsbeta3+]
Fix checked in, 2000-07-27 16:17.
Status: ASSIGNED → RESOLVED
Closed: 24 years ago
Resolution: --- → FIXED
marking verified per engineer's comments
Status: RESOLVED → VERIFIED
*** Bug 12713 has been marked as a duplicate of this bug. ***